Commit f1cf477d authored Aug 09, 2021 by xuchen

optimize the code

parent ad1caa72
Showing 2 changed files with 8 additions and 4 deletions (+8, -4)
egs/mustc/st/local/monitor.sh (+1, -0)
fairseq/modules/relative_multihead_attention.py (+7, -4)
egs/mustc/st/local/monitor.sh  (view file @ f1cf477d)

gpu_num=1
cmd=""

while :
do
...
fairseq/modules/relative_multihead_attention.py  (view file @ f1cf477d)

...
@@ -312,7 +312,8 @@ class RelativeMultiheadAttention(MultiheadAttention):
         return attn, attn_weights

-    def _generate_relative_positions_matrix(self, length, max_relative_length, incremental_state):
+    @staticmethod
+    def _generate_relative_positions_matrix(length, max_relative_length, incremental_state):
         if not incremental_state:
             # training process
             range_vec = torch.arange(length)
...
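Note: the body of _generate_relative_positions_matrix is mostly elided above. As a reading aid, here is a minimal sketch of the usual clipped-relative-distance construction (Shaw et al., 2018) that the visible lines (range_vec = torch.arange(length) here, return final_mat in the next hunk) are consistent with; everything beyond those two lines is an assumption, not the repository's exact code.

import torch

def _generate_relative_positions_matrix(length, max_relative_length, incremental_state):
    # Sketch only; just range_vec and final_mat are confirmed by the diff.
    if not incremental_state:
        # training process: pairwise distances j - i for all positions
        range_vec = torch.arange(length)
        range_mat = range_vec.unsqueeze(0).expand(length, length)
        distance_mat = range_mat - range_mat.t()
    else:
        # incremental decoding: only the newest query position is needed
        distance_mat = torch.arange(-length + 1, 1).unsqueeze(0)
    # clip to [-max_relative_length, max_relative_length], then shift into
    # [0, 2 * max_relative_length] so values can index an embedding table
    distance_mat_clipped = torch.clamp(distance_mat, -max_relative_length, max_relative_length)
    final_mat = distance_mat_clipped + max_relative_length
    return final_mat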
@@ -328,7 +329,8 @@ class RelativeMultiheadAttention(MultiheadAttention):
         return final_mat

-    def _relative_attention_inner(self, x, y, z, transpose=True):
+    @staticmethod
+    def _relative_attention_inner(x, y, z, transpose=True):
         """Relative position-aware dot-product attention inner calculation.

         This batches matrix multiply calculations to avoid unnecessary broadcasting.
...
@@ -337,7 +339,7 @@ class RelativeMultiheadAttention(MultiheadAttention):
           x: Tensor with shape [batch_size*heads, length, length or depth].
           y: Tensor with shape [batch_size*heads, length, depth].
           z: Tensor with shape [length, length, depth].
-          transpose: Whether to tranpose inner matrices of y and z. Should be true if
+          transpose: Whether to transpose inner matrices of y and z. Should be true if
             last dimension of x is depth, not length.
         Returns:
...
@@ -348,11 +350,12 @@ class RelativeMultiheadAttention(MultiheadAttention):
         """
         batch_size_mul_head = x.size()[0]
         length = z.size()[0]
+        # print(batch_size_mul_head, length)
         # xy_matmul is [batch_size*heads, length, length or depth]
         if transpose:
             y = y.transpose(1, 2)
         xy_matmul = torch.bmm(x, y)
         # x_t is [length, batch_size * heads, length or depth]
         x_t = x.transpose(0, 1)
         # x_tz_matmul is [length, batch_size * heads, length or depth]
...
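The tail of _relative_attention_inner is elided above. Below is a self-contained sketch that completes the visible lines with the standard relative-attention inner product; the final bmm against z and the add-and-fold step are assumptions beyond what the diff shows.

import torch

def _relative_attention_inner(x, y, z, transpose=True):
    # x: [batch_size * heads, length, length or depth]
    # y: [batch_size * heads, length, depth]
    # z: [length, length, depth]
    # batch_size_mul_head is unused here except for the debug print above
    batch_size_mul_head = x.size()[0]
    length = z.size()[0]
    # xy_matmul is [batch_size * heads, length, length or depth]
    if transpose:
        y = y.transpose(1, 2)
    xy_matmul = torch.bmm(x, y)
    # x_t is [length, batch_size * heads, length or depth]
    x_t = x.transpose(0, 1)
    # x_tz_matmul is [length, batch_size * heads, length or depth];
    # a single bmm against z avoids broadcasting z across the batch (assumption)
    x_tz_matmul = torch.bmm(x_t, z.transpose(1, 2) if transpose else z)
    # fold back to [batch_size * heads, length, length or depth] (assumption)
    return xy_matmul + x_tz_matmul.transpose(0, 1)

With transpose=True, x carries queries (last dimension is depth) and the result gives attention logits; with transpose=False, x carries attention weights and the result gives the position-aware weighted values, matching the docstring's note on when transpose should be true.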