Emmay / WMT19-1.0.14
Commit a440c641, authored Feb 18, 2019 by libei
add transformer_dla.py to support Dynamic Layer Aggregation training
parent 90010cd3
Showing 2 changed files with 18 additions and 7 deletions
.idea/workspace.xml  +5 -5
tensor2tensor/models/transformer_dla.py  +13 -2
.idea/workspace.xml
View file @ a440c641
@@ -3,7 +3,6 @@
   <component name="ChangeListManager">
     <list default="true" id="7d6d9926-f879-4708-ad8e-442bac96b62a" name="Default" comment="">
       <change beforePath="$PROJECT_DIR$/.idea/workspace.xml" afterPath="$PROJECT_DIR$/.idea/workspace.xml" />
       <change beforePath="$PROJECT_DIR$/tensor2tensor/models/transformer.py" afterPath="$PROJECT_DIR$/tensor2tensor/models/transformer.py" />
       <change beforePath="$PROJECT_DIR$/tensor2tensor/models/transformer_dla.py" afterPath="$PROJECT_DIR$/tensor2tensor/models/transformer_dla.py" />
     </list>
     <option name="EXCLUDED_CONVERTED_TO_IGNORED" value="true" />
@@ -66,8 +65,8 @@
       <file leaf-file-name="transformer_dla.py" pinned="false" current-in-tab="true">
         <entry file="file://$PROJECT_DIR$/tensor2tensor/models/transformer_dla.py">
           <provider selected="true" editor-type-id="text-editor">
-            <state relative-caret-position="-1723">
-              <caret line="209" column="0" lean-forward="false" selection-start-line="209" selection-start-column="0" selection-end-line="209" selection-end-column="0" />
+            <state relative-caret-position="379">
+              <caret line="239" column="5" lean-forward="true" selection-start-line="239" selection-start-column="5" selection-end-line="239" selection-end-column="5" />
               <folding>
                 <element signature="e#738#776#0" expanded="true" />
               </folding>
@@ -220,6 +219,7 @@
   </component>
   <component name="ToolWindowManager">
     <frame x="-8" y="-8" width="1936" height="1056" extended-state="7" />
     <editor active="true" />
     <layout>
       <window_info id="TODO" active="false" anchor="bottom" auto_hide="false" internal_type="DOCKED" type="DOCKED" visible="false" show_stripe_button="true" weight="0.33" sideWeight="0.5" order="11" side_tool="false" content_ui="tabs" />
       <window_info id="Event Log" active="false" anchor="bottom" auto_hide="false" internal_type="DOCKED" type="DOCKED" visible="false" show_stripe_button="true" weight="0.33" sideWeight="0.5" order="0" side_tool="true" content_ui="tabs" />
@@ -422,8 +422,8 @@
     </entry>
     <entry file="file://$PROJECT_DIR$/tensor2tensor/models/transformer_dla.py">
       <provider selected="true" editor-type-id="text-editor">
-        <state relative-caret-position="-1723">
-          <caret line="209" column="0" lean-forward="false" selection-start-line="209" selection-start-column="0" selection-end-line="209" selection-end-column="0" />
+        <state relative-caret-position="379">
+          <caret line="239" column="5" lean-forward="true" selection-start-line="239" selection-start-column="5" selection-end-line="239" selection-end-column="5" />
           <folding>
             <element signature="e#738#776#0" expanded="true" />
           </folding>
tensor2tensor/models/transformer_dla.py
View file @ a440c641
@@ -172,7 +172,6 @@ def transformer_encoder(encoder_input,
       encoder_layer.add(x)
     for layer in xrange(hparams.encoder_layers):
       with tf.variable_scope("layer_%d" % layer):
         #self-attention network
         residual = x
         x = may_be_layernorm(x, hparams, before=True)
@@ -205,9 +204,14 @@ def transformer_encoder(encoder_input,
                            broadcast_dims=residual_dropout_broadcast_dims)
         x = residual + x
         x = may_be_layernorm(x, hparams, after=True)
       # add the layer output to the history for dynamic layer aggregation
       with tf.variable_scope("layer_history"):
         encoder_layer.add(x)
         x = encoder_layer.pop()
     # if normalize_before is used, the final output must also be normalized
     if hparams.normalize_before:
       x = may_be_layernorm(x, hparams, before=True, name="norm_top")
   return x
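The `add`/`pop` pattern in the hunk above can be illustrated with a small NumPy sketch. The definition of `encoder_layer` is not part of this diff, so the class below is a hypothetical stand-in: it assumes `add()` records a layer output and `pop()` returns a weighted sum over everything recorded so far, with weights that would be learned during training (here they are fixed to a uniform average). The real class may differ, for example by normalizing each stored output.

```python
import numpy as np


class LayerHistorySketch:
    """Hypothetical stand-in for `encoder_layer` / `decoder_layer`.

    Assumption: add() stores a layer output, pop() aggregates the history.
    """

    def __init__(self, max_entries):
        # Row i holds the weights used once i + 1 outputs are stored.
        # Initialised to a uniform average; a real model would learn them.
        weights = np.tril(np.ones((max_entries, max_entries)))
        self.weights = weights / weights.sum(axis=1, keepdims=True)
        self.outputs = []

    def add(self, x):
        # Record one layer output (or the embedding) in the history.
        self.outputs.append(np.asarray(x, dtype=float))

    def pop(self):
        # Weighted sum over all outputs recorded so far.
        n = len(self.outputs)
        stacked = np.stack(self.outputs)   # shape (n, batch, hidden)
        w = self.weights[n - 1, :n]        # shape (n,)
        return np.tensordot(w, stacked, axes=1)


history = LayerHistorySketch(max_entries=3)
history.add(np.full((2, 4), 1.0))   # e.g. the embedding output
history.add(np.full((2, 4), 3.0))   # e.g. the output of layer 0
aggregated = history.pop()          # input that the next layer would receive
```

With this reading, each layer consumes an aggregate of all earlier outputs rather than only the output of the layer directly below it.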
@@ -246,6 +250,8 @@ def transformer_decoder(decoder_input,
   # Summaries don't work in multi-problem setting yet.
   summaries = "problems" not in hparams.values() or len(hparams.problems) == 1
   with tf.variable_scope(name):
     if hparams.use_emb:
       decoder_layer.add(x)
     for layer in xrange(hparams.decoder_layers):
       with tf.variable_scope("layer_%d" % layer):
         # self-attention network
@@ -300,6 +306,11 @@ def transformer_decoder(decoder_input,
                            broadcast_dims=residual_dropout_broadcast_dims)
         x = residual + x
         x = may_be_layernorm(x, hparams, after=True)
       # add the layer output to the history for dynamic layer aggregation
       with tf.variable_scope("layer_history"):
         decoder_layer.add(x)
         x = decoder_layer.pop()
     # if normalize_before is used, the final output must also be normalized
     if hparams.normalize_before:
       x = may_be_layernorm(x, hparams, before=True, name="norm_top")
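The `norm_top` comment in the hunk above can be checked numerically. The sketch below is a simplified illustration, not the code from `transformer_dla.py`: `layer_norm` omits the learned scale and bias, and `sublayer` is a hypothetical placeholder for attention or feed-forward. It shows that when normalization is applied only before each sublayer, the residual stream leaving the last layer is never normalized unless a final normalization is added.

```python
import numpy as np


def layer_norm(x, eps=1e-6):
    # Plain layer normalization over the last axis (no learned scale/bias).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def sublayer(x):
    # Hypothetical placeholder for a self-attention or feed-forward sublayer.
    return 2.0 * x


def pre_norm_stack(x, num_layers, final_norm):
    # "Normalize before": only the sublayer input is normalized,
    # the residual branch carries the raw activations forward.
    for _ in range(num_layers):
        x = x + sublayer(layer_norm(x))
    return layer_norm(x) if final_norm else x


rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))
raw = pre_norm_stack(x, num_layers=6, final_norm=False)
normed = pre_norm_stack(x, num_layers=6, final_norm=True)

print("variance without final norm:", raw.var(axis=-1))
print("variance with final norm:   ", normed.var(axis=-1))
```

In this toy setting the unnormalized output grows with depth, while the output with the extra normalization keeps unit variance per position.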