Fairseq-S2T commit 4f679c86
authored Mar 18, 2022 by xuchen

fix the bugs

parent b970c7df

Showing 4 changed files with 39 additions and 37 deletions:
egs/libri_trans/asr/conf/debug.yaml                   +7  -29
egs/mustc/asr/conf/mixup.yaml                         +4   -0
fairseq/models/speech_to_text/pdss2t_transformer.py  +27   -7
fairseq/models/speech_to_text/s2t_sate.py             +1   -1
egs/libri_trans/asr/conf/debug.yaml

-arch: s2t_dual
-asr-encoder: sate
-mt-encoder-layers: 6
-mt-encoder: transformer
-encoder-drop-net: True
-encoder-drop-net-prob: 0.8
-encoder-embed-dim: 256
-pds-stages: 4
-#ctc-layer: 12
-pds-layers: 3_3_3_3
-pds-ratios: 2_2_1_2
-pds-fusion: True
-pds-fusion-method: all_conv
-pds-embed-dims: 256_256_256_256
-pds-ds-method: conv
-pds-embed-norm: True
-pds-position-embed: 1_1_1_1
-pds-kernel-sizes: 5_5_5_5
-pds-ffn-ratios: 8_8_8_8
-pds-attn-heads: 4_4_4_4
+arch: pdss2t_transformer_s_8
+ctc-layer: 12
+inter_mixup: True
+inter_mixup_layer: 0
+inter_mixup_ratio: 0.2
 share-decoder-input-output-embed: True
 optimizer: adam
@@ -30,8 +15,7 @@ warmup-updates: 10000
 lr: 2e-3
 adam_betas: (0.9,0.98)
-criterion: join_speech_and_text_loss
 ctc-weight: 0.3
+criterion: label_smoothed_cross_entropy_with_ctc
 label_smoothing: 0.1
 dropout: 0.1
@@ -44,8 +28,3 @@ encoder-attention-heads: 4
 decoder-embed-dim: 256
 decoder-ffn-embed-dim: 2048
 decoder-attention-heads: 4
-#load-pretrained-encoder-from:
-#load-pretrained-asr-encoder-from: /home/xuchen/st/checkpoints/mustc/asr/0225_st_purectc_pds_base_8_baseline_topctc/avg_10_checkpoint.pt
-#load-pretrained-mt-encoder-from: /home/xuchen/st/checkpoints/mustc/mt/0223_st_small_baseline/avg_10_checkpoint.pt
-#load-pretrained-decoder-from: /home/xuchen/st/checkpoints/mustc/mt/0223_st_small_baseline/avg_10_checkpoint.pt
\ No newline at end of file
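
The deleted block spelled out the PDS settings; in the new file only the arch name carries them. The underscore-separated pds-ratios value gives each stage's temporal downsampling, and the product of the stages is the encoder's overall rate, which is presumably what the _8 suffix in pdss2t_transformer_s_8 denotes. A minimal sketch of that arithmetic (the helper is hypothetical, not part of the repo):

    def total_downsampling(pds_ratios: str) -> int:
        """Product of per-stage ratios in an underscore-separated string."""
        result = 1
        for r in pds_ratios.split("_"):
            result *= int(r)
        return result

    # 2 * 2 * 1 * 2 = 8, matching the _8 in pdss2t_transformer_s_8
    assert total_downsampling("2_2_1_2") == 8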
egs/mustc/asr/conf/mixup.yaml (new file, mode 100644)

+inter_mixup: True
+inter_mixup_layer: 0
+inter_mixup_ratio: 0.2
\ No newline at end of file
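
The new config enables mixup at encoder layer 0 for 20% of each batch. The interpolation itself is not in this diff; assuming the standard mixup formulation with the Beta distribution that pdss2t_transformer.py sets up below, a self-contained sketch with assumed shapes and β:

    import torch
    from torch.distributions import Beta

    beta = Beta(torch.Tensor([0.5]), torch.Tensor([0.5]))  # beta value assumed
    coef = beta.sample()                  # lambda ~ Beta(beta, beta)
    x1 = torch.randn(10, 4, 256)          # (time, batch, dim) -- shapes assumed
    x2 = torch.randn(10, 4, 256)          # a permuted pairing of the same batch
    mixed = coef * x1 + (1 - coef) * x2   # interpolated training examples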
fairseq/models/speech_to_text/pdss2t_transformer.py

@@ -609,7 +609,13 @@ class PDSS2TTransformerModel(S2TTransformerModel):
             "--inter-mixup-prob",
             default=1,
             type=float,
-            help="the probability to apply mixup",
+            help="the probability for mixup",
         )
+        parser.add_argument(
+            "--inter-mixup-ratio",
+            default=1,
+            type=float,
+            help="the ratio for mixup",
+        )
         pass
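
A standalone sketch of the new option using plain argparse (Fairseq registers it through the model's add_args, but parsing behaves the same). Note that argparse does not apply type= to a non-string default, so an untouched default stays the int 1; the float() casts added to the encoder below normalize this:

    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--inter-mixup-prob", default=1, type=float,
                        help="the probability for mixup")
    parser.add_argument("--inter-mixup-ratio", default=1, type=float,
                        help="the ratio for mixup")

    args = parser.parse_args(["--inter-mixup-ratio", "0.2"])
    print(args.inter_mixup_prob, args.inter_mixup_ratio)  # -> 1 0.2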
@@ -905,12 +911,16 @@ class PDSS2TTransformerEncoder(FairseqEncoder):
         # mixup
         self.mixup = getattr(args, "inter_mixup", False)
         if self.mixup:
-            self.mixup_layer = args.inter_mixup_layer
-            self.mixup_prob = getattr(args, "inter_mixup_prob", 1.0)
-            beta = args.inter_mixup_beta
+            self.mixup_layer = int(args.inter_mixup_layer)
+            self.mixup_prob = float(getattr(args, "inter_mixup_prob", 1.0))
+            self.mixup_ratio = float(getattr(args, "inter_mixup_ratio", 1.0))
+            beta = float(args.inter_mixup_beta)
             from torch.distributions import Beta
             self.beta = Beta(torch.Tensor([beta]), torch.Tensor([beta]))
-            logger.info("Use mixup in layer %d with beta %f." % (self.mixup_layer, beta))
+            logger.info("Use mixup in layer %d with beta %.2f, prob %.2f, ratio %.2f."
+                        % (self.mixup_layer, beta, self.mixup_prob, self.mixup_ratio))

         # gather cosine similarity
         self.gather_cos_sim = getattr(args, "gather_cos_sim", False)
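
The symmetric Beta(β, β) prior controls how aggressive the interpolation is: β < 1 pushes λ toward 0 or 1 (one sequence dominates), while larger β concentrates λ near 0.5. A quick empirical check, independent of the model code:

    import torch
    from torch.distributions import Beta

    for b in (0.2, 1.0, 5.0):
        dist = Beta(torch.Tensor([b]), torch.Tensor([b]))
        samples = dist.sample((10000,))
        # fraction of draws in the middle band, where real mixing happens
        mid = ((samples > 0.25) & (samples < 0.75)).float().mean().item()
        print(f"beta={b}: P(0.25 < lambda < 0.75) ~= {mid:.2f}")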
@@ -938,10 +948,20 @@ class PDSS2TTransformerEncoder(FairseqEncoder):
     def apply_mixup(self, x, encoder_padding_mask):
         batch = x.size(1)
         indices = np.random.permutation(batch)
-        if len(indices) % 2 != 0:
-            indices = np.append(indices, (indices[-1]))
-        idx1 = torch.from_numpy(indices[0::2]).to(x.device)
-        idx2 = torch.from_numpy(indices[1::2]).to(x.device)
+        if self.mixup_ratio == 1:
+            if len(indices) % 2 != 0:
+                indices = np.append(indices, (indices[-1]))
+            idx1 = indices[0::2]
+            idx2 = indices[1::2]
+        else:
+            mix_size = int(max(2, batch * self.mixup_ratio // 2 * 2))
+            mix_indices = indices[:mix_size]
+            idx1 = np.append(mix_indices[0::2], (indices[mix_size:]))
+            idx2 = np.append(mix_indices[1::2], (indices[mix_size:]))
+
+        idx1 = torch.from_numpy(idx1).to(x.device)
+        idx2 = torch.from_numpy(idx2).to(x.device)

         x1 = x[:, idx1]
         x2 = x[:, idx2]
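
A runnable trace of the new index pairing, with assumed values (batch = 8, mixup_ratio = 0.5, seeded RNG): only the first mix_size shuffled indices form mixing pairs; the remainder is appended to both index lists, so those sequences pair with themselves and pass through the interpolation unchanged.

    import numpy as np

    batch, mixup_ratio = 8, 0.5
    rng = np.random.default_rng(0)            # seeded stand-in for np.random
    indices = rng.permutation(batch)
    mix_size = int(max(2, batch * mixup_ratio // 2 * 2))  # even, at least 2
    mix_indices = indices[:mix_size]          # these get mixed pairwise
    idx1 = np.append(mix_indices[0::2], indices[mix_size:])
    idx2 = np.append(mix_indices[1::2], indices[mix_size:])
    # where idx1[k] == idx2[k], mixing a sequence with itself is a no-op
    print(indices, idx1, idx2)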
fairseq/models/speech_to_text/s2t_sate.py

@@ -107,7 +107,7 @@ class S2TSATEModel(S2TTransformerModel):
             help="ctc layer for target sentence",
         )
         parser.add_argument(
-            "--target-intermedia-ctc-layer",
+            "--target-intermedia-ctc-layers",
             default=None,
             type=str,
             help="intermedia ctc layers for target sentence",
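
The change pluralizes the flag name to match its help text, which describes multiple layers in one string; how the string is split is outside this hunk. A hypothetical parser, assuming a comma-separated list:

    def parse_intermedia_ctc_layers(value):
        # Hypothetical helper: "2,4" -> [2, 4]; None (the default) -> [].
        if value is None:
            return []
        return [int(v) for v in value.split(",")]

    assert parse_intermedia_ctc_layers("2,4") == [2, 4]
    assert parse_intermedia_ctc_layers(None) == []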