xuchen / S2T · Commits

Commit 3f269efe authored Sep 28, 2023 by xuchen

Commit message: yaml

parent 504e81af
Showing 20 changed files with 431 additions and 22 deletions (+431 −22)
egs/aishell/asr/conf/ipa.yaml (+0 −0)
egs/aishell/asr/conf/local_attn.yaml (+0 −5)
egs/librispeech/asr/conf/100h.yaml (+3 −0)
egs/librispeech/asr/conf/basis.yaml (+1 −0)
egs/librispeech/asr/conf/ipa.yaml (+0 −0)
egs/librispeech/asr/conf/purectc.yaml (+2 −0)
egs/librispeech/asr/conf/purectc_inter.yaml (+11 −1)
egs/librispeech/asr/conf/reproduction_bil_ctc_syn.yaml (+69 −0)
egs/librispeech/asr/conf/reproduction_encdec_aipa_kd.yaml (+79 −0)
egs/librispeech/asr/conf/reproduction_encdec_aipa_kd_woiploss.yaml (+79 −0)
egs/librispeech/asr/conf/reproduction_purectc_aipa_kd.yaml (+73 −0)
egs/librispeech/asr/conf/reproduction_purectc_aipa_kd_woiploss.yaml (+73 −0)
egs/librispeech/asr/run.sh (+12 −6)
egs/mustc/asr/conf/ipa.yaml (+0 −0)
egs/mustc/st/conf/aipa_kd.yaml (+6 −5)
egs/mustc/st/conf/ipa.yaml (+0 −0)
egs/mustc/st/conf/reproduction_ctc_aug.yaml (+1 −0)
egs/mustc/st/conf/reproduction_nast.yaml (+2 −0)
fairseq/tasks/speech_to_text.py (+13 −5)
setup.py (+7 −0)
egs/aishell/asr/conf/mixup.yaml → egs/aishell/asr/conf/ipa.yaml
File moved
egs/aishell/asr/conf/local_attn.yaml deleted (100644 → 0)

-encoder-attention-type: local
-hard-mask-window: 0
-gauss-mask-sigma: 3
-init-mask-weight: 0
\ No newline at end of file
egs/librispeech/asr/conf/100h.yaml new file (0 → 100644)

+train-subset: train-clean-100
+lr: 0.001
\ No newline at end of file
egs/librispeech/asr/conf/basis.yaml

...
@@ -5,6 +5,7 @@ max-epoch: 300
 max-update: 300000
 patience: 20
 post-process: sentencepiece
+weight-decay: 1e-4
 # best-checkpoint-metric: loss
 # maximize-best-checkpoint-metric: False
...
egs/librispeech/asr/conf/mixup.yaml → egs/librispeech/asr/conf/ipa.yaml
File moved
egs/librispeech/asr/conf/purectc.yaml

 arch: s2t_ctc
 encoder-type: transformer
 optimizer: adam
 clip-norm: 10.0
 lr-scheduler: inverse_sqrt
...
egs/librispeech/asr/conf/purectc_base.yaml → egs/librispeech/asr/conf/purectc_inter.yaml

...
@@ -6,12 +6,16 @@ clip-norm: 10.0
 lr-scheduler: inverse_sqrt
 warmup-init-lr: 1e-7
 warmup-updates: 10000
-lr: 0.002
+lr: 2e-3
 adam_betas: (0.9,0.98)
 criterion: ctc
 zero_infinity: True
 ctc-weight: 1.0
 encoder-normalize-before: True
 decoder-normalize-before: True
 subsampling-type: conv1d
 subsampling-layers: 2
 subsampling-filter: 1024
...
@@ -26,3 +30,8 @@ encoder-embed-dim: 256
 encoder-ffn-embed-dim: 2048
 encoder-layers: 18
 encoder-attention-heads: 4
+# InterCTC
+inter-ctc-weight: 1.0
+inter-ctc-layers: 6,9,12,15
+share-inter-ctc: True
\ No newline at end of file
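The InterCTC block above attaches auxiliary CTC losses to intermediate encoder layers (6, 9, 12, 15) and scales them by inter-ctc-weight. A minimal sketch of how such weights typically combine into one training loss; the helper name and the averaging over layers are assumptions for illustration, not this repo's exact implementation:

```python
def combine_ctc_losses(final_ctc, inter_ctc_losses,
                       ctc_weight=1.0, inter_ctc_weight=1.0):
    """Weighted sum of the top-layer CTC loss and the mean of the
    intermediate-layer CTC losses (a common InterCTC formulation)."""
    inter = sum(inter_ctc_losses) / len(inter_ctc_losses)
    return ctc_weight * final_ctc + inter_ctc_weight * inter

# e.g. per-layer losses from layers 6, 9, 12, 15 plus the final layer
loss = combine_ctc_losses(2.0, [4.0, 3.0, 2.0, 1.0])  # → 4.5
```

With ctc-weight: 1.0 and inter-ctc-weight: 1.0 as in this config, the two terms contribute equally.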
egs/librispeech/asr/conf/reproduction_bil_ctc_syn.yaml new file (0 → 100644)

arch: s2t_transformer_s
share-decoder-input-output-embed: True
optimizer: adam
clip-norm: 10.0
lr-scheduler: inverse_sqrt
warmup-init-lr: 1e-7
warmup-updates: 10000
lr: 2e-3
adam_betas: (0.9,0.98)
weight-decay: 1e-4
criterion: label_smoothed_cross_entropy_with_ctc
label_smoothing: 0.1
subsampling-type: conv1d
subsampling-layers: 2
subsampling-filter: 1024
subsampling-kernel: 5
subsampling-stride: 2
subsampling-norm: none
subsampling-activation: glu
dropout: 0.1
activation-fn: relu
encoder-embed-dim: 256
encoder-ffn-embed-dim: 2048
encoder-layers: 18
decoder-layers: 6
encoder-attention-heads: 4
decoder-embed-dim: 256
decoder-ffn-embed-dim: 2048
decoder-attention-heads: 4
attention-dropout: 0.1
activation-dropout: 0.1
#load-pretrained-encoder-from:
#load-pretrained-decoder-from:

# Conformer
macaron-style: True
use-cnn-module: True
cnn-module-kernel: 15
encoder-attention-type: rel_pos
encoder-activation-fn: swish

# Bilingual CTC
share-ctc-and-embed: True
share-xctc-and-embed: True
ctc-weight: 0.05
xctc-weight: 0.2

# InterCTC
inter-ctc-weight: 0.025
inter-ctc-layers: 6,9,12,15
share-inter-ctc: True
inter-xctc-weight: 0.1
inter-xctc-layers: 6,9,12,15

# Prediction-aware encoding
ctc-pae: inter_league
xctc-pae: inter_league
pae-unnorm-input: True

# Curriculum learning mixing
xctc-pae-ground-truth-ratio: 0.1
xctc-pae-ground-truth-only-mistake: True
pae-oracle-smooth: True
\ No newline at end of file
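The optimizer block shared by these configs (warmup-init-lr: 1e-7, warmup-updates: 10000, lr: 2e-3, lr-scheduler: inverse_sqrt) is fairseq's inverse-square-root schedule: linear warmup from the init LR to the peak, then decay proportional to 1/√step. A sketch of the shape of that schedule, not fairseq's exact implementation:

```python
def inverse_sqrt_lr(step, peak_lr=2e-3, warmup_init_lr=1e-7,
                    warmup_updates=10000):
    """Inverse-sqrt LR schedule: linear warmup, then 1/sqrt(step) decay."""
    if step < warmup_updates:
        # linear warmup from warmup_init_lr up to peak_lr
        return warmup_init_lr + (peak_lr - warmup_init_lr) * step / warmup_updates
    # after warmup: decay proportional to the inverse square root of the step
    return peak_lr * (warmup_updates / step) ** 0.5

# LR peaks at 2e-3 at step 10000, then halves by step 40000
lrs = [inverse_sqrt_lr(s) for s in (0, 10000, 40000)]
```

At step 40000 the factor is √(10000/40000) = 0.5, so the LR has decayed to 1e-3.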
egs/librispeech/asr/conf/reproduction_encdec_aipa_kd.yaml new file (0 → 100644)

arch: s2t_transformer_s
share-decoder-input-output-embed: True
optimizer: adam
clip-norm: 10.0
lr-scheduler: inverse_sqrt
warmup-init-lr: 1e-7
warmup-updates: 10000
lr: 2e-3
adam_betas: (0.9,0.98)
criterion: label_smoothed_cross_entropy_with_ctc
label_smoothing: 0.1
subsampling-type: conv1d
subsampling-layers: 2
subsampling-filter: 1024
subsampling-kernel: 5
subsampling-stride: 2
subsampling-norm: none
subsampling-activation: glu
dropout: 0.1
activation-fn: relu
encoder-embed-dim: 256
encoder-ffn-embed-dim: 2048
encoder-layers: 12
decoder-layers: 6
encoder-attention-heads: 4
decoder-embed-dim: 256
decoder-ffn-embed-dim: 2048
decoder-attention-heads: 4
attention-dropout: 0.1
activation-dropout: 0.1
#load-pretrained-encoder-from:
#load-pretrained-decoder-from:

# Append-based Interpolation Augmentation
inter-mixup: True
inter-mixup-layer: -1
inter-mixup-decoder-layer: 0
inter-mixup-prob: 1.0
inter-mixup-ratio: 1.0
inter-mixup-beta: 0.2
inter-mixup-keep-org: True
inter-mixup-decoder-emb: True
cal-mixup-loss: True
no-specaugment: False
layer-out-norm: False
inter-mixup-ratio-decay: False
inter-mixup-ratio-decay-params: 20000,40000,0

# MTL
ctc-weight: 0.3
inter-ctc-weight: 0.2
inter-ctc-layers: 6,9
share-inter-ctc: True
share-ctc-and-embed: True
ctc-pae: inter_league
pae-unnorm-input: True
ctc-mixup-consistent-weight: 0.15
inter-ctc-mixup-consistent-weight: 0.1
mixup-consistent-weight: 0.5

# Conformer
macaron-style: True
use-cnn-module: True
cnn-module-kernel: 15
encoder-attention-type: rel_pos
encoder-activation-fn: swish
layer-padding-mask: True
\ No newline at end of file
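The inter-mixup block configures interpolation augmentation: pairs of utterances are mixed with a coefficient drawn from Beta(0.2, 0.2) (inter-mixup-beta), and inter-mixup-keep-org keeps the original samples alongside the mixed ones. A rough sketch of the interpolation step itself; the helper name is hypothetical and this is not this repo's implementation:

```python
import random

def mixup_pair(feat_a, feat_b, beta=0.2):
    """Interpolate two feature vectors with a coefficient drawn from
    Beta(beta, beta), as in mixup-style augmentation."""
    lam = random.betavariate(beta, beta)
    mixed = [lam * a + (1 - lam) * b for a, b in zip(feat_a, feat_b)]
    return mixed, lam

random.seed(0)
mixed, lam = mixup_pair([1.0, 2.0], [3.0, 4.0])
```

With beta = 0.2 the Beta distribution is U-shaped, so lam is usually close to 0 or 1 and most mixed samples stay near one of the originals.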
egs/librispeech/asr/conf/reproduction_encdec_aipa_kd_woiploss.yaml new file (0 → 100644)

arch: s2t_transformer_s
share-decoder-input-output-embed: True
optimizer: adam
clip-norm: 10.0
lr-scheduler: inverse_sqrt
warmup-init-lr: 1e-7
warmup-updates: 10000
lr: 2e-3
adam_betas: (0.9,0.98)
criterion: label_smoothed_cross_entropy_with_ctc
label_smoothing: 0.1
subsampling-type: conv1d
subsampling-layers: 2
subsampling-filter: 1024
subsampling-kernel: 5
subsampling-stride: 2
subsampling-norm: none
subsampling-activation: glu
dropout: 0.1
activation-fn: relu
encoder-embed-dim: 256
encoder-ffn-embed-dim: 2048
encoder-layers: 12
decoder-layers: 6
encoder-attention-heads: 4
decoder-embed-dim: 256
decoder-ffn-embed-dim: 2048
decoder-attention-heads: 4
attention-dropout: 0.1
activation-dropout: 0.1
#load-pretrained-encoder-from:
#load-pretrained-decoder-from:

# Append-based Interpolation Augmentation
inter-mixup: True
inter-mixup-layer: -1
inter-mixup-decoder-layer: 0
inter-mixup-prob: 1.0
inter-mixup-ratio: 1.0
inter-mixup-beta: 0.2
inter-mixup-keep-org: True
inter-mixup-decoder-emb: True
cal-mixup-loss: False
no-specaugment: False
layer-out-norm: False
inter-mixup-ratio-decay: False
inter-mixup-ratio-decay-params: 20000,40000,0

# MTL
ctc-weight: 0.3
inter-ctc-weight: 0.2
inter-ctc-layers: 6,9
share-inter-ctc: True
share-ctc-and-embed: True
ctc-pae: inter_league
pae-unnorm-input: True
ctc-mixup-consistent-weight: 0.15
inter-ctc-mixup-consistent-weight: 0.1
mixup-consistent-weight: 0.5

# Conformer
macaron-style: True
use-cnn-module: True
cnn-module-kernel: 15
encoder-attention-type: rel_pos
encoder-activation-fn: swish
layer-padding-mask: True
\ No newline at end of file
egs/librispeech/asr/conf/reproduction_purectc_aipa_kd.yaml new file (0 → 100644)

arch: s2t_ctc
encoder-type: transformer
optimizer: adam
clip-norm: 10.0
lr-scheduler: inverse_sqrt
warmup-init-lr: 1e-7
warmup-updates: 10000
lr: 2e-3
adam_betas: (0.9,0.98)
criterion: ctc
zero_infinity: True
ctc-weight: 1.0
encoder-normalize-before: True
decoder-normalize-before: True
subsampling-type: conv1d
subsampling-layers: 2
subsampling-filter: 1024
subsampling-kernel: 5
subsampling-stride: 2
subsampling-norm: none
subsampling-activation: glu
dropout: 0.1
activation-fn: relu
encoder-embed-dim: 256
encoder-ffn-embed-dim: 2048
encoder-layers: 18
encoder-attention-heads: 4

# Append-based Interpolation Augmentation
inter-mixup: True
inter-mixup-layer: -1
inter-mixup-decoder-layer: 0
inter-mixup-prob: 1.0
inter-mixup-ratio: 1.0
inter-mixup-beta: 0.2
inter-mixup-keep-org: True
inter-mixup-decoder-emb: True
cal-mixup-loss: True
no-specaugment: False
layer-out-norm: False
inter-mixup-ratio-decay: False
inter-mixup-ratio-decay-params: 20000,40000,0

# MTL
inter-ctc-weight: 1.0
inter-ctc-layers: 6,9,12,15
share-inter-ctc: True
share-ctc-and-embed: True
ctc-pae: inter_league
pae-unnorm-input: True
ctc-mixup-consistent-weight: 0.15
inter-ctc-mixup-consistent-weight: 0.1
mixup-consistent-weight: 0.5

# Conformer
macaron-style: True
use-cnn-module: True
cnn-module-kernel: 15
encoder-attention-type: rel_pos
encoder-activation-fn: swish
layer-padding-mask: True
\ No newline at end of file
egs/librispeech/asr/conf/reproduction_purectc_aipa_kd_woiploss.yaml new file (0 → 100644)

arch: s2t_ctc
encoder-type: transformer
optimizer: adam
clip-norm: 10.0
lr-scheduler: inverse_sqrt
warmup-init-lr: 1e-7
warmup-updates: 10000
lr: 2e-3
adam_betas: (0.9,0.98)
criterion: ctc
zero_infinity: True
ctc-weight: 1.0
encoder-normalize-before: True
decoder-normalize-before: True
subsampling-type: conv1d
subsampling-layers: 2
subsampling-filter: 1024
subsampling-kernel: 5
subsampling-stride: 2
subsampling-norm: none
subsampling-activation: glu
dropout: 0.1
activation-fn: relu
encoder-embed-dim: 256
encoder-ffn-embed-dim: 2048
encoder-layers: 18
encoder-attention-heads: 4

# Append-based Interpolation Augmentation
inter-mixup: True
inter-mixup-layer: -1
inter-mixup-decoder-layer: 0
inter-mixup-prob: 1.0
inter-mixup-ratio: 1.0
inter-mixup-beta: 0.2
inter-mixup-keep-org: True
inter-mixup-decoder-emb: True
cal-mixup-loss: False
no-specaugment: False
layer-out-norm: False
inter-mixup-ratio-decay: False
inter-mixup-ratio-decay-params: 20000,40000,0

# MTL
inter-ctc-weight: 1.0
inter-ctc-layers: 6,9,12,15
share-inter-ctc: True
share-ctc-and-embed: True
ctc-pae: inter_league
pae-unnorm-input: True
ctc-mixup-consistent-weight: 0.15
inter-ctc-mixup-consistent-weight: 0.1
mixup-consistent-weight: 0.5

# Conformer
macaron-style: True
use-cnn-module: True
cnn-module-kernel: 15
encoder-attention-type: rel_pos
encoder-activation-fn: swish
layer-padding-mask: True
\ No newline at end of file
egs/librispeech/asr/run.sh

...
@@ -192,11 +192,9 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
     cp -f ${pwd_dir}/`basename ${BASH_SOURCE[0]}` ${model_dir}
     cp -f ${pwd_dir}/train.sh ${model_dir}
-    extra_parameter="${extra_parameter} --train-config ${pwd_dir}/conf/basis.yaml"
-    cp -f ${pwd_dir}/conf/basis.yaml ${model_dir}
+    train_config=${train_config},basis
     config_list="${train_config//,/ }"
-    idx=1
+    idx=0
     for config in ${config_list[@]}
     do
         config_path=${pwd_dir}/conf/${config}.yaml
...
@@ -206,10 +204,18 @@ if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
         fi
         cp -f ${config_path} ${model_dir}
-        extra_parameter="${extra_parameter} --train-config${idx} ${config_path}"
+        if [[ $idx -eq 0 ]]; then
+            extra_parameter="${extra_parameter} --train-config ${config_path}"
+        else
+            extra_parameter="${extra_parameter} --train-config${idx} ${config_path}"
+        fi
         idx=$((idx + 1))
     done
+    #extra_parameter="${extra_parameter} --train-config${idx} ${pwd_dir}/conf/basis.yaml"
+    #cp -f ${pwd_dir}/conf/basis.yaml ${model_dir}
     cmd="python3 -u ${code_dir}/fairseq_cli/train.py
         ${data_dir}
...
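The run.sh change numbers the extra --train-config flags: the first config is passed as plain --train-config and each later one as --train-config1, --train-config2, and so on. The same indexing logic, sketched in Python for clarity (the helper name is hypothetical):

```python
def build_train_config_args(config_paths):
    """Mirror the shell loop: the first config gets --train-config,
    subsequent configs get --train-config<idx>."""
    args = []
    for idx, path in enumerate(config_paths):
        flag = "--train-config" if idx == 0 else f"--train-config{idx}"
        args += [flag, path]
    return args

# e.g. a run that stacks a task config on top of basis.yaml
args = build_train_config_args(["conf/purectc.yaml", "conf/basis.yaml"])
```

This matches how the script now appends ",basis" to train_config and lets the loop assign indices, instead of handling basis.yaml as a special case.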
egs/mustc/asr/conf/mixup.yaml → egs/mustc/asr/conf/ipa.yaml
File moved
egs/mustc/st/conf/aipa_kd.yaml

+# Append-based Interpolation Augmentation
 inter-mixup: True
 inter-mixup-layer: -1
...
@@ -6,12 +7,12 @@ inter-mixup-prob: 1.0
 inter-mixup-ratio: 1.0
 inter-mixup-beta: 0.2
-inter-mixup-keep-org: False
-inter-mixup-decoder-emb: False
+inter-mixup-keep-org: True
+inter-mixup-decoder-emb: True
-ctc-mixup-consistent-weight: 0
-inter-ctc-mixup-consistent-weight: 0
-mixup-consistent-weight: 0
+ctc-mixup-consistent-weight: 0.15
+inter-ctc-mixup-consistent-weight: 0.1
+mixup-consistent-weight: 0.5
 cal-mixup-loss: True
 no-specaugment: False
...
egs/mustc/st/conf/mixup.yaml → egs/mustc/st/conf/ipa.yaml
File moved
egs/mustc/st/conf/reproduction_ctc_aug.yaml

...
@@ -67,6 +67,7 @@ inter-xctc-layers: 4
 # Prediction-aware encoding
 ctc-pae: inter_league
 xctc-pae: inter_league
+pae-unnorm-input: True
 # Cross-layer attn
 xctc-cross-attn: True
...
egs/mustc/st/conf/reproduction_nast.yaml

...
@@ -5,6 +5,8 @@ criterion: ctc
 zero_infinity: True
 xctc-weight: 1.0
+ctc-weight: 1.0
+share-ctc-and-embed: True
 share-xctc-and-embed: True
 share-decoder-input-output-embed: True
 optimizer: adam
...
fairseq/tasks/speech_to_text.py

...
@@ -469,14 +469,22 @@ class SpeechToTextTask(LegacyFairseqTask):

             def compute_bleu(meters):
                 import inspect

-                import sacrebleu
-                fn_sig = inspect.getfullargspec(sacrebleu.compute_bleu)[0]
+                try:
+                    from sacrebleu.metrics import BLEU
+
+                    comp_bleu = BLEU.compute_bleu
+                except ImportError:
+                    # compatibility API for sacrebleu 1.x
+                    import sacrebleu
+
+                    comp_bleu = sacrebleu.compute_bleu
+
+                fn_sig = inspect.getfullargspec(comp_bleu)[0]
                 if "smooth_method" in fn_sig:
                     smooth = {"smooth_method": "exp"}
                 else:
                     smooth = {"smooth": "exp"}
-                bleu = sacrebleu.compute_bleu(
+                bleu = comp_bleu(
                     correct=meters["_bleu_counts"].sum,
                     total=meters["_bleu_totals"].sum,
                     sys_len=meters["_bleu_sys_len"].sum,
...
@@ -486,8 +494,8 @@ class SpeechToTextTask(LegacyFairseqTask):
                 return round(bleu.score, 2)

             metrics.log_derived("bleu", compute_bleu)
-        else:
-            metrics.log_scalar("bleu", 0)
+        # else:
+        #     metrics.log_scalar("bleu", 0)

     def build_generator(
         self,
...
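The try/except above adapts to sacrebleu's API move: 2.x exposes compute_bleu on sacrebleu.metrics.BLEU, while 1.x had a module-level sacrebleu.compute_bleu, and the smoothing kwarg was renamed from smooth to smooth_method. The same feature-detection pattern, demonstrated on a toy pair of functions (both functions here are hypothetical stand-ins, not sacrebleu):

```python
import inspect

def new_api(correct, total, smooth_method="exp"):
    # stand-in for the sacrebleu 2.x-style signature
    return f"new:{smooth_method}"

def old_api(correct, total, smooth="exp"):
    # stand-in for the sacrebleu 1.x-style signature
    return f"old:{smooth}"

def call_with_detected_kwarg(fn):
    """Pick the smoothing keyword by inspecting the callable's
    argument names, as the patched compute_bleu does."""
    fn_sig = inspect.getfullargspec(fn)[0]
    if "smooth_method" in fn_sig:
        return fn(1, 2, smooth_method="exp")
    return fn(1, 2, smooth="exp")
```

Inspecting the signature rather than checking a version string keeps the call working across both APIs without pinning a sacrebleu release.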
setup.py

...
@@ -198,6 +198,13 @@ def do_setup(package_data):
             "sacrebleu>=1.4.12",
             "torch",
             "tqdm",
+            "configargparse",
+            "matplotlib",
+            "scikit-learn",
+            "editdistance",
+            "espnet",
+            "torchaudio",
+            "pandas",
         ],
         dependency_links=dependency_links,
         packages=find_packages(
...