Name | Last commit | Last Update |
---|---|---|
.. | ||
base.yaml | ||
basis.yaml | ||
conformer.yaml | ||
ctc.yaml | ||
debug.yaml | ||
dlcl.yaml | ||
local_attn.yaml | ||
pds_base.yaml | ||
pds_base_16.yaml | ||
pds_base_32.yaml | ||
pds_base_8.yaml | ||
purectc.yaml | ||
rpr.yaml |
Some problems still confuse me:
1. Whether to scale in the input layer (I tried replacing the scaling with layer specification).
2. How exactly to set up weight sharing between the output projection matrix and the embedding matrix in the adapter (I noticed that inconsistent variance leads to bad results).
3. The biggest confusion: the variance increases layer by layer through the computation (I am not sure whether this is reasonable; I will compare against the behavior of the latest code).

Finally, implementation details matter a great deal for the final performance, even when the difference is subtle.
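
As a rough way to reason about the layer-by-layer variance growth mentioned above, here is a toy sketch. It is not the model in this repository. It assumes a plain residual stack with no normalization, where each sublayer output is replaced by independent unit-variance noise. The function name `residual_variances` and all parameters are hypothetical. Under that assumption the variances add, so growth with depth is expected rather than a sign of a bug.

```python
import random
import statistics


def residual_variances(num_layers=6, dim=256, seed=0):
    """Return the empirical variance of a toy residual stream after each layer.

    Assumption: each sublayer is modelled as independent N(0, 1) noise,
    which is only a stand-in for a real attention or feed-forward branch.
    """
    rng = random.Random(seed)
    # Initial "embedding" vector with roughly unit variance.
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    variances = [statistics.pvariance(x)]
    for _ in range(num_layers):
        # Hypothetical sublayer output, independent of the stream.
        branch = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Residual connection without normalization: x <- x + branch.
        x = [a + b for a, b in zip(x, branch)]
        variances.append(statistics.pvariance(x))
    return variances


if __name__ == "__main__":
    for layer, var in enumerate(residual_variances()):
        print(f"layer {layer}: variance {var:.2f}")
```

Logging the same per-layer statistic on the real model and comparing it with this baseline may help separate expected residual accumulation from an actual scaling mismatch, for example between the shared embedding and the output projection.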