| Name | Last commit | Last Update |
|---|---|---|
| .. | ||
| aishell/asr | ||
| iwslt14/mt | ||
| iwslt2022 | ||
| libri_trans | ||
| librispeech/asr | ||
| mustc | ||
| tibetan/asr | ||
| wav2vec | ||
| wmt16/mt | ||
| wmt20/mt | ||

Some problems still confuse me:

1. Whether to scale in the input layer (I tried replacing the scaling with layer normalization).
2. The detailed setting of weight sharing between the output projection matrix and the embedding matrix in the adapter (I noticed that inconsistent variance leads to bad results).
3. The biggest confusion: the variance increases layer by layer through the computation (I am not sure whether this phenomenon is reasonable; I will compare it against the behavior of the latest code).

Finally, implementation details matter a great deal for the final performance, even when the difference is subtle.
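One way to reason about the layer-by-layer variance growth is a toy residual stream. The sketch below is not taken from this repository: `residual_variances` and `layer_norm` are hypothetical helpers, and each sublayer output is replaced by independent Gaussian noise, which is a strong simplifying assumption. Under that assumption, the variance of `x + f(x)` grows roughly linearly with depth unless the stream is normalized, so some growth is not necessarily a bug.

```python
import random
import statistics


def residual_variances(num_layers=6, dim=4096, branch_std=1.0, seed=0):
    """Variance of a toy residual stream, measured after each layer.

    Assumption: every sublayer output is independent Gaussian noise with
    standard deviation `branch_std`. Real attention/FFN outputs are
    correlated with their input, so this is only a rough illustration.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    variances = [statistics.pvariance(x)]
    for _ in range(num_layers):
        # Residual connection: x <- x + f(x), with f(x) modeled as noise.
        x = [xi + rng.gauss(0.0, branch_std) for xi in x]
        variances.append(statistics.pvariance(x))
    return variances


def layer_norm(x, eps=1e-5):
    """Plain layer normalization without learned gain or bias."""
    mean = statistics.fmean(x)
    var = statistics.pvariance(x)
    return [(xi - mean) / (var + eps) ** 0.5 for xi in x]


if __name__ == "__main__":
    for depth, var in enumerate(residual_variances()):
        print(f"after layer {depth}: variance = {var:.2f}")
```

Applying `layer_norm` to the stream brings its variance back to about 1, which is one reason to compare where normalization sits (input layer, before or after each sublayer) when checking the behavior against the latest code.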