fairseq · 380d77940bf5593c74f1246ef7ca8f7aadb26ea3 · xuchen / Fairseq-S2T

I optimized the implementation of S2T. · 380d7794

Some points still confuse me:
1. Whether to apply scaling in the input layer (I tried replacing it with layer normalization);
2. The detailed setting of weight sharing between the output projection matrix and the embedding matrix in the adapter (I noticed that inconsistent variance leads to bad results);
3. Most confusing of all, the variance increases layer by layer through the computation (I am not sure whether this is reasonable; I will compare the behavior with the latest code).
Finally, implementation details matter a great deal for the final performance; even a subtle difference can change the results.
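
The variance questions in points 2 and 3 can be illustrated with a small toy simulation. This is only a sketch under stated assumptions, not the code in this commit: it assumes embeddings drawn from N(0, dim^-0.5) and an optional sqrt(dim) scale, and it models each sublayer output as independent unit-variance noise added to a residual stream. The function names `embed_variance` and `residual_variances` are hypothetical.

```python
import math
import random
import statistics


def embed_variance(dim, scale, n_tokens=200, seed=0):
    """Empirical variance of toy embeddings multiplied by `scale`.

    Assumption: entries are drawn i.i.d. from a normal distribution
    with standard deviation dim ** -0.5.
    """
    rng = random.Random(seed)
    std = dim ** -0.5
    values = [scale * rng.gauss(0.0, std) for _ in range(n_tokens * dim)]
    return statistics.pvariance(values)


def residual_variances(dim, n_layers, seed=0):
    """Variance of a toy residual stream x <- x + f(x) after each layer.

    Assumption: f(x) is replaced by independent unit-variance noise,
    so the variances simply add up from layer to layer.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    history = [statistics.pvariance(x)]
    for _ in range(n_layers):
        x = [xi + rng.gauss(0.0, 1.0) for xi in x]
        history.append(statistics.pvariance(x))
    return history


if __name__ == "__main__":
    dim = 256
    # Without scaling the variance is about 1 / dim; with sqrt(dim) it is about 1.
    print("unscaled:", embed_variance(dim, 1.0))
    print("scaled:  ", embed_variance(dim, math.sqrt(dim)))
    # The residual stream variance grows roughly linearly with depth.
    print("per layer:", residual_variances(4096, 6))
```

In this toy setting, a matrix shared between an input embedding (used with the scale) and an output projection (used without it) sees activations whose variances differ by a factor of dim, which is one possible reading of the "inconsistent variance" in point 2. The roughly linear growth of the residual variance holds only under the independence assumption; a normalization layer applied to the stream would reset it.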

committed May 24, 2022


Name
benchmark
clib
config
criterions
data
dataclass
distributed
logging
model_parallel
models
modules
optim
scoring
tasks
__init__.py
binarizer.py
checkpoint_utils.py
file_io.py
file_utils.py
hub_utils.py
incremental_decoding_utils.py
iterative_refinement_generator.py
nan_detector.py
ngram_repeat_block.py
options.py
pdb.py
quantization_utils.py
registry.py
search.py
sequence_generator.py
sequence_scorer.py
token_generation_constraints.py
tokenizer.py
trainer.py
utils.py
version.txt