common_attention.py
30.4 KB
support training deep dla model with relative position representation by setting attention_type = "relative_dot_product" · 66913b25
libei committed