transformer_dla.py
18.8 KB
Support training deep DLA models with relative position representation by setting attention_type = "relative_dot_product" · 66913b25
libei committed