How Many Layers and Why? An Analysis of the Model Depth in Transformers

Title: How Many Layers and Why? An Analysis of the Model Depth in Transformers
Publication Type: Conference paper
Conference Year: 2021
Authors: Simoulin, Antoine, and Benoît Crabbé
Conference Name: Association for Computational Linguistics (Student Research Workshop)
Conference Location: Bangkok, Thailand
URL: https://hal.archives-ouvertes.fr/hal-03601412