Simoulin, Antoine, and Benoît Crabbé. How Many Layers and Why? An Analysis of the Model Depth in Transformers. In
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop. Online: Association for Computational Linguistics, 2021.