Formal Aspects of Language Modeling

Published in arXiv, 2023

Large language models have become one of the most commonly deployed NLP inventions. In the past half-decade, their integration into core natural language processing tools has dramatically increased the performance of such tools, and they have entered the public discourse surrounding artificial intelligence. Consequently, it is important for developers and researchers alike to understand the mathematical foundations of large language models, as well as how to implement them. These notes accompany the theoretical portion of the ETH Zürich course on large language models and cover what constitutes a language model from a formal, theoretical perspective.

Download the book from arXiv: https://arxiv.org/abs/2311.04329

Citation BibTeX:

@book{cotterell2023formal,
      title={Formal Aspects of Language Modeling}, 
      author={Ryan Cotterell and Anej Svete and Clara Meister and Tianyu Liu and Li Du},
      year={2023},
      eprint={2311.04329},
      archivePrefix={arXiv},
      journal={arXiv preprint arXiv:2311.04329},
      url={https://arxiv.org/abs/2311.04329},
      primaryClass={cs.CL}
}