Hey, I’m Anej.1

I’m a fourth-year PhD fellow at the ETH AI Center, studying language models with formal language theory to understand what they can (and can’t) do.

I’m co-advised by Prof. Ryan Cotterell and Prof. Valentina Boeva. In 2025, I did a 9-month research internship at the Allen Institute for AI (Ai2), where I worked with Ashish Sabharwal and William Merrill on reasoning and problem-solving in language models. In 2026, I am visiting Noah’s ARK lab at the University of Washington, working with Prof. Noah Smith.

I also co-organize the Formal Languages and Neural Networks (FLaNN) Seminar.

Research Interests

  • Expressivity of neural networks: Which formal languages can transformers and RNNs represent and learn? I study this using circuit complexity, formal language theory, logic, and weighted automata.
  • Reasoning in language models: What happens computationally when models “think step by step”? Can we design more efficient ways for models to think?
  • Diffusion models for text and looped transformers: How can we leverage parallel computation to make language models faster and more efficient?

News & Upcoming

Apr 2026 · Presenting MDM Reasoning as an oral at ICLR 2026!
Feb 2026 · Invited talk on Diffusion Language Models: Problem Solving and Reasoning at the AI@JSI Seminar, Jozef Stefan Institute.
Feb 2026 · Selected as one of 22 notable alumni of the first 30 years of UL’s CS department.
Jul 2025 · Organizing a tutorial on The Underlying Logic of Language Models at ICML 2025.
Jun 2025 · A Tale of Two Sides and Information Locality accepted as orals at ACL 2025! Information Locality was also selected for a panel discussion on the role of linguistics in LLM research (one of 5 papers).
Aug 2024 · Organizing a tutorial on Computational Expressivity of Neural Language Models at ACL 2024.
Jul 2023 · Lectured a course on Language Models and Formal Language Theory at ESSLLI 2023.

Selected Publications

OLMo Hybrid: From Theory to Practice and Back
William Merrill, Yanhong Li, Tyler Romero, Anej Svete, Caia Costello, Pradeep Dasigi, Dirk Groeneveld, David Heineman, Bailey Kuehl, Nathan Lambert, Chuan Li, Kyle Lo, Saumya Malik, DJ Matusz, Benjamin Minixhofer, Jacob Morrison, Luca Soldaini, Finbarr Timbers, Pete Walsh, Noah A. Smith, Hannaneh Hajishirzi, Ashish Sabharwal · arXiv · [arXiv]
On the Reasoning Abilities of Masked Diffusion Language Models
Anej Svete, Ashish Sabharwal · ICLR 2026 · [arXiv]
Information Locality as an Inductive Bias for Neural Language Models
Oral at ACL 2025 · Panel: Role of Linguistics in LLM Research
Taiga Someya, Anej Svete, Brian DuSell, Timothy J. O'Donnell, Mario Giulianelli, Ryan Cotterell · ACL 2025 · [arXiv]
Unique Hard Attention: A Tale of Two Sides
Selim Jerad, Anej Svete, Jiaoda Li, Ryan Cotterell · ACL 2025 · [arXiv]
Gumbel Counterfactual Generation From Language Models
Shauli Ravfogel, Anej Svete, Vésteinn Snæbjarnarson, Ryan Cotterell · ICLR 2025 · [arXiv]
Training Neural Networks as Recognizers of Formal Languages
Alexandra Butoi, Ghazal Khalighinejad, Anej Svete, Josef Valvoda, Ryan Cotterell, Brian DuSell · ICLR 2025 · [arXiv]
On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning
Franz Nowak, Anej Svete, Alexandra Butoi, Ryan Cotterell · ACL 2024 · [arXiv]
On Affine Homotopy between Language Encoders
Robin SM Chan, Reda Boumasmoud, Anej Svete, Yuxin Ren, Qipeng Guo, Zhijing Jin, Shauli Ravfogel, Mrinmaya Sachan, Bernhard Schölkopf, Mennatallah El-Assady, Ryan Cotterell · NeurIPS 2024 · [arXiv]
What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages
Nadav Borenstein, Anej Svete, Robin Chan, Josef Valvoda, Franz Nowak, Isabelle Augenstein, Eleanor Chodroff, Ryan Cotterell · ACL 2024 · [arXiv]
Transformers Can Represent n-gram Language Models
Anej Svete, Ryan Cotterell · NAACL 2024
Formal Aspects of Language Modeling
Ryan Cotterell, Anej Svete, Clara Meister, Tianyu Liu, Li Du · arXiv · [arXiv]

See all publications →

Outside of Research

I like reading, cooking, running, and hiking. I also spend an unreasonable amount of time on aquascaping—the art of designing underwater landscapes. It’s niche, but a lot of fun.

  1. The easiest way to pronounce my name is to imagine saying “an a” in American English. Not perfect, but close enough.