Algorithms for Acyclic Weighted Finite-State Automata with Failure Arcs
Published in EMNLP 2022, 2022
This work introduces novel algorithms for computing the pathsum of weighted finite-state automata with failure transitions.
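The quantity being computed here, the pathsum, is the semiring sum over all accepting paths of the product of arc weights. A minimal sketch in the real semiring, for an acyclic WFSA *without* failure arcs (the failure-arc handling is the paper's contribution and is omitted here), illustrates the definition; all function and variable names are illustrative, not the paper's.

```python
# Pathsum of an acyclic WFSA in the real semiring: sum over all accepting
# paths of the product of arc weights. States are assumed to be listed in
# topological order. Failure arcs (the paper's focus) are not handled here.

def pathsum(states, init, final, arcs):
    # states: list of states in topological order
    # init:   dict state -> initial weight
    # final:  dict state -> final weight
    # arcs:   dict state -> list of (symbol, target, weight)
    alpha = {q: init.get(q, 0.0) for q in states}  # forward values
    for q in states:
        for _sym, r, w in arcs.get(q, []):
            alpha[r] += alpha[q] * w
    return sum(alpha[q] * final.get(q, 0.0) for q in states)

# Tiny example: path 0 -> 1 -> 2 with weight 0.5 * 0.4, plus a
# direct arc 0 -> 2 with weight 0.3.
total = pathsum(
    states=[0, 1, 2],
    init={0: 1.0},
    final={2: 1.0},
    arcs={0: [("a", 1, 0.5), ("b", 2, 0.3)], 1: [("a", 2, 0.4)]},
)
# total == 0.5 * 0.4 + 0.3 == 0.5
```

A single pass in topological order suffices because acyclicity guarantees every arc's source is finalized before its target is read.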
Published in arXiv, 2023
We propose a formal definition of intrinsic information about a concept (feature) in a subspace of a language model's representation space, together with a counterfactual approach that avoids the failure mode of spurious correlations by treating components in the subspace and its orthogonal complement independently.
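The decomposition underlying this approach splits a representation into its component inside the concept subspace and its component in the orthogonal complement. A hedged sketch for a one-dimensional subspace (the function and variable names are illustrative, not the paper's):

```python
# Decompose a representation h into its component inside a 1-D concept
# subspace (spanned by a unit vector u) and the orthogonal complement.
# The counterfactual approach treats the two parts independently; this
# sketch shows only the decomposition itself.

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def project_decompose(h, u):
    # u is assumed to be a unit vector spanning the concept subspace
    coeff = dot(h, u)
    inside = [coeff * ui for ui in u]                  # component in subspace
    outside = [hi - ii for hi, ii in zip(h, inside)]  # orthogonal complement
    return inside, outside

h = [3.0, 4.0]
u = [1.0, 0.0]
inside, outside = project_decompose(h, u)
# inside == [3.0, 0.0], outside == [0.0, 4.0]; dot(inside, outside) == 0.0
```

By construction the two components are orthogonal and sum back to the original vector, which is what allows them to be manipulated independently.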
Published in arXiv, 2023
We review the space complexity of simulating finite-state automata with recurrent neural networks.
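A classic baseline in this line of work encodes the automaton's state as a one-hot vector and each input symbol as a transition matrix, which costs space linear in the number of states. A hedged sketch of that naive construction (names are illustrative):

```python
# One-hot simulation of a DFA: the hidden "state vector" has one dimension
# per automaton state, so this naive construction uses Theta(|Q|) space.
# An RNN realizing it would compute h' = M_sym @ h with one matrix per symbol.

def make_transition_matrices(n_states, delta):
    # delta: dict (state, symbol) -> next state
    mats = {}
    for (q, sym), r in delta.items():
        M = mats.setdefault(sym, [[0] * n_states for _ in range(n_states)])
        M[r][q] = 1  # column q has a single 1 in row r
    return mats

def step(M, h):
    return [sum(M[i][j] * h[j] for j in range(len(h))) for i in range(len(M))]

# DFA over {a, b} tracking the parity of a's: state 0 (even), state 1 (odd)
delta = {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}
mats = make_transition_matrices(2, delta)
h = [1, 0]  # start in state 0
for sym in "aab":
    h = step(mats[sym], h)
# h == [1, 0]: two a's return the automaton to the even state
```

The interesting question the review addresses is when an RNN can do better than this linear-in-|Q| encoding.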
Published in arXiv, 2023
Large language models have become one of the most commonly deployed NLP inventions. In the past half-decade, their integration into core natural language processing tools has dramatically increased the performance of such tools, and they have entered the public discourse surrounding artificial intelligence. Consequently, it is important for both developers and researchers alike to understand the mathematical foundations of large language models, as well as how to implement them. These notes are the accompaniment to the theoretical portion of the ETH Zürich course on large language models, covering what constitutes a language model from a formal, theoretical perspective.
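From the formal perspective the notes take, a language model is a probability distribution over finite strings. A toy unigram model with an end-of-string (EOS) symbol, sketched below under illustrative names, shows how string probabilities are defined and why EOS is what makes the probabilities of all finite strings sum to one:

```python
# A language model, formally, assigns a probability to every finite string.
# Toy unigram model with an end-of-string symbol:
#     p(w_1 ... w_n) = p(EOS) * prod_i p(w_i)

probs = {"a": 0.4, "b": 0.4, "EOS": 0.2}  # must sum to 1

def string_prob(s):
    p = probs["EOS"]
    for ch in s:
        p *= probs[ch]
    return p

# string_prob("ab") == 0.4 * 0.4 * 0.2 == 0.032
# Summing over all strings of every length n gives
#     0.2 * sum_n 0.8**n = 1,
# so the model is a proper distribution over finite strings.
```

Dropping EOS (or setting its probability to zero) would leak all probability mass to infinite sequences, which is one of the subtleties the formal treatment makes precise.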
Published in EMNLP 2023, 2023
This work investigates the computational expressivity of language models based on recurrent neural networks. We extend the Turing completeness result by Siegelmann and Sontag (1992) to the probabilistic case, showing how a rationally weighted RLM with unbounded computation time can simulate any probabilistic Turing machine (PTM).
Published in EMNLP 2023, 2023
We study what classes of such probability distributions RNN LMs can represent and show that simple RNNs are equivalent to a subclass of probabilistic finite-state automata, and can thus model a strict subset of probability distributions expressible by finite-state models.
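The comparison class here, probabilistic finite-state automata, assigns a string the sum over all paths reading it of the product of transition probabilities, times the probability of stopping. A hedged illustration of that model class (not the paper's construction; all names are illustrative):

```python
# Probability of a string under a probabilistic FSA (PFSA): sum over all
# paths reading the string of the product of transition probabilities,
# times the final (stopping) probability of the state reached.

def pfsa_string_prob(init, trans, final, s):
    # init:  dict state -> start probability
    # trans: dict (state, symbol) -> list of (next_state, probability)
    # final: dict state -> stopping probability
    alpha = dict(init)
    for sym in s:
        nxt = {}
        for q, w in alpha.items():
            for r, p in trans.get((q, sym), []):
                nxt[r] = nxt.get(r, 0.0) + w * p
        alpha = nxt
    return sum(w * final.get(q, 0.0) for q, w in alpha.items())

# One-state PFSA: emit "a" and stay with prob 0.5, or stop with prob 0.5
prob = pfsa_string_prob(
    init={0: 1.0},
    trans={(0, "a"): [(0, 0.5)]},
    final={0: 0.5},
    s="aa",
)
# prob == 0.5 * 0.5 * 0.5 == 0.125
```

The forward pass over `alpha` mirrors how a simple RNN's hidden state would track a distribution over automaton states, which is the intuition behind the equivalence result.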
Masters course, ETH Zurich, D-INFK, 2021
Masters course, ETH Zurich, D-INFK, 2022
Masters course, ETH Zurich, D-INFK, 2023