Ian Tenney

I am a Staff Research Scientist on the People + AI Research (PAIR) team at Google DeepMind. My group focuses on interpretability for large language models (LLMs), including visualization tools, attribution methods, and intrinsic analysis (a.k.a. BERTology) of model representations. Through these, we aim to answer questions like:

  • Why did a model make this particular prediction?
  • What kind of knowledge is stored in the parameters, and how is it represented and reasoned about?
  • How can we build our own mental models of how - and when - AI works?

Among other things, I am a co-creator of the Learning Interpretability Tool (LIT) and author of BERT Rediscovers the Classical NLP Pipeline.

Previously, I taught an NLP course at the UC Berkeley School of Information. In a past life I was a physicist, studying ultrafast molecular and optical physics in the lab of Philip H. Bucksbaum at Stanford / SLAC.

When I’m not behind a computer I enjoy hiking and photography; you can find some of it here.

Contact: "if" + lastname + "@gmail.com" (or @google.com)

news

Dec 13, 2024 New blog post & preprint on Scaling Training Data Attribution to understand what data an LLM learned from during open-domain pretraining. We’ve also released the dataset along with a web-based demo to explore influential examples for a variety of queries.
May 16, 2024 We’ve open-sourced LLM Comparator, a visualization tool to help LLM developers make sense of side-by-side evaluations. Learn more in our blog post and at goo.gle/llm-comparator, or jump in and try the in-browser demo.
Apr 15, 2024 New preprint! Interactive Prompt Debugging with Sequence Salience goes into more detail on the prompt debugging tool we previously released for Gemma. Sequence Salience now works for Mistral and Llama 2, and features a more in-depth tutorial at goo.gle/sequence-salience.

selected publications

  1. Scalable Influence and Fact Tracing for Large Language Model Pretraining
    Tyler A. Chang, Dheeraj Rajagopal, Tolga Bolukbasi, Lucas Dixon, and Ian Tenney
    arXiv preprint, 2024
  2. LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models
    Minsuk Kahng, Ian Tenney, Mahima Pushkarna, Michael Xieyang Liu, James Wexler, Emily Reif, Krystal Kallarackal, Minsuk Chang, Michael Terry, and Lucas Dixon
    IEEE Transactions on Visualization and Computer Graphics, 2024
  3. Simfluence: Modeling the Influence of Individual Training Examples by Simulating Training Runs
    Kelvin Guu, Albert Webson, Ellie Pavlick, Lucas Dixon, Ian Tenney, and Tolga Bolukbasi
    arXiv preprint, 2023
  4. The MultiBERTs: BERT Reproductions for Robustness Analysis
    Thibault Sellam, Steve Yadlowsky, Ian Tenney, Jason Wei, Naomi Saphra, Alexander D’Amour, Tal Linzen, Jasmijn Bastings, Iulia Turc, Jacob Eisenstein, Dipanjan Das, and Ellie Pavlick
    ICLR (spotlight), 2022
  5. The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models
    Ian Tenney, James Wexler, Jasmijn Bastings, Tolga Bolukbasi, Andy Coenen, Sebastian Gehrmann, Ellen Jiang, Mahima Pushkarna, Carey Radebaugh, Emily Reif, and Ann Yuan
    In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 2020
  6. BERT Rediscovers the Classical NLP Pipeline
    Ian Tenney, Dipanjan Das, and Ellie Pavlick
    In Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019
  7. What do you learn from context? Probing for sentence structure in contextualized word representations
    Ian Tenney, Patrick Xia, Berlin Chen, Alex Wang, Adam Poliak, R. Thomas McCoy, Najoung Kim, Benjamin Van Durme, Sam Bowman, Dipanjan Das, and Ellie Pavlick
    In International Conference on Learning Representations, 2019