And Nothing Between: Using Categorical Differences to Understand and Predict Model Behavior
Naomi Saphra (Harvard)
10:00-10:40
While years of scientific research on model training and scaling assume that learning is a gradual and continuous process, breakthroughs on specific capabilities have drawn wide attention. Why are breakthroughs so exciting? Because humans don’t naturally think in continuous gradients, but in discrete conceptual categories. If artificial language models naturally learn discrete conceptual categories, perhaps model understanding is within our grasp. I will describe what we know of categorical learning in language models, and how discrete concepts are identifiable through empirical training dynamics and through random variation between training runs. These concepts involve syntax learning, weight mechanisms, and interpretable patterns, all of which can predict model behavior. By leveraging categorical learning, we can ultimately understand a model's natural conceptual structure and evaluate our understanding through testable predictions.
What Do Vision and Vision-Language Models Really Know About the World?
Shiry Ginosar (TTIC)
10:40-11:20
Large pretrained vision and vision-language models never see the world directly. Instead, they learn from human-made artifacts: images, videos, and drawings that depict the world through our chosen lenses, perspectives, and biases. These filtered glimpses shape what models come to know about the world. But what kind of understanding do they build from such curated input? And how can we tell?
In this talk, I explore how we might evaluate the internal world models these systems construct. I propose a set of affordance-based criteria, drawing on Kenneth Craik's classic idea that a mental model should enable an organism to reason over prior experience, forecast future events, and generalize to novel situations. Using this lens, I will examine what pretrained models capture across a range of visual domains: human pose, visual summarization, visual forecasting, and visual analogy. Along the way, I will suggest methods for assessing these affordances, with an eye toward both understanding current models and guiding the development of more grounded ones.
Language Models as World Models?
Jacob Andreas (MIT)
11:20-12:00
The extent to which language modeling induces representations of the world described by text, and the broader question of what can be learned about meaning from text alone, have remained subjects of ongoing debate across NLP and the cognitive sciences. Some of these questions are terminological, and this talk will begin by trying to define a few different ways in which a predictor like a neural net might implicitly instantiate a world model. But the most important questions are empirical, so I'll conclude by describing a few pieces of evidence we have about how LMs represent the world described in their training data and the situations described in their input text.
Polymathic AI: Building Scientific Foundation Models
Shirley Ho (NYU/Flatiron)
1:00-1:40
Foundation models like GPT-4 have dramatically altered the modern work landscape for many industries reliant on language tasks, but no equivalent model exists yet for scientific applications. Incorporating foundation models into research workflows could enable unprecedented discoveries that connect traditionally distinct scientific subdisciplines. However, mainstream foundation models trained on human-scale datasets will be insufficient for analyzing most scientific phenomena: a foundation model for science will need to account for the particular requirements of scientific datasets, especially those with wide dynamic ranges.
In this talk, I will introduce the Polymathic AI initiative: Our goal is to accelerate the development of versatile foundation models tailored for numerical datasets and scientific machine learning tasks. The challenge we are undertaking is to build artificial intelligence (AI) models that leverage information from heterogeneous datasets and across different scientific fields, which, unlike domains such as natural language processing, do not share a unifying representation (i.e., text). Such models can then be used as strong baselines or further fine-tuned by scientists for specific applications. This approach has the potential to democratize AI in science by providing off-the-shelf models that have stronger priors (i.e., background knowledge) for shared general concepts such as causality, measurement, and signal processing, and even for more specialized shared concepts like wave-like behavior, which would otherwise need to be learned from scratch.
I will present our initial papers and projects, including “MultiModal Universe” and “The Well,” large scientific datasets designed for large-scale training.
Testing for Understanding Requires First Defining It
Sendhil Mullainathan (MIT)
1:40-2:20
We need to take a step back in how we assess algorithmic understanding. The benchmarks and probes we have are intuitive, interesting, and revealing. But without a rigorous foundation, it is hard to know how much confidence we should have in what they have found. To gain that confidence we must (i) formally define "understanding", (ii) from that formalism define a test, and (iii) prove conditions under which the test is valid. I will follow this procedure for three related notions of understanding: one for concepts, one for world models for a single task, and one for foundation models for many tasks. These three exercises highlight that many of our current tests are problematic, having a bias in favor of algorithms: they are too quick to conclude that models have understood the world.