Goodfire, a startup founded by researchers from OpenAI and Google DeepMind, has raised $50 million in Series A funding to advance its research in AI interpretability. The round was led by Menlo Ventures, with participation from Anthropic, Lightspeed Venture Partners, B Capital, Work-Bench, Wing, South Park Commons, and others.
The funding will accelerate the development of Ember, Goodfire’s core interpretability platform. Ember is designed to give developers direct, programmable access to the inner workings of neural networks, enabling them to inspect, control, and fine-tune model behavior from the inside out.
Where most AI models are treated as black boxes, Ember exposes the neural mechanisms behind them, making those systems easier to understand, steer, and align. The approach draws on mechanistic interpretability, a field that aims to decode the internal representations of large models, such as concepts, behaviors, and reasoning pathways.
According to CEO Eric Ho, “Our vision is to build tools to make neural networks easy to understand, design, and fix from the inside out. This technology is critical for building the next frontier of safe and powerful foundation models.”
Goodfire is also collaborating with leading model developers and plans to release research previews demonstrating interpretability techniques across language models, image processing, and scientific modeling.
As Anthropic CEO Dario Amodei put it, “Mechanistic interpretability is one of the best bets to transform black-box neural networks into understandable, steerable systems—a critical foundation for responsible AI development.”