
The Homework Problem
Google built an AI that does original research. The difference between that and everything before it is something every student already understands.
The Brief
This article explains why Google DeepMind's Aletheia marks a shift from AI that solves known problems to AI that conducts original research. It uses the homework-versus-research distinction to make the concept accessible, and examines why Aletheia's self-correcting architecture matters for business applications.
- What is Google DeepMind's Aletheia?
- Aletheia is an AI research agent built by Google DeepMind that conducts autonomous mathematical research. Powered by Gemini Deep Think, it uses a Generator-Verifier-Reviser loop to propose ideas, check its own work, and refine solutions. It produced the first fully AI-authored research paper in arithmetic geometry.
- Why is Aletheia considered autonomous research rather than just AI problem-solving?
- Previous AI achievements like passing bar exams involve finding known answers. Aletheia tackles open problems where no answer exists, chooses its own methods, checks its reasoning, and admits when it cannot solve something. DeepMind classified it as Level A2, essentially autonomous, on their research taxonomy.
- How does Aletheia's architecture work?
- Aletheia runs three components in a loop. A Generator proposes solutions, a Verifier checks for logical gaps and errors, and a Reviser rebuilds based on the critique. It also uses Google Search to verify citations against real sources, reducing the hallucination problem common in AI systems.
- Why does Aletheia's approach matter for businesses?
- Most business AI tools always produce an answer regardless of confidence. Aletheia's self-correcting architecture offers a model for more reliable AI in legal review, financial analysis, and research synthesis. AI that argues with itself before presenting results and admits uncertainty is more trustworthy than AI that's always confident.
Last week, a math paper appeared on a preprint server. The subject was something called eigenweights. I had to look up what that meant. But that's not why it stopped me. The mathematical reasoning, all of it, was produced by an AI called Aletheia. Google DeepMind built it. No human did the math.[1]
I've watched AI do impressive things this past year. Pass bar exams. Write code. Summarize dense research in seconds. But I realized all of those have something in common.
They're homework.
Someone already knew the answer. The AI found it. Whether it's a law exam or a coding challenge, there's a rubric somewhere with the right answer already on it. AI has gotten spectacularly good at homework.
Homework has a rubric. Research doesn't.
Research is the opposite. Nobody knows the answer yet. You have to figure out which questions are even worth asking. You try approaches that might not work. You check your own reasoning. And sometimes you sit there and admit you're stuck. That's the gap between a student and a scientist. Aletheia crossed it.
When DeepMind put 700 unsolved math problems in front of it, Aletheia didn't try all of them. It attempted only the ones it had a real shot at. The rest, it left alone.[1]
That might not sound like much. But for anyone who's worked with AI, it's everything. It knew what it didn't know.
The architecture behind this isn't exotic. Aletheia runs three components in a loop. A Generator proposes ideas. A Verifier tears them apart, looking for logical gaps, bad assumptions, mistakes. A Reviser takes the critique and rebuilds.[2]
That's not a superintelligence. That's a Tuesday morning meeting. Someone pitches. Someone pokes holes. Someone refines.
Generator. Verifier. Reviser. The breakthrough looks a lot like your best team.
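To make the loop concrete, here is a minimal toy sketch of a Generator-Verifier-Reviser cycle. Every function body is an invented stand-in (simple arithmetic in place of mathematical reasoning), not DeepMind's implementation; only the loop structure reflects the description above.

```python
# Toy Generator-Verifier-Reviser loop. The "problem" is just hitting a
# target number, so the components stay simple enough to read at a glance.

def generate(problem, critique=None):
    """Generator: propose a candidate, folding in any prior critique."""
    attempt = problem["start"]
    if critique:
        attempt += critique["fix"]
    return attempt

def verify(problem, attempt):
    """Verifier: return None if the attempt checks out, else a critique."""
    if attempt == problem["target"]:
        return None
    return {"fix": problem["target"] - attempt}

def solve(problem, max_rounds=5):
    critique = None
    for _ in range(max_rounds):
        attempt = generate(problem, critique)   # Generator proposes
        critique = verify(problem, attempt)     # Verifier pokes holes
        if critique is None:
            return attempt                      # verified, so present it
        # Reviser step: the critique feeds the next generation round
    return None  # admit failure rather than emit an unverified answer

print(solve({"start": 1, "target": 4}))  # prints 4 after one revision
```

The design point is the last line of `solve`: when the loop exhausts its rounds without passing verification, it returns nothing instead of its best unverified guess.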
The AI tools most businesses use today don't do any of that. They're homework machines. Fast, confident, and they always produce an answer. The trouble is that confidence isn't the same as accuracy. Ask one for a legal summary or a financial projection and you'll get something that sounds authoritative. Whether it holds up under scrutiny is a different question.
Aletheia's architecture suggests a different path. AI that checks its own work before giving you an answer. That verifies its citations against real published work instead of inventing them. That can say "I don't know" and mean it.[1]
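The citation-checking idea can be sketched in a few lines. This is an invented illustration: the source names are made up, and a hard-coded set stands in for the real lookup (Aletheia reportedly uses Google Search), but the gate is the same — refuse to present an answer whose citations don't resolve.

```python
# Accept an answer only if every cited source can be verified.
# KNOWN_SOURCES is a stand-in for a real search/index lookup.

KNOWN_SOURCES = {"Faltings 1983", "Wiles 1995"}

def present(answer, citations):
    """Return the answer only when all citations check out."""
    unverified = [c for c in citations if c not in KNOWN_SOURCES]
    if unverified:
        return "I don't know (could not verify: " + ", ".join(unverified) + ")"
    return answer

print(present("Proof sketch", ["Wiles 1995"]))       # answer passes the gate
print(present("Proof sketch", ["Plausible 2024"]))   # declined: fabricated cite
```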
DeepMind actually built a scale for this. Level H0 is human only. Level A2 is essentially autonomous. Aletheia is the first system to reach A2.[2]
They named it after the Greek word for truth.[3] Not speed. Not power. Truth. For a technology that's spent the last few years being criticized for confidently making things up, that might be the most interesting choice they could have made.
References
1. Luong, T. and Mirrokni, V. (2026). "Accelerating Mathematical and Scientific Discovery with Gemini Deep Think." Google DeepMind.
2. Sutter, M. (2026). "Google DeepMind Introduces Aletheia: The AI Agent Moving from Math Competitions to Fully Autonomous Professional Research Discoveries." MarkTechPost.
3. Upadhyay, A.T. (2026). "Aletheia Unveiled: Google's Autonomous Mathematical Research AI." Blog.