[Header image: watercolor painting of a chalkboard with tidy equations on one side and messy exploratory scrawl on the other]
AI Transformation · 3 min read

The Homework Problem

Google built an AI that does original research. The difference between that and everything before it is something every student already understands.


The Brief

This article explains why Google DeepMind's Aletheia marks a shift from AI that solves known problems to AI that conducts original research. It uses the homework-versus-research distinction to make the concept accessible, and examines why Aletheia's self-correcting architecture matters for business applications.


What is Google DeepMind's Aletheia?
Aletheia is an AI research agent built by Google DeepMind that conducts autonomous mathematical research. Powered by Gemini Deep Think, it uses a Generator-Verifier-Reviser loop to propose ideas, check its own work, and refine solutions. It produced the first fully AI-authored research paper in arithmetic geometry.
Why is Aletheia considered autonomous research rather than just AI problem-solving?
Previous AI achievements like passing bar exams involve finding known answers. Aletheia tackles open problems where no answer exists yet: it chooses its own methods, checks its reasoning, and admits when it cannot solve something. DeepMind classified it as Level A2, essentially autonomous, on its research taxonomy.
How does Aletheia's architecture work?
Aletheia runs three components in a loop. A Generator proposes solutions, a Verifier checks for logical gaps and errors, and a Reviser rebuilds based on the critique. It also uses Google Search to verify citations against real sources, reducing the hallucination problem common in AI systems.
Why does Aletheia's approach matter for businesses?
Most business AI tools always produce an answer regardless of confidence. Aletheia's self-correcting architecture offers a model for more reliable AI in legal review, financial analysis, and research synthesis. AI that argues with itself before presenting results and admits uncertainty is more trustworthy than AI that's always confident.

Last week, a math paper appeared on a preprint server. The subject was something called eigenweights. I had to look up what that meant. But that's not why it stopped me. The mathematical reasoning, all of it, was produced by an AI called Aletheia. Google DeepMind built it. No human did the math.1

I've watched AI do impressive things this past year. Pass bar exams. Write code. Summarize dense research in seconds. But I realized all of those have something in common.

They're homework.

Someone already knew the answer. The AI found it. Whether it's a law exam or a coding challenge, there's a rubric somewhere with the right answer already on it. AI has gotten spectacularly good at homework.

[Image: watercolor of a graded math test with red checkmarks beside an open blank journal] Homework has a rubric. Research doesn't.

Research is the opposite. Nobody knows the answer yet. You have to figure out which questions are even worth asking. You try approaches that might not work. You check your own reasoning. And sometimes you sit there and admit you're stuck. That's the gap between a student and a scientist. Aletheia crossed it.

When DeepMind put 700 unsolved math problems in front of it, Aletheia didn't try all of them. It attempted only the ones it had a real shot at. The rest, it left alone.1

That might not sound like much. But for anyone who's worked with AI, it's everything. It knew what it didn't know.
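That triage behavior, attempting only the problems it has a realistic shot at, can be sketched as a simple gate. Everything here is invented for illustration: the scoring heuristic, the threshold, and the function names are toy stand-ins, not Aletheia's actual mechanism.

```python
# Hypothetical sketch: attempt a problem only when an estimated chance
# of success clears a threshold; otherwise leave it alone.

def estimated_chance(problem: str) -> float:
    # Toy heuristic for illustration only: shorter problem statements
    # are treated as more tractable. A real system would use a learned
    # confidence signal, not string length.
    return 1.0 / (1 + len(problem))

def triage(problems: list[str], threshold: float = 0.2):
    """Split problems into those worth attempting and those to skip."""
    attempted, skipped = [], []
    for p in problems:
        if estimated_chance(p) >= threshold:
            attempted.append(p)
        else:
            skipped.append(p)
    return attempted, skipped

attempted, skipped = triage(["2+2", "a very long open conjecture statement"])
```

The point of the gate isn't the heuristic, which is deliberately silly here; it's that "skip" is a legitimate output, which most production AI tools don't allow themselves.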

The architecture behind this isn't exotic. Aletheia runs three components in a loop. A Generator proposes ideas. A Verifier tears them apart, looking for logical gaps, bad assumptions, mistakes. A Reviser takes the critique and rebuilds.2

That's not a superintelligence. That's a Tuesday morning meeting. Someone pitches. Someone pokes holes. Someone refines.
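The propose-critique-rebuild loop can be sketched in a few lines. This is a toy stand-in, assuming nothing about DeepMind's implementation: the real components are model-driven, while these functions exist only to show the control flow, including the honest failure case.

```python
# Hypothetical sketch of a Generator-Verifier-Reviser loop.
# All three roles are plain functions here; in Aletheia they are
# model-backed components.

def generate(problem: dict) -> dict:
    """Generator: propose a first candidate solution."""
    return {"answer": problem["x"] + 1}

def verify(problem: dict, candidate: dict) -> list[str]:
    """Verifier: return objections; an empty list means it passed.
    The target condition (x + 2) is invented for this example."""
    if candidate["answer"] != problem["x"] + 2:
        return ["answer does not satisfy the target condition"]
    return []

def revise(candidate: dict, objections: list[str]) -> dict:
    """Reviser: rebuild the candidate in response to the critique.
    Toy repair: nudge the answer and try again."""
    return {"answer": candidate["answer"] + 1}

def solve(problem: dict, max_rounds: int = 5):
    """Loop: propose, check, refine -- or admit failure with None."""
    candidate = generate(problem)
    for _ in range(max_rounds):
        objections = verify(problem, candidate)
        if not objections:
            return candidate      # verified solution
        candidate = revise(candidate, objections)
    return None                   # the honest "I couldn't solve it"

result = solve({"x": 3})
```

Note the design choice that matters: the loop can return None. Nothing leaves the loop unverified, and giving up is an allowed outcome rather than a bug.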

[Image: watercolor of three open notebooks, each with different handwriting, annotations pointing between them] Generator. Verifier. Reviser. The breakthrough looks a lot like your best team.

The AI tools most businesses use today don't do any of that. They're homework machines. Fast, confident, and they always produce an answer. The trouble is that confidence isn't the same as accuracy. Ask one for a legal summary or a financial projection and you'll get something that sounds authoritative. Whether it holds up under scrutiny is a different question.

Aletheia's architecture suggests a different path. AI that checks its own work before giving you an answer. That verifies its citations against real published work instead of inventing them. That can say "I don't know" and mean it.1

DeepMind actually built a scale for this. Level H0 is human only. Level A2 is essentially autonomous. Aletheia is the first system to reach A2.2

They named it after the Greek word for truth.3 Not speed. Not power. Truth. For a technology that's spent the last few years being criticized for confidently making things up, that might be the most interesting choice they could have made.


References

  1. Luong, T. and Mirrokni, V. (2026). "Accelerating Mathematical and Scientific Discovery with Gemini Deep Think." Google DeepMind.

  2. Sutter, M. (2026). "Google DeepMind Introduces Aletheia: The AI Agent Moving from Math Competitions to Fully Autonomous Professional Research Discoveries." MarkTechPost.

  3. Upadhyay, A.T. (2026). "Aletheia Unveiled: Google's Autonomous Mathematical Research AI." Blog.
