What if artificial general intelligence was already here… and we simply failed to notice?

Some researchers now argue that we quietly crossed that threshold.

Instead of asking when machines will finally match us, a new wave of experts is asking a more unsettling question: what if today’s chatbots already qualify as generally intelligent, and the real problem is our definition of intelligence itself?

Rethinking the holy grail of AI

Artificial general intelligence, or AGI, is usually described as a system that can match human performance across a wide range of tasks. Not just play chess. Not just summarise emails. Everything from coding to conversation to basic reasoning.

For big labs such as OpenAI, Google DeepMind and Anthropic, AGI is the stated target. Timelines have shrunk from “sometime this century” to “within a decade”, and in some cases, to “maybe a couple of years”.

Yet a recent paper in the journal Nature makes a bolder claim: the researchers argue that, by any fair standard, current large language models (LLMs) already meet the bar for AGI.

They say the breakthrough is not in the future. It is sitting in our browsers and smartphone apps right now.

The authors, including philosopher Eddy Keming Chen from the University of California, argue that the stumbling block is not the technology, but our insistence on moving the goalposts every time machines get better.

Why the Turing test may already be obsolete

In 1950, Alan Turing proposed a simple thought experiment: if you chat with an unknown entity by text and cannot reliably tell whether it is human or machine, the machine has passed the test.

For decades, the Turing test functioned as a kind of popular benchmark for “human-level” AI. A system that could routinely fool people in conversation would have been headline news.

That moment has arguably arrived. Multiple studies now show that advanced chatbots are judged to be human as often as, or more often than, actual humans in blind text conversations.
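To make that kind of result concrete, here is a minimal Python sketch of how such a blind trial can be tallied. The `judge` function and the transcript pool are hypothetical stand-ins, not the protocol of any particular study; real experiments add controls this sketch omits.

```python
import random

def run_trial(transcripts, judge):
    """Tally how often human- and machine-written transcripts are judged human.

    transcripts: list of (origin, text) pairs, origin in {"human", "machine"}.
    judge: callable taking a transcript and returning "human" or "machine".
    """
    judged_human = {"human": 0, "machine": 0}
    totals = {"human": 0, "machine": 0}
    random.shuffle(transcripts)        # blind the judge to ordering cues
    for origin, text in transcripts:   # origin stays hidden from the judge
        totals[origin] += 1
        if judge(text) == "human":
            judged_human[origin] += 1
    return {k: judged_human[k] / totals[k] for k in totals}

# A deliberately naive judge that guesses at random, for illustration only.
pool = [("human", "long day, barely slept"), ("machine", "Hello! How can I help?")]
print(run_trial(pool, lambda text: random.choice(["human", "machine"])))
```

A machine "passes" in this sense when its judged-human rate matches or exceeds the rate for the genuine humans in the same pool.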

The paper’s authors argue that, if we applied older standards, several current models would already be hailed as full-blown AGI.

Instead, critics dismiss these systems as slick imitations. They “only predict the next word”. They “don’t really understand”. That tension lies at the heart of the new AGI debate.

AGI versus superintelligence: moving the goalposts

One of the central points in the Nature article is a distinction between two ideas that often get blurred:

  • AGI: roughly human-level competence across many tasks, with strengths and weaknesses, gaps and mistakes.
  • Superintelligence: a system that dramatically outperforms the best humans in almost every cognitive domain.

AGI, in this framing, does not mean perfection. Humans are not perfect. No one expects a top surgeon to also be a leading concert pianist and an expert in medieval history. Human intelligence is patchy, context-dependent and often biased.

If humans count as generally intelligent despite all our flaws, demanding flawlessness from machines sets an unrealistic double standard.

On many benchmarks, LLMs already reach or exceed expert-level performance: legal reasoning exams, medical question sets, advanced coding challenges. Where they fail, humans often fail too.

The authors suggest that what some companies now label “superintelligence” is closer to what the public has been taught to imagine when hearing the word “AGI”. In that sense, AGI might be behind us, while superintelligence remains ahead.

Are chatbots just ‘stochastic parrots’?

A popular critique paints LLMs as “stochastic parrots”: systems that only remix training data without genuine understanding. The Nature paper tackles this head-on.

Critics argue that if a model only predicts the next token, it can never do more than echo what it has seen. Yet these systems solve novel maths problems, generalise across languages and domains, and reason about situations that do not appear verbatim in their training sets.
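For readers wondering what "predicting the next token" actually involves, the loop looks roughly like this toy sketch. The hand-written bigram table is a stand-in for a trained network; a real model learns its distributions from data and conditions on thousands of tokens of context, but the sampling step is the same in spirit.

```python
import random

# Hand-written next-token distributions. A trained LLM learns billions of
# such conditional probabilities from data instead of this three-entry table.
BIGRAMS = {
    "the": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"sat": 0.4, "ran": 0.6},
}

def next_token(context):
    """Sample the next token given only the most recent one."""
    dist = BIGRAMS[context[-1]]
    tokens, probs = zip(*dist.items())
    return random.choices(tokens, weights=probs)[0]

tokens = ["the"]
for _ in range(2):                 # extend the sequence by two tokens
    tokens.append(next_token(tokens))
print(" ".join(tokens))            # e.g. "the cat sat"
```

Everything interesting in a modern LLM lives in where those probabilities come from: billions of learned parameters rather than a three-entry table.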

When models tackle new tasks, combine skills in unexpected ways, or reason about hypothetical worlds, the “mere parrot” metaphor starts to creak.

The authors also warn against mythologising human creativity. Most scientific advances build on existing ideas. Many human arguments are derivative. If remixing, pattern recognition and recombination disqualify AIs from “real intelligence”, they also raise awkward questions about human minds.

Does intelligence need a body?

A long-running view in cognitive science links intelligence closely to the body. Our concepts of space, objects, even language are shaped by movement, senses and physical interaction.

LLMs today are mostly disembodied. They process text and, increasingly, images, audio and video. They do not bump into tables or feel cold.

For the Nature authors, this lack of a body does not automatically disqualify them from AGI status. They highlight several points:

  • Multimodal models already interpret images, video and sound, building a richer picture of their environment.
  • Robotics is catching up fast, with early “physical AI” systems linking language models to real-world actions.
  • Some human abilities, such as solving abstract puzzles or writing code, barely rely on bodily experience at all.

The paper claims that embodiment might shape how an intelligence thinks, but is not a strict requirement for calling it intelligent.

That stance clashes with researchers who see sensorimotor experience as non‑negotiable. The disagreement reveals a deeper philosophical divide rather than a simple technical dispute.

The uncomfortable problem of hallucinations

One clear weakness of today’s systems is hallucination: producing confident, detailed but false statements. Fabricated references, imaginary legal cases, invented quotes — users have seen them all.

Labs say hallucination rates have fallen with each new model, but independent studies suggest they remain stubbornly frequent. OpenAI has admitted that even a future GPT‑5 might hallucinate in roughly one out of ten answers.

Systems that can ace medical exams, then invent non‑existent studies a moment later, raise hard questions about reliability and trust.

The Nature authors compare this to human memory errors and cognitive biases. People misremember events, misquote sources and cling to incorrect beliefs. From their angle, fallibility is compatible with general intelligence.

Critics respond that, unlike humans, a chatbot has no grounded sense of truth or real-world stakes. It does not feel embarrassed when wrong. That gap, they argue, justifies continued caution before granting it the AGI label.

Why training data and effort may not matter as much as we think

Another argument against AGI claims is that LLMs need colossal amounts of data and compute to reach performance levels that children attain from a handful of experiences.

The Nature article suggests focusing on the outcome, not the training efficiency. A machine that requires billions of words to reach roughly human competence might still be generally intelligent, just in a different way from a child.

Aspect | Humans | Current LLMs
Learning speed | Fast, with few examples | Slow, data‑hungry
Energy use | Efficient brain, low power | High compute and electricity
Error types | Biases, memory slips | Hallucinations, brittle logic

From this angle, AGI is not a clone of human thinking. It is a different path to broadly capable behaviour, carrying different trade‑offs in efficiency, robustness and failure modes.

Anthropocentrism: are we refusing to see a new kind of mind?

Lurking behind all of this is a psychological factor: the fear of admitting that a non‑biological system might share our cognitive category.

Humans have a history of defining special boundaries — once for our planet, then for our species, now for our intelligence. As machines cross each threshold, the definition of what makes us unique quietly shifts.

The authors argue that our standards for calling something “intelligent” tighten precisely when machines start to meet them.

This pattern may explain why some tech leaders now talk about “superintelligence” rather than AGI. If AGI has effectively arrived, the next rhetorical frontier becomes an even more powerful, still-hypothetical entity.

What this debate means in practice

The question of whether AGI is “already here” is not just philosophical hair‑splitting. It has concrete consequences for regulation, safety work and public expectations.

If we treat current systems as narrow tools, we might underestimate their capacity for agency when connected to other software, robots or financial systems. If we treat them as full AGI, we might overestimate their reliability and grant them too much autonomy too soon.

For policymakers and companies, a more nuanced view can help. You can treat today’s models as:

  • Broadly capable assistants that operate across many domains.
  • Still unreliable enough to need layers of human oversight.
  • Powerful amplifiers of both useful work and existing risks.

Key terms that shape the argument

A few concepts sit at the centre of this debate and are worth understanding clearly.

AGI (artificial general intelligence): A system able to handle a wide range of tasks, adapt to new problems and show competence across different fields, roughly on par with humans.

Superintelligence: A future, hypothetical system exceeding the best human performance across almost every intellectual task, potentially by a large margin.

Hallucination: In AI, a confident, fluent answer that is not grounded in facts or source material, yet looks convincing to a casual reader.

Embodied intelligence: The idea that having a body and sensory experience is fundamental to developing and expressing intelligence.

Everyday scenarios if the authors are right

Imagine a near-future hospital relying on an “AGI assistant” already similar to today’s top models. It reads scans, drafts notes, suggests diagnoses and explains options in plain language. Most of the time, it is as good as the specialists, sometimes better. Occasionally, it invents a non‑existent paper or confuses two rare conditions.

Or picture a classroom where each pupil has a personalised tutor chatbot. It recognises their gaps, sets exercises, adapts explanations and stays patient at 11pm. It can misjudge their emotional state or slip in a subtle error, but it outperforms many crowded classrooms on average learning gains.

In both scenarios, you get something that already looks a lot like general intelligence: broad capacity, flexibility, strong performance, real impact — and real flaws. Calling that AGI or not will shape how fast we deploy it, what safeguards we demand and how seriously we take long‑term risks.

Underneath the labels is a practical question: how do we live alongside systems that are no longer just tools in a narrow sense, yet not infallible or fully understood either? Whether we name them AGI or something else, that question is arriving faster than many expected.
