Meta’s Chief AI Scientist Yann LeCun on Wednesday (Jan. 8) criticized prevailing definitions of artificial general intelligence (AGI), arguing that continued scaling of text-based large language models alone will not be enough to reach it.
During a fireside chat at CES in Las Vegas, the Frenchman, a winner of the Turing Award (often called computing’s Nobel Prize), disagreed with OpenAI CEO Sam Altman’s claim in a recent post that his teams already know how to build AGI and are now looking past it to superintelligence.
LeCun said that large language models (LLMs) are not capable of reaching AGI, although he prefers to use the term “human-level intelligence.”
“There’s absolutely no way … that autoregressive LLMs, the type that we know today, will reach human intelligence,” LeCun said. “It’s just not going to happen.”
LLMs are trained to predict the next word, or token, in a stretch of text, choosing the most likely continuation one token at a time. Human brains, by contrast, don’t learn from text alone but from many modalities. Moreover, today’s AI systems mostly consist of so-called “narrow AI,” which can do certain specific tasks extremely well, like playing chess or performing medical diagnostics. But deviate somewhat from those tasks and they fail.
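As a rough illustration of what “autoregressive” means here, the toy sketch below repeatedly picks the most probable next word given only the words generated so far. The tiny hand-written probability table and the generate function are hypothetical stand-ins for a real model, which learns such probabilities from enormous amounts of text and works over tokens rather than whole words.

```python
# Toy sketch of autoregressive generation: at each step, look at the text so far
# and append the single most probable next word. Real LLMs do the same thing with
# a learned neural network instead of this illustrative hand-written table.
bigram_probs = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"ran": 0.8, "sat": 0.2},
    "sat": {"down": 0.9, "up": 0.1},
    "ran": {"away": 1.0},
}

def generate(prompt: str, max_new_words: int = 4) -> str:
    words = prompt.split()
    for _ in range(max_new_words):
        candidates = bigram_probs.get(words[-1])
        if not candidates:  # no known continuation: stop generating
            break
        # Greedy choice: take the highest-probability next word.
        words.append(max(candidates, key=candidates.get))
    return " ".join(words)

print(generate("the"))  # -> "the cat sat down"
```

The point of the sketch is only that the model never looks beyond the text sequence itself, which is the gap LeCun highlights when he contrasts text prediction with learning from other modalities and from the physical world.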
“People in AI have been making that mistake all the time, saying, ‘OK, we have systems now that can beat us at chess, so pretty soon, they’ll be as smart as we are,’” he said. “We have systems now that can drive a car through the desert. Pretty soon, we’ll have self-driving cars at Level 5. We still don’t have that, 13 years later.
“By assembling all of the systems, we’ll have systems that can do a lot of things, but that doesn’t mean they have human-level intelligence,” he continued. “That doesn’t mean they have the capacity to plan, reason … or understand the physical world.”
While AI systems might be good at cognitive tasks, they can’t do physical tasks like plumbing. “We’re not going to have an automated plumber anytime soon,” he said. “It’s incredibly complicated. It requires a very deep understanding of the physical world and manipulation [of objects].
“It’s not because we can’t build a robot. It’s just that we can’t get them to be smart enough,” LeCun continued. “In fact, we’re not even close to matching the understanding of the physical world of any animal, cat or dog.”
Another problem is that LLMs have advanced largely through scaling: training ever bigger models on ever larger amounts of data. But that approach is now hitting a point of diminishing returns. “Scaling is saturating,” LeCun said.
Even if scaling continues to make LLMs better, it remains “very expensive,” he added, which is why OpenAI, despite charging $200 a month for ChatGPT Pro, is “not making money with it.” (Altman disclosed as much in a Jan. 5 post on X.)
But LeCun does see progress around the corner for AI-powered robots because of the rise of generative world models that create virtual worlds for robots to train in. It is less costly and less risky for enterprises than having robots train in the physical world.
On Monday (Jan. 6), Nvidia CEO Jensen Huang unveiled Cosmos, the company’s platform for creating virtual worlds for robotics. From a text, image or video prompt, developers can use Cosmos to generate troves of synthetic data for training “physical AI” systems such as robots and autonomous vehicles.
Google DeepMind is hiring a new team to work on generative world models, while AI pioneer Fei-Fei Li’s World Labs launched with $230 million in funding from a Silicon Valley who’s who, including AI pioneer and Nobel laureate Geoffrey Hinton, Salesforce CEO Marc Benioff, LinkedIn co-founder Reid Hoffman, former Google Chairman Eric Schmidt and others.
Asked when he thinks the “ChatGPT moment” for robotics will come, LeCun said that, with the advent of world models, it could be three to five years away.
However, LeCun does see AI agents becoming commonplace as people get used to having different types of AI assistants help them in their work. But these would be bots trained for specific tasks, not truly smart assistants that can take on new activities without task-specific training.