
Meta Releases AI That Can Check and Improve Other AI Without Human Input

October 20, 2024

Meta, the parent company of Facebook, announced on Friday the release of several new artificial intelligence (AI) models from its research division. Among the highlights is a groundbreaking “Self-Taught Evaluator,” a tool designed to reduce human involvement in the development of AI systems. According to Reuters, this development represents a significant step toward creating autonomous AI agents capable of self-improvement and independent problem-solving.

The new tool, first introduced in an August 2024 research paper, employs the “chain of thought” technique, a method that breaks complex problems down into smaller, logical steps. This approach, also used by OpenAI’s recent models, improves the accuracy of responses to difficult questions in fields like science, coding and mathematics. Meta’s researchers went a step further, training the evaluator exclusively on AI-generated data and bypassing the need for human input at that stage.

Per Reuters, the researchers behind the project, including Jason Weston, emphasized the potential of AI models that can reliably evaluate their own work. This ability could pave the way for autonomous AI systems that learn from their mistakes, a concept many in the field envision as a major advancement. Such systems could function as intelligent digital assistants, capable of handling a wide range of tasks with minimal or no human intervention.


“We hope that as AI becomes increasingly superhuman, it will get better at checking its own work and surpass average human capabilities,” said Weston. He added that the ability to be self-taught and self-evaluative is central to reaching a superhuman level of AI.

One of the key advantages of these self-improving models lies in their potential to replace traditional methods like Reinforcement Learning from Human Feedback (RLHF). RLHF, a process that requires human annotators with specialized expertise, can be both costly and inefficient. By contrast, AI models that can self-evaluate could streamline development and improve accuracy without the need for human oversight.

Other tech giants, including Google and Anthropic, have also explored similar concepts, specifically Reinforcement Learning from AI Feedback (RLAIF). However, unlike Meta, these companies typically do not release their models for public use. Meta’s more open approach to sharing its research could lead to broader advancements in the AI community.

Source: Reuters