OpenAI working on project ‘Strawberry’ for ‘Deep Research’ capabilities

ChatGPT maker OpenAI is working on a new approach to its artificial intelligence models in a project codenamed “Strawberry,” according to a source and internal documentation seen by Reuters.

The project, details of which have not been previously disclosed, comes as the Microsoft-backed startup races to demonstrate that the models it offers can provide advanced reasoning capabilities.

Teams within OpenAI are working on Strawberry, according to a copy of a recent internal OpenAI document seen by Reuters in May. Reuters was unable to determine the exact date of the document, which outlines how OpenAI intends to use Strawberry to conduct research. The source described the plan to Reuters as a work in progress. The news agency was unable to determine how close Strawberry is to being made public.

How Strawberry works is a closely guarded secret even within OpenAI, the person said.

The document describes a project that uses Strawberry models with the aim of enabling the company’s AI to not only generate answers to questions, but also plan ahead enough to autonomously and reliably navigate the internet to conduct what OpenAI calls “deep research,” the source said.

This is something that AI models have so far been unable to handle, as interviews with more than a dozen AI researchers reveal.

Asked about Strawberry and the details reported in this story, an OpenAI spokesperson said in a statement: “We want our AI models to see and understand the world more like we do. Continuous research into new AI capabilities is common practice in the industry, with a shared belief that these systems will reason better over time.”

The spokesperson did not directly answer questions about Strawberry.

The Strawberry project, formerly known as Q*, was seen as a breakthrough within the company last year, Reuters reported.

Two sources said they had viewed Q* demos earlier this year, which OpenAI employees told them could answer difficult scientific and mathematical questions that are beyond the reach of current commercially available models.

On Tuesday, OpenAI showed off a demo of a research project it claimed had new human-like reasoning capabilities during an internal all-hands meeting, Bloomberg reported. An OpenAI spokesperson confirmed the meeting but declined to provide details about its content. Reuters could not determine whether the project shown was Strawberry.

OpenAI hopes the innovation will dramatically improve the reasoning capabilities of its AI models, a source said. Strawberry involves a specialized way of processing an AI model after it has been pre-trained on very large datasets.

According to researchers interviewed by Reuters, reasoning is the key to AI achieving human or superhuman intelligence.

While large language models can already summarize dense text and write elegant prose much faster than any human, the technology often falls short when it comes to common-sense problems whose solutions seem intuitive to humans, such as spotting fallacies and playing tic-tac-toe. When the model encounters these types of problems, it often “hallucinates” false information.

AI researchers interviewed by Reuters generally agree that reasoning in the context of AI involves forming a model that allows AI to plan ahead, represent how the physical world works and reliably solve challenging multi-step problems.

Improving the reasoning in AI models is seen as key to the models’ ability to do everything from making important scientific discoveries to planning and building new software applications.

Sam Altman, CEO of OpenAI, said earlier this year that in AI, “the key areas of advancement will be in reasoning capabilities.”

Other companies like Google, Meta, and Microsoft are also experimenting with various techniques to improve reasoning in AI models, as are most academic labs doing AI research. However, researchers disagree on whether large language models (LLMs) are capable of incorporating ideas and long-term planning into the way they make predictions. For example, one of the pioneers of modern AI, Yann LeCun, who works at Meta, has often said that LLMs are incapable of human reasoning.

AI Challenges

Strawberry is a key part of OpenAI’s plan to overcome those challenges, the source familiar with the matter said. The document seen by Reuters describes what Strawberry aims to accomplish, but not how.

In recent months, the company has privately told developers and other outsiders that it is about to release technology with significantly more advanced reasoning capabilities, according to four people who heard the company’s pitches, who declined to be named because they are not authorized to discuss private matters.

Strawberry involves a specialized way of what’s known as “post-training” OpenAI’s generative AI models, or tweaking the base models to improve their performance in specific ways after they’ve already been “trained” on piles of generalized data, one of the sources said.

The post-training phase of model development involves methods such as fine-tuning, a process used on almost all language models today that takes many forms, such as having humans provide feedback on the model's answers or feeding the model examples of good and bad answers.
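The "examples of good and bad answers" described above are commonly organized as preference pairs in post-training pipelines. A minimal sketch of that data preparation step, with toy data and illustrative helper names (this is a generic illustration, not OpenAI's actual pipeline):

```python
# Toy sketch: packaging human feedback as preference pairs for post-training.
# In preference-based fine-tuning, each record pairs a prompt with a "chosen"
# (good) answer and a "rejected" (bad) one; a tuning method then nudges the
# base model toward producing answers like the chosen ones.

def make_preference_record(prompt, good_answer, bad_answer):
    """Bundle one item of human feedback into a training record."""
    return {"prompt": prompt, "chosen": good_answer, "rejected": bad_answer}

def build_dataset(feedback):
    """Turn raw (prompt, good, bad) tuples into a list of training records."""
    return [make_preference_record(p, g, b) for p, g, b in feedback]

feedback = [
    ("What is 2+2?", "4", "5"),
    ("Capital of France?", "Paris", "Lyon"),
]
dataset = build_dataset(feedback)
print(len(dataset))          # 2
print(dataset[0]["chosen"])  # 4
```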

Strawberry shares similarities with a method developed at Stanford in 2022 called “Self-Taught Reasoner,” or “STaR,” one of the sources with knowledge of the matter said. STaR allows AI models to “bootstrap” themselves to higher intelligence levels by iteratively creating their own training data, and could in theory be used to help language models surpass human intelligence, one of its creators, Stanford professor Noah Goodman, told Reuters.
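At a high level, the STaR loop works by sampling rationales, keeping only those that lead to correct answers, and retraining on the kept examples. A toy, self-contained simulation of that filtering step, where the "model" and the arithmetic tasks are stand-ins and not the actual STaR code:

```python
import random

# Toy simulation of a STaR-style bootstrap pass: generate rationale/answer
# pairs, keep only those whose answer matches ground truth, and collect the
# kept rationales as new training data for the next round of fine-tuning.

def toy_model(question, rng):
    """Stand-in 'model': attempts a simple sum, occasionally getting it wrong."""
    a, b = question
    guess = a + b + rng.choice([0, 0, 0, 1, -1])  # noisy answer
    rationale = f"{a} plus {b} equals {guess}"
    return rationale, guess

def star_iteration(problems, rng):
    """One bootstrap pass: keep only rationales that reach correct answers."""
    kept = []
    for a, b in problems:
        rationale, guess = toy_model((a, b), rng)
        if guess == a + b:  # the STaR filter: correct answers only
            kept.append(((a, b), rationale))
    return kept

rng = random.Random(0)
problems = [(1, 2), (3, 4), (10, 5), (7, 8)]
training_data = star_iteration(problems, rng)
# Every kept rationale ends in the correct answer, by construction.
assert all(r.endswith(str(a + b)) for (a, b), r in training_data)
```

In the real method, the kept rationales would be used to fine-tune the model, and the loop repeats with the improved model generating the next batch.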

“I find that both exciting and scary… if things continue like this, we as humans have some serious things to think about,” Goodman said. Goodman is not affiliated with OpenAI and is not familiar with Strawberry.

One of the capabilities that OpenAI is targeting Strawberry for is long-horizon tasks (LHT), the document states. The first source explained that these are complex tasks that require a model to plan ahead and perform a series of actions over an extended period of time.

To do this, OpenAI creates, trains and evaluates the models on what the company calls a "deep-research" dataset, according to OpenAI's internal documentation. Reuters was unable to determine what that dataset contains or how long an extended period of time might be.

OpenAI specifically wants its models to use these capabilities to conduct research by autonomously browsing the web with the help of a “CUA,” or computer-using agent, that can take actions based on its findings, the document and one of the sources said. OpenAI also plans to test its capabilities on the work of software and machine learning engineers.

© Thomson Reuters 2024