
In an MIT classroom, a professor delivers a lecture as students take careful notes to review and absorb essential material in preparation for an exam.
Humans naturally learn and retain new information, but large language models (LLMs) lack this ability. Once deployed, a trained LLM has a fixed “brain” that can’t permanently incorporate new knowledge.
As a result, if a user shares important information with an LLM today, it won’t recall it in future conversations.
MIT Develops Method for LLMs to Self-Update Like Students
MIT researchers have now introduced a new method that allows LLMs to self-update and permanently absorb new information. Much like a student, the model creates its own study notes from user input and uses them to adjust its internal parameters. This work is detailed in a paper published on the arXiv preprint server.
The model produces several self-edits based on a single input and tests each to determine which yields the greatest performance boost. Through this trial-and-error process, it learns how to optimize its own training.
The researchers discovered that this method enhanced LLM accuracy in both question-answering and pattern-recognition tasks, even allowing a smaller model to surpass the performance of much larger ones.
Although challenges remain, this technique could eventually enable AI systems to continuously adapt to new tasks and dynamic objectives in ever-changing environments.
“Like humans, advanced AI systems can’t stay static throughout their lifetimes. LLMs operate in dynamic settings where they constantly encounter new user inputs. Our goal is to build a model that’s more human-like—one that can continuously improve itself,” says Jyothish Pari, an MIT graduate student and co-lead author of the paper describing the technique.
Pari co-authored the work with Adam Zweiger, an MIT undergraduate and fellow co-lead author; graduate students Han Guo and Ekin Akyürek; and senior authors Yoon Kim, an assistant professor in MIT’s Department of Electrical Engineering and Computer Science (EECS) and member of the Computer Science and Artificial Intelligence Laboratory (CSAIL), and Pulkit Agrawal, also an EECS assistant professor and CSAIL member.
The research will be presented at the Conference on Neural Information Processing Systems.
Training the Model to Acquire Knowledge
LLMs are neural network models with billions of parameters, known as weights, which store the model’s knowledge and help it generate predictions from inputs. During training, these weights are adjusted to incorporate information from the training data.
Once deployed, however, the weights become fixed and can no longer be permanently modified.
LLMs excel at in-context learning, where they learn a new task by observing a few examples. While these examples influence the model’s responses in the moment, the learned knowledge does not persist beyond the current interaction.
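The distinction above can be made concrete: in-context learning lives entirely inside the prompt string, so nothing about the model persists afterward. A minimal sketch (the examples and formatting here are illustrative assumptions, not from the paper):

```python
# In-context learning supplies worked examples inside the prompt itself;
# the model's weights never change, so the "learning" vanishes with the prompt.
few_shot_examples = [
    ("cold", "hot"),
    ("up", "down"),
]

def build_prompt(query: str) -> str:
    """Assemble a few-shot prompt: the task knowledge lives only in this string."""
    lines = [f"{a} -> {b}" for a, b in few_shot_examples]
    lines.append(f"{query} ->")
    return "\n".join(lines)

prompt = build_prompt("left")
print(prompt)
```

Once this conversation ends, the prompt (and with it the few-shot "knowledge") is gone, which is exactly the limitation SEAL targets.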
MIT researchers aimed to harness a model’s strong in-context learning abilities to train it to permanently adjust its weights when it acquires new information.
They developed a framework called SEAL, short for “self-adapting LLMs,” which allows an LLM to create synthetic data from an input and then figure out the most effective way to update itself using that data. Each piece of synthetic data serves as a candidate self-edit the model can apply to itself.

LLMs Learn by Creating and Testing Synthetic Study Sheets
For language tasks, the LLM generates synthetic data by rephrasing the information and its implications from an input passage, much like students create study sheets by summarizing and rewriting lecture notes.
The model produces multiple versions and then tests each one to determine which self-edit yields the largest improvement on a downstream task, such as question answering. This trial-and-error process uses reinforcement learning, rewarding the model for the edits that boost performance the most.
Finally, the LLM internalizes the information from the most effective study sheet by updating its weights.
“Our goal is for the model to craft the most effective study sheet—one that provides the right level of detail and a balanced set of information—so that applying it to update the model enhances its overall performance,” Zweiger explains.
Selecting the Optimal Approach
The framework also lets the model decide how it wants to learn. It can choose which synthetic data to use, set its learning rate, and determine how many training iterations to perform.
In this way, the model not only generates its own training data but also manages how that data is applied to update its weights.
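One way to picture this is that a self-edit bundles not just the synthetic data but also the training configuration for applying it. The field names below are hypothetical, chosen only to illustrate the kinds of knobs the paper says the model controls:

```python
from dataclasses import dataclass

@dataclass
class SelfEdit:
    synthetic_data: list[str]  # rephrased facts the model wrote for itself
    learning_rate: float       # step size the model chose for this update
    epochs: int                # how many passes to make over the synthetic data

edit = SelfEdit(
    synthetic_data=[
        "The capital of France is Paris.",
        "Paris is France's capital city.",
    ],
    learning_rate=1e-4,
    epochs=2,
)
# A training loop would then fine-tune the weights using exactly this config;
# here we just count the toy number of gradient steps it implies.
total_steps = edit.epochs * len(edit.synthetic_data)
print(total_steps)
```

Letting the model emit the configuration alongside the data is what gives it control over *how* the update is applied, not just *what* is learned.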
“As humans, we understand the methods that help us learn best. We aim to give LLMs a similar ability. By letting the model control how it processes information, it can determine the most effective way to handle the incoming data,” Pari explains.
SEAL outperformed several baseline approaches across a variety of tasks, including learning new skills from a few examples and integrating knowledge from a text passage. On question-answering tasks, SEAL increased model accuracy by nearly 15%, and for certain skill-learning tasks, it raised the success rate by over 50%.
A key limitation of this method is catastrophic forgetting: as the model continually adapts to new information, its performance on previously learned tasks gradually declines.
The researchers aim to address catastrophic forgetting in future work and explore applying this technique in multi-agent scenarios, where multiple LLMs train one another.
“One major obstacle for LLMs to perform meaningful scientific research is their current inability to update themselves when exposed to new information. While fully deployed self-adapting models are still a long way off, we hope that systems capable of learning in this manner could eventually address this limitation and contribute to scientific progress,” Zweiger says.
Read the original article on Tech Xplore.
