With the rapid development of large language models, AI assistants like
ChatGPT have become widely integrated into people's work and lives. In this
paper, we present an evolving large language model assistant that utilizes
verbal long-term memory. It focuses on preserving the knowledge and experience
from past dialogues between the user and the AI assistant, which can be
applied in future dialogues to generate better responses. The model generates
a set of records for each finished dialogue and stores them in memory. Later,
given a new user input, the model retrieves related memories to improve the
quality of its response. To find the best form of memory, we explore different
ways of constructing the memory and propose a new memorizing mechanism, called
conditional memory, that addresses the problems of previous methods. We also
investigate how memory is retrieved and used during generation. The assistant
uses GPT-4 as its backbone, and we evaluate it on three constructed test
datasets, each focusing on a different ability required of an AI assistant
with long-term memory.

An Evolving Large Language Model Assistant with Verbal Long-Term Memory

In recent years, language models like ChatGPT have become prevalent in various domains due to their ability to understand and generate human-like text. However, these models often lack the ability to remember and utilize information from previous interactions, limiting their potential for more advanced AI assistant applications.

In this paper, the authors introduce an evolving large language model assistant that addresses this limitation by incorporating verbal long-term memory. The goal is to preserve the knowledge and experience gained from past dialogues between the user and the AI assistant, enabling it to generate more coherent and contextually relevant responses in future interactions.

The key idea behind this approach is to generate a set of records for each completed dialogue and store them in a memory. This memory serves as a repository of past interactions, allowing the model to retrieve relevant information when faced with a new user input. By leveraging this verbal long-term memory, the model can improve the quality of its responses over time.
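The record-and-retrieve loop described above can be sketched in a few lines. The record schema, the retrieval method, and all names below are assumptions for illustration (the summary does not specify the paper's record format, and a real system would use a learned retriever or embedding search rather than word overlap):

```python
from dataclasses import dataclass, field

@dataclass
class MemoryRecord:
    # One record distilled from a finished dialogue.
    # The actual record schema in the paper is not given in this summary.
    dialogue_id: int
    text: str

@dataclass
class MemoryStore:
    records: list[MemoryRecord] = field(default_factory=list)

    def add_dialogue(self, dialogue_id: int, summaries: list[str]) -> None:
        # Store the set of records generated from one completed dialogue.
        for s in summaries:
            self.records.append(MemoryRecord(dialogue_id, s))

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        # Rank records by word overlap with the new user input
        # (a crude stand-in for embedding-based retrieval).
        q = set(query.lower().split())
        scored = sorted(
            self.records,
            key=lambda r: len(q & set(r.text.lower().split())),
            reverse=True,
        )
        return [r.text for r in scored[:k]]

store = MemoryStore()
store.add_dialogue(1, ["The user's cat is named Mochi.",
                       "The user prefers concise answers."])
store.add_dialogue(2, ["The user is learning Rust."])
print(store.retrieve("What was my cat called?", k=1))
```

The essential design point is that memory grows one dialogue at a time, so the assistant "evolves" without any retraining of the underlying model.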

To optimize memory performance, the researchers explore different ways of constructing the memory and propose a novel memorizing mechanism called conditional memory. This mechanism addresses limitations of previous methods by storing conditional information that can be aligned with the user input. As a result, the model can selectively retrieve the memories most relevant to the current context, further enhancing the quality of its responses.
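One plausible reading of conditional memory is that each record is split into a condition (when the memory should be recalled) and a content (what should be recalled), with retrieval matching the input against the condition only. This is a hypothetical sketch; the paper's actual mechanism may differ, and the pairs below are invented examples:

```python
def overlap(a: str, b: str) -> int:
    # Word-overlap score between two strings (illustrative retriever).
    return len(set(a.lower().split()) & set(b.lower().split()))

# Each memory is a (condition, content) pair:
#   condition: the kind of input this memory is useful for
#   content:   the knowledge to inject when the condition matches
memory = [
    ("questions about the user's pets", "The user's cat is named Mochi."),
    ("requests for code examples", "The user is learning Rust."),
]

def retrieve_conditional(user_input: str, k: int = 1) -> list[str]:
    # Match the new input against conditions, not full record text.
    scored = sorted(memory, key=lambda m: overlap(user_input, m[0]),
                    reverse=True)
    return [content for _, content in scored[:k]]

print(retrieve_conditional("Tell me about my pets"))
```

Separating the matching key from the stored content means a memory can be recalled even when the new input shares no words with the content itself, which is one way such a mechanism could outperform retrieval over raw dialogue text.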

In addition to memory construction, the paper also focuses on investigating the retrieval and usage of memory during the response generation process. This aspect is crucial for effectively incorporating long-term memory into the AI assistant’s functionality. By successfully retrieving relevant memories and incorporating them into its generated responses, the assistant can provide more accurate and comprehensive information to users.
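A common way to use retrieved memory during generation is to prepend it to the prompt sent to the backbone model. The template below is an assumption for illustration, not the paper's actual prompting scheme:

```python
def build_prompt(user_input: str, memories: list[str]) -> str:
    # Inject retrieved memory records into the generation prompt
    # (hypothetical template; the paper's exact format is not given).
    memory_block = "\n".join(f"- {m}" for m in memories)
    return (
        "Relevant memory from past dialogues:\n"
        f"{memory_block}\n\n"
        f"User: {user_input}\n"
        "Assistant:"
    )

prompt = build_prompt(
    "What was my cat called?",
    ["The user's cat is named Mochi."],
)
print(prompt)
```

The resulting prompt would then be passed to GPT-4, letting the frozen backbone condition its response on knowledge accumulated across sessions.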

The researchers evaluate the proposed model using GPT-4 as its underlying architecture and conduct experiments on three tailored test datasets. These datasets focus on different abilities that an AI assistant with long-term memory should possess, such as factual knowledge recall, continuity in conversation, and generating coherent responses.

Overall, the introduction of verbal long-term memory to large language model assistants represents a significant step towards creating more sophisticated and multifunctional AI systems. This approach bridges the gap between natural language understanding and retention of contextual information, drawing on concepts from fields like natural language processing, cognitive science, and information retrieval.

As further research progresses in this direction, we can anticipate the evolution of AI assistants that not only excel at understanding and generating text but also possess the ability to retain and leverage knowledge from past experiences, ultimately creating more personalized and context-aware interactions.
