Human Memory & LLM Efficiency: Optimized Learning through Temporal Memory


In humans, the difference between short-term and long-term memory is significant: short-term memory covers what happened moments ago, while long-term memory covers information retained over extended periods. Our memories begin to be shaped almost immediately by our preconceptions, which influence how we perceive and store new information. Short-term memory is generally the more reliable of the two; we are more likely to recall accurately an event that occurred a second ago than one that happened a minute ago. As more time passes, memory becomes less reliable still, subject to distortion and forgetting.

Large language models (LLMs) operate differently. Autoregressive models, the class of machine learning models to which LLMs belong, predict the next element in a sequence from the elements that came before, and the passage of time plays no role in predicting the next word. LLMs draw no distinction between short-term and long-term memory: the training data used to pre-train a model like GPT is presented in one undifferentiated process, with no fundamental concept of short-term memorization.

The human brain is also remarkably efficient, operating continuously on about 12-20 watts of power, depending on the source and the specific conditions. That efficiency stands in sharp contrast to the energy demands of training LLMs, which can require several megawatts of power. For instance, training a large neural network can consume energy comparable to the output of a small power plant running for several weeks.
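To make the autoregressive point above concrete, here is a minimal, self-contained sketch. The vocabulary, the bigram table, and the probabilities are invented for illustration and are not taken from any trained model; each next word is sampled only from what has already been generated, and nothing in the loop encodes how long ago a token appeared.

```python
import numpy as np

# Toy vocabulary and a hand-made bigram table standing in for a trained LLM.
# These probabilities are illustrative assumptions, not learned values.
vocab = ["the", "brain", "runs", "on", "twenty", "watts"]
bigram = np.array([
    # the  brain  runs   on   twenty watts
    [0.05, 0.60, 0.10, 0.05, 0.10, 0.10],  # after "the"
    [0.05, 0.05, 0.70, 0.10, 0.05, 0.05],  # after "brain"
    [0.05, 0.05, 0.05, 0.70, 0.10, 0.05],  # after "runs"
    [0.20, 0.05, 0.05, 0.05, 0.60, 0.05],  # after "on"
    [0.05, 0.05, 0.05, 0.05, 0.05, 0.75],  # after "twenty"
    [0.60, 0.10, 0.10, 0.10, 0.05, 0.05],  # after "watts"
])

def next_token_distribution(context):
    """The next-token distribution depends only on the tokens generated so
    far (here, just the last one); there is no notion of 'how long ago'."""
    last = vocab.index(context[-1])
    return bigram[last]

rng = np.random.default_rng(0)
tokens = ["the"]
for _ in range(5):
    probs = next_token_distribution(tokens)
    tokens.append(vocab[rng.choice(len(vocab), p=probs)])

print(" ".join(tokens))
```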

Given how efficiently the human brain learns compared with the energy-intensive process of training LLMs, it can be argued that integrating the concepts of long-term and short-term memory into LLMs could improve their learning efficiency.
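A rough back-of-envelope comparison illustrates the gap. The figures below are assumptions chosen purely for illustration (a 20-watt brain running for twenty years, a hypothetical 10-megawatt training cluster running for six weeks), not measurements:

```python
# Back-of-envelope energy comparison. All figures are illustrative
# assumptions for the sake of the argument, not measured values.

HOURS_PER_YEAR = 24 * 365

# Human brain: ~20 W, running continuously for 20 years.
brain_watts = 20
brain_hours = 20 * HOURS_PER_YEAR
brain_mwh = brain_watts * brain_hours / 1e6           # watt-hours -> MWh

# Hypothetical LLM training run: a 10 MW cluster for 6 weeks.
cluster_megawatts = 10
training_hours = 6 * 7 * 24
training_mwh = cluster_megawatts * training_hours     # MW * hours -> MWh

print(f"Brain over 20 years:       {brain_mwh:,.1f} MWh")
print(f"Training run over 6 weeks: {training_mwh:,.0f} MWh")
print(f"Ratio: ~{training_mwh / brain_mwh:,.0f}x")
```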

To integrate the concepts of long-term and short-term memory into LLMs, several modifications can be made, drawing inspiration from human memory processes.

By incorporating these human-inspired elements, LLMs can achieve a more natural approach to information retention and recall, improving their learning efficiency and reducing energy consumption during training.
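As one way to picture such a modification, the sketch below pairs a small, decaying short-term buffer with a long-term store into which frequently revisited items are consolidated; everything else is forgotten once it leaves the buffer. This is a hypothetical design, not a description of any existing model, and the class name, capacities, and consolidation threshold are all illustrative assumptions.

```python
from collections import deque

class TwoTierMemory:
    """Illustrative sketch: a decaying short-term buffer plus a
    consolidation step into a long-term store. The capacities and
    threshold here are arbitrary assumptions."""

    def __init__(self, short_capacity=5, consolidation_threshold=3):
        self.short_term = deque(maxlen=short_capacity)  # recent items; oldest dropped first
        self.long_term = {}                             # item -> consolidated strength
        self.seen_counts = {}                           # how often an item has recurred
        self.threshold = consolidation_threshold

    def observe(self, item):
        """Store a new observation; items that recur often get consolidated."""
        self.short_term.append(item)
        self.seen_counts[item] = self.seen_counts.get(item, 0) + 1
        if self.seen_counts[item] >= self.threshold:
            # Consolidation: the item survives even after it leaves
            # the short-term buffer.
            self.long_term[item] = self.seen_counts[item]

    def recall(self, item):
        """Recent items and consolidated items are recallable;
        everything else has effectively been forgotten."""
        return item in self.short_term or item in self.long_term


memory = TwoTierMemory()
for word in ["cat", "dog", "cat", "fish", "cat",
             "bird", "sun", "moon", "star", "rain", "wind"]:
    memory.observe(word)

print(memory.recall("cat"))   # True  - consolidated into the long-term store
print(memory.recall("dog"))   # False - fell out of the short-term buffer
print(memory.recall("star"))  # True  - still in the short-term buffer
```

In this toy design, repetition stands in for the rehearsal that consolidates human memories, while the bounded buffer stands in for short-term forgetting.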

Further reading