Tuesday, December 16, 2025

Build persistent memory for agentic AI applications with Mem0 Open Source, Amazon ElastiCache for Valkey, and Amazon Neptune Analytics


Today, we’re announcing a new integration between Mem0 Open Source, Amazon ElastiCache for Valkey, and Amazon Neptune Analytics to provide persistent memory capabilities to agentic AI applications. This integration solves a critical challenge when building agentic AI applications: without persistent memory, agents forget everything between conversations, making it impossible to deliver personalized experiences or complete multi-step tasks effectively.

In this post, we show how you can use this new Mem0 integration with AWS databases to perform the following actions:

  • Store and retrieve conversation history across multiple sessions using the high-performance vector storage of ElastiCache for Valkey
  • Track complex entity relationships with Neptune Analytics for richer and more contextual responses
  • Scale memory operations to handle millions of requests with sub-millisecond latency

This integration works with agentic frameworks compatible with Mem0 Open Source and can be hosted using Amazon Bedrock AgentCore Runtime. AgentCore Runtime is framework agnostic and lets your agent run with different large language models (LLMs), such as models offered by Amazon Bedrock, Anthropic Claude, Google Gemini, and OpenAI.

We walk through a simple implementation to get you started. The sample creates a GitHub repository research agent built with Strands Agents, a framework for building AI agents. We show you the architecture, code, and performance improvements you can expect from this integration.

With this integration, you can build a self-managed memory layer that combines the capabilities of Mem0 with the storage capabilities of ElastiCache for Valkey and Neptune Analytics. If you prefer a managed solution, you can use Mem0 Platform or AgentCore Memory.

Understanding the challenge of stateless AI agents

An agentic AI application is a system that takes actions and makes decisions based on input. These agents use external tools, APIs, and multi-step reasoning to complete complex tasks. However, agents don’t retain memory between conversations by default, which limits their ability to provide personalized responses or maintain context across sessions.

Agentic memory handles the persistence, encoding, storage, retrieval, and summarization of knowledge gained through user interactions. This memory system is a critical part of the context management component of an agentic AI application, enabling agents to learn from past conversations and apply that knowledge to future interactions.

Agentic memory consists of two main types:

  • Short-term memory – Maintains context within a single session, tracking the current conversation flow and recent interactions
  • Long-term memory – Stores information across multiple sessions, enabling agents to remember user preferences, past decisions, and historical context for future conversations
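To make the distinction concrete, the following is a minimal sketch of the two memory types in plain Python. The class names `ShortTermMemory` and `LongTermMemory` are hypothetical and are not part of Mem0 or Strands; a bounded buffer stands in for session context and a dictionary stands in for a database.

```python
from collections import deque


class ShortTermMemory:
    """Keeps only the most recent turns of a single session."""

    def __init__(self, max_turns: int = 4) -> None:
        self.turns = deque(maxlen=max_turns)

    def add(self, role: str, text: str) -> None:
        self.turns.append((role, text))

    def context(self) -> list:
        return list(self.turns)


class LongTermMemory:
    """Keeps facts per user across sessions (a dict stands in for a database)."""

    def __init__(self) -> None:
        self.facts = {}

    def remember(self, user_id: str, fact: str) -> None:
        self.facts.setdefault(user_id, []).append(fact)

    def recall(self, user_id: str) -> list:
        return list(self.facts.get(user_id, []))


long_term = LongTermMemory()

# Session 1: the user states a preference, which is promoted to long-term memory.
session_1 = ShortTermMemory()
session_1.add("user", "I prefer Python projects.")
long_term.remember("alice", "prefers Python projects")

# Session 2: short-term memory starts empty, long-term memory survives.
session_2 = ShortTermMemory()
print(session_2.context())        # []
print(long_term.recall("alice"))  # ['prefers Python projects']
```

In a real deployment the long-term store would be an external database, so that memories outlive the agent process.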

Solution overview

This memory-enabled agent architecture uses five components that work together to store, retrieve, and manage persistent memory:

  • Amazon Bedrock AgentCore Runtime – AgentCore Runtime provides a hosting environment for deploying and running our agent. It provides access to the LLM and embedding models required for our architecture.
  • Strands Agents – Strands is a code-first framework for building agents. Strands manages LLM invocations, tool execution, and user conversations. It supports multiple LLMs, including those from Amazon Bedrock, Anthropic, Gemini, and OpenAI. In this example, Strands orchestrates an agent that uses an HTTP request tool to browse the web, but you can also use it to build multi-agent systems. Strands includes integration with Mem0 for memory management.
  • Mem0 – This memory orchestration layer sits between AI agents and storage systems. Mem0 manages the memory lifecycle, from extracting information from agent interactions to storing and retrieving it efficiently. It provides unified APIs for working with different memory types, including episodic, semantic, procedural, and associative memories. Mem0 handles memory operations such as automatic filtering to prevent memory bloat, decay mechanisms that remove irrelevant information over time, and cost optimization features that reduce LLM expenses through prompt injection and semantic caching.
  • Amazon ElastiCache for Valkey – This managed in-memory data store serves as the vector storage component of this memory architecture. ElastiCache for Valkey uses Valkey’s open source vector similarity search capabilities to store high-dimensional vector embeddings. This enables semantic memory retrieval, allowing agents to find relevant memories based on meaning rather than exact keyword matches. ElastiCache for Valkey provides microsecond-level latency for memory operations, making it suitable for real-time agent interactions. The service supports real-time index updates so new memories become immediately searchable, and includes semantic caching capabilities that reduce LLM costs by identifying and reusing responses to similar queries.
  • Amazon Neptune Analytics – This in-memory graph analytics store supports querying complex entity relationships and associations. Neptune Analytics represents the connections between people, concepts, events, and ideas as a knowledge graph. This enables multi-hop reasoning across connected memories, allowing agents to traverse relationship paths to discover relevant context beyond what vector similarity search alone can find. The service supports hybrid retrieval strategies that combine graph traversal with vector similarity search.
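The idea behind semantic retrieval can be sketched in a few lines of Python. The example below is an illustration only: it scores every stored vector against the query with cosine similarity (an exhaustive scan), uses invented three-dimensional embeddings, and does not reflect how Valkey implements its search index. The function names `cosine_similarity` and `flat_search` are assumptions made for this sketch.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def flat_search(index, query_vector, top_k=2):
    """Score every stored vector against the query and keep the best top_k."""
    scored = [(cosine_similarity(vec, query_vector), text) for text, vec in index.items()]
    scored.sort(reverse=True)
    return [text for _, text in scored[:top_k]]


# Toy 3-dimensional embeddings, invented for illustration only.
index = {
    "mem0 is a memory layer for agents": [0.9, 0.1, 0.0],
    "valkey is an in-memory data store": [0.1, 0.9, 0.0],
    "the user likes hiking": [0.0, 0.1, 0.9],
}

query = [0.8, 0.2, 0.0]  # pretend embedding of "agent memory library"
print(flat_search(index, query, top_k=1))
```

The query shares no exact wording requirement with the stored text; the closest vector wins, which is what lets an agent match on meaning rather than keywords.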

The following diagram illustrates the data flow from AgentCore Runtime to Mem0 and AWS databases for persistence.

In the following sections, we show how to build a GitHub repository research agent that helps developers find relevant projects and remembers the key metrics for the project and the parts of the project where different users work.

Prerequisites

The prerequisites for this post include:

  1. An ElastiCache cluster running Valkey 8.2. Valkey 8.2 includes support for vector similarity search. Follow the steps in the ElastiCache documentation to create one.
  2. A Neptune Analytics graph configured to support vector indexes and public access. Follow the steps in the Neptune documentation to create your graph.

Create demo agent

To create our demo agent using Strands, complete the following steps:

  1. Install Strands and the tools development packages:
    pip install strands-agents
    pip install strands-agents-tools strands-agents-builder

  2. Initialize your agent and call it with the user prompt:
    from strands import Agent
    from strands_tools import http_request

    # Initialize the agent with access to the tool that browses the web
    agent = Agent(tools=[http_request])
    # Strands expects: [{"role": "user", "content": [{"text": "hello"}]}]
    formatted_messages = [{"role": "user", "content": [{"text": "What is the URL for the project mem0 and its most important metrics?"}]}]
    result = agent(formatted_messages)

Without memory, the agent performs the same research tasks repeatedly for each request. The following screenshot shows how the agent works without memory capabilities. The agent makes three tool calls to answer the request, using 70,373 tokens and taking 9.25 seconds to complete.

Add memory capabilities with ElastiCache for Valkey

Now let’s add Mem0 to store the agent’s memories in AWS databases. We use Mem0 with ElastiCache for Valkey as the vector store to save and retrieve memories for each repository the agent discovers. When the agent finds a repository, it stores that information as a memory for future use.

To add Mem0 with ElastiCache for Valkey, complete the following steps:

  1. Follow the instructions to access your ElastiCache cluster so you can connect to clusters from your development desktop.
  2. Install mem0ai and the Valkey vector store connector in your project:
    pip install mem0ai
    pip install "mem0ai[vector_stores]"

  3. Configure Valkey as the vector store. ElastiCache for Valkey supports vector search capabilities starting with version 8.2. Configure the Valkey connector for Mem0 following the instructions in the Mem0 documentation:
    from mem0 import Memory
    # Configure Mem0 with ElastiCache for Valkey
    config = {
        "vector_store": {
            "provider": "valkey",
            "config": {
                "valkey_url": "your-elasticache-cluster.cache.amazonaws.com:6379",
                "index_name": "agent_memory",
                "embedding_model_dims": 1024,
                "index_type": "flat"
            }
        }
    }
    m = Memory.from_config(config)

  4. Now you can add two new tools to the Strands agent so it can store and search for memories. The @tool decorator provides a straightforward way to transform regular Python functions into tools that the agent can use. The agent initialization must happen after the tool definition.
    from strands import Agent, tool
    from strands_tools import http_request

    @tool
    def store_memory_tool(information: str, user_id: str = "user") -> str:
        """Store important information in long-term memory."""
        memory_message = [{"role": "user", "content": information}]

        # Create new memories using Mem0 and store them in Valkey
        m.add(memory_message, user_id=user_id)

        return f"Stored: {information}"
    @tool
    def search_memory_tool(query: str, user_id: str = "user") -> str:
        """Search stored memories for relevant information."""

        # Search memories that Mem0 stored in Valkey
        results = m.search(query, user_id=user_id)
        if results['results']:
            return "\n".join([r['memory'] for r in results['results']])
        return "No memories found"
    # Initialize Strands agent with memory tools
    agent = Agent(tools=[http_request, store_memory_tool, search_memory_tool])
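If you want to check the store and search logic before connecting to a cluster, one option is to swap in a stand-in for the memory object. The sketch below assumes a hypothetical `FakeMemory` class that mimics only the two calls used in this post (`add` and `search`, returning a dictionary with a `results` list). It matches on naive keyword overlap rather than embeddings, and the functions are left undecorated so the example runs without Strands installed.

```python
class FakeMemory:
    """Hypothetical stand-in for mem0.Memory; matching is naive keyword overlap."""

    def __init__(self):
        self._items = []  # list of (user_id, text) pairs

    def add(self, messages, user_id):
        for message in messages:
            self._items.append((user_id, message["content"]))

    def search(self, query, user_id):
        words = set(query.lower().split())
        hits = [
            {"memory": text}
            for owner, text in self._items
            if owner == user_id and words & set(text.lower().split())
        ]
        return {"results": hits}


m = FakeMemory()


def store_memory(information: str, user_id: str = "user") -> str:
    """Same body as store_memory_tool, without the Strands decorator."""
    m.add([{"role": "user", "content": information}], user_id=user_id)
    return f"Stored: {information}"


def search_memory(query: str, user_id: str = "user") -> str:
    """Same body as search_memory_tool, without the Strands decorator."""
    results = m.search(query, user_id=user_id)
    if results["results"]:
        return "\n".join(r["memory"] for r in results["results"])
    return "No memories found"


print(store_memory("mem0 repository url is github.com/mem0ai/mem0"))
print(search_memory("mem0 url"))
print(search_memory("mem0 url", user_id="someone-else"))
```

Because memories are scoped by `user_id`, the last call finds nothing: a different user does not see the first user’s memories.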

With memory enabled, the agent remembers repository information from previous searches. When you make the same request again, the agent retrieves the information from memory instead of making tool calls, using only 6,344 tokens and completing the request in 2 seconds. This is about 11 times fewer tokens and more than 3 times faster than without memory because the agent no longer needs to scan the web looking for this information.
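The ratios can be recomputed directly from the figures reported in this post. The short calculation below uses only those four numbers; the 2-second figure is taken as reported and may be rounded.

```python
# Figures reported for the same request without and with memory.
tokens_without, tokens_with = 70_373, 6_344
seconds_without, seconds_with = 9.25, 2.0

token_ratio = tokens_without / tokens_with
speedup = seconds_without / seconds_with

print(f"Token reduction: {token_ratio:.1f}x")  # Token reduction: 11.1x
print(f"Speedup: {speedup:.1f}x")              # Speedup: 4.6x
```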

Enhance with graph memory using Neptune Analytics

The graph extension in Mem0 outperforms traditional memory systems by using graph-based memory representations. It can capture complex relationships between entities and support advanced reasoning across interconnected facts. Compared to traditional systems, graph memory excels in temporal and open-domain tasks, achieving higher accuracy and better semantic coherence. It also offers structured relational clarity, which is particularly helpful for tasks requiring nuanced contextual and temporal integration. You can use Neptune Analytics as the graph store in Mem0 for memory retrieval and reasoning, enabling long-term memory for AI agents across richly connected graphs and powering more personalized and context-aware AI experiences. It enables graph-based long-term memory for LLMs by using Neptune as an external memory store, improving response quality through multi-hop graph reasoning, and supporting hybrid retrieval across graph, vector, and keyword modalities.

Let’s improve our agent to support more complicated queries that consider graph relationships. We will allow our agent to store information about users of a repository in a graph data store so it can later help us answer questions about where different users work and their areas of expertise.

To add support for memories using Neptune Analytics graphs, complete the following steps:

  1. Install the Mem0 graph extension:
    pip install "mem0ai[graph]"

  2. Configure Neptune Analytics as the graph store. Neptune Analytics supports vector search indexes for graph data. For instructions to configure the Neptune Analytics connector for Mem0, refer to the Mem0 documentation.
    from mem0 import Memory
    config = {
        "graph_store": {
            "provider": "neptune",
            "config": {
                "endpoint": "neptune-graph://",
            },
        },
    }
    m = Memory.from_config(config_dict=config)

  3. Add memory tools for graph relationships. This allows the agent to pick the best destination for its memories based on the type of data it’s storing.
    @tool
    def store_graph_memory_tool(information: str, user_id: str = "user", category: str = "relationships") -> str:
        """Store relationship-based information, connections or structured data in graph-based memory."""
        memory_message = [{"role": "user", "content": f"RELATIONSHIP: {information}"}]
        m.add(memory_message, user_id=user_id,
              metadata={"category": category, "storage_type": "graph"})
        return f"Stored: {information}"
    @tool
    def search_graph_memory_tool(query: str, user_id: str = "user") -> str:
        """Search through graph-based memories to find relationship and connection information."""
        graph_query = f"relationships connections {query}"
        results = m.search(graph_query, user_id=user_id)
        # Return the matching memories as text, mirroring search_memory_tool
        if results['results']:
            return "\n".join([r['memory'] for r in results['results']])
        return "No graph memories found"

The research agent can now use a graph-based memory system to store and retrieve information. It represents developer preferences and project details as entities (nodes) and their relationships (edges) in a knowledge graph. When retrieving relevant projects, the agent identifies key entities (such as developer preferences and project attributes) and uses embeddings to find similar nodes. It then traverses the relationships between these nodes to assemble a subgraph of relevant projects and preferences based on the user’s history.
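The traversal step can be pictured with a small breadth-first search. The following sketch is an illustration under stated assumptions, not how Mem0 or Neptune Analytics execute queries: the graph is a hand-written list of triples with invented names, edges are treated as undirected, and the helper names `neighbors` and `subgraph` are hypothetical.

```python
from collections import deque

# Toy knowledge graph as (source, relationship, target) triples.
# All names are invented for illustration.
edges = [
    ("alice", "CONTRIBUTES_TO", "core"),
    ("bob", "CONTRIBUTES_TO", "sdk"),
    ("core", "PART_OF", "example-project"),
    ("sdk", "PART_OF", "example-project"),
    ("alice", "WORKS_AT", "example-corp"),
]


def neighbors(node):
    """Yield (relationship, other node) pairs, treating edges as undirected."""
    for src, rel, dst in edges:
        if src == node:
            yield rel, dst
        elif dst == node:
            yield rel, src


def subgraph(start, max_hops):
    """Breadth-first traversal returning each reachable node and its hop count."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for _, other in neighbors(node):
            if other not in seen:
                seen[other] = seen[node] + 1
                queue.append(other)
    return seen


print(subgraph("example-project", max_hops=1))
print(subgraph("example-project", max_hops=2))
```

With one hop, only the modules of the project are reached. With two hops, the contributors to those modules appear as well, which is the kind of connected context that a vector lookup on the project alone would not surface.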

Let’s test our agent’s ability to store information in a knowledge graph. We start by asking our agent to find the top contributors for Mem0 and the modules where they have worked the most. We also ask the agent to store this information in a graph. The following screenshot shows our results.

Now we can test how good our agent is at using this newly created graph of memories by asking for information that requires traversing the graph. We ask who works in the core packages and the SDK updates.

Finally, we can ask who the contributors are for the core package.

Clean up

You can remove any unnecessary resources once you are finished testing this agent, including the ElastiCache cluster and Neptune graph used to store agentic memories.

Conclusion

The integration between Mem0, ElastiCache for Valkey, and Neptune Analytics provides a production-ready solution for adding persistent memory to your AI agents. By combining vector storage for semantic memory with graph-based relationship tracking, you can build agents that remember user preferences, maintain conversation context, and deliver personalized experiences across sessions.

You can dive deeper into agentic memory concepts by exploring the Mem0 documentation, reviewing the Strands Agents user guide, and learning more about Valkey vector search and Neptune graph queries in the AWS documentation.


About the authors

Deshraj Yadav

Deshraj is the Co-founder and CTO of Mem0.ai, the Memory Layer for Personalized AI. Previously at Tesla, he led the development of the AI Platform for Autopilot. His focus spans large-scale AI systems, with particular expertise in machine learning infrastructure and artificial intelligence.

David Castro

David is a Principal Product Manager at AWS, spearheading the development of AWS database capabilities for agentic memory and next-generation caching solutions. With over eight years at Amazon since 2017, he has consistently driven the vision, strategy, and execution of highly scalable data products across Alexa and AWS. Currently, David leads product for Amazon ElastiCache Serverless.

Ozan Eken

Ozan is a Product Manager at AWS, and is passionate about building cutting-edge generative AI and graph analytics products. With a focus on simplifying complex data challenges, Ozan helps customers unlock deeper insights and accelerate innovation. Outside of work, he enjoys trying new foods, exploring different countries, and watching soccer.

Swarnaprakash Udayakumar

Swarnaprakash is a Principal Engineer at AWS, where he has been leading technical innovations across multiple AWS services over the past 11 years. Prakash led the technical strategy and execution for several key initiatives within Amazon ElastiCache, such as serverless, auto scaling, and large cluster support.
