In this CodeLab, you'll learn how to build a Hotel Search Agent using LangChain, Couchbase AI Services, and Agent Catalog. We will also incorporate Arize Phoenix for observability and evaluation to make sure our agent performs reliably.
This tutorial takes you from zero to a fully functional agent that can search for hotels, filter by amenities, and answer natural language queries using real-world data.
Note: You can find the complete Google Colab notebook for this CodeLab here.
What Are Couchbase AI Services?
Building AI applications often involves juggling multiple services: a vector database for memory, an inference provider for LLMs (like OpenAI or Anthropic), and separate infrastructure for embedding models.
Couchbase AI Services streamlines this by providing a unified platform where your operational data, vector search, and AI models live together. It offers:
- LLM inference and embeddings API: Access popular LLMs (like Llama 3) and embedding models directly within Couchbase Capella, with no external API keys, no extra infrastructure, and no data egress. Your application data stays inside Capella: queries, vectors, and model inference all happen where the data lives. This enables secure, low-latency AI experiences while meeting privacy and compliance requirements. The key value: data and AI together, without sending sensitive information outside your system.
- Unified platform: Database + Vectorization + Search + Model
- Integrated vector search: Perform semantic search directly on your JSON data with millisecond latency.
Why Is This Needed?
As we move from simple chatbots to agentic workflows, where AI models autonomously use tools, latency and setup complexity become bottlenecks. By co-locating your data and AI services, you reduce operational overhead and latency. Additionally, tools like the Agent Catalog help you manage hundreds of agent prompts and tools and provide built-in logging for your agents.
Prerequisites
Before we begin, ensure you have:
- A Couchbase Capella account.
- Python 3.10+ installed.
- Basic familiarity with Python and Jupyter notebooks.
Create a Cluster in Couchbase Capella
- Log into Couchbase Capella.
- Create a new cluster or use an existing one. Note that the cluster must run the latest version of Couchbase Server 8.0 with the Data, Query, Index, and Eventing services.
- Create a bucket.
- Create a scope and collection for your data.
Step 1: Install Dependencies
We'll start by installing the required packages. This includes the couchbase-infrastructure helper for setup, the agentc CLI for the catalog, and the LangChain integration packages.

```python
%pip install -q "pydantic>=2.0.0,<3.0.0" "python-dotenv>=1.0.0,<2.0.0" "pandas>=2.0.0,<3.0.0" "nest-asyncio>=1.6.0,<2.0.0" "langchain-couchbase>=0.2.4,<0.5.0" "langchain-openai>=0.3.11,<0.4.0" "arize-phoenix>=11.37.0,<12.0.0" "openinference-instrumentation-langchain>=0.1.29,<0.2.0" "couchbase-infrastructure"

# Install Agent Catalog
%pip install agentc==1.0.0
```
Step 2: Infrastructure as Code
Instead of manually clicking through the UI, we use the couchbase-infrastructure package to programmatically provision our Capella environment. This ensures a reproducible setup.
We'll:
- Create a Project and Cluster.
- Deploy an Embedding Model (nvidia/llama-3.2-nv-embedqa-1b-v2) and an LLM (meta/llama3-8b-instruct).
- Load the travel-sample dataset.
Couchbase AI Services provides OpenAI-compatible endpoints that are used by the agents.
```python
import os
from getpass import getpass

from couchbase_infrastructure import CapellaConfig, CapellaClient
from couchbase_infrastructure.resources import (
    create_project,
    create_developer_pro_cluster,
    add_allowed_cidr,
    load_sample_data,
    create_database_user,
    deploy_ai_model,
    create_ai_api_key,
)

# 1. Collect credentials
management_api_key = getpass("Enter your MANAGEMENT_API_KEY: ")
organization_id = input("Enter your ORGANIZATION_ID: ")

config = CapellaConfig(
    management_api_key=management_api_key,
    organization_id=organization_id,
    project_name="agent-app",
    cluster_name="agent-app-cluster",
    db_username="agent_app_user",
    sample_bucket="travel-sample",
    # Using Couchbase AI Services for models
    embedding_model_name="nvidia/llama-3.2-nv-embedqa-1b-v2",
    llm_model_name="meta/llama3-8b-instruct",
)

# 2. Provision the cluster
client = CapellaClient(config)
org_id = client.get_organization_id()
project_id = create_project(client, org_id, config.project_name)
cluster_id = create_developer_pro_cluster(client, org_id, project_id, config.cluster_name, config)

# 3. Network & data setup
add_allowed_cidr(client, org_id, project_id, cluster_id, "0.0.0.0/0")  # Allow all IPs for the tutorial
load_sample_data(client, org_id, project_id, cluster_id, config.sample_bucket)
db_password = create_database_user(client, org_id, project_id, cluster_id, config.db_username, config.sample_bucket)

# 4. Deploy AI models
print("Deploying AI Models...")
deploy_ai_model(client, org_id, config.embedding_model_name, "agent-hub-embedding-model", "embedding", config)
deploy_ai_model(client, org_id, config.llm_model_name, "agent-hub-llm-model", "llm", config)

# 5. Generate API keys
api_key = create_ai_api_key(client, org_id, config.ai_model_region)
```
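Because the deployed models are exposed through OpenAI-compatible endpoints, you can sanity-check them with any OpenAI client before wiring up the agent. The sketch below is not part of the original notebook; it assumes the endpoint, key, and model name have been exported as the environment variables used later in this tutorial.

```python
# A quick sanity check (a sketch, not from the original notebook): call the
# deployed LLM through its OpenAI-compatible endpoint. Assumes the
# CAPELLA_API_LLM_* environment variables from later sections are already set.
import os

from openai import OpenAI

client = OpenAI(
    base_url=os.environ["CAPELLA_API_LLM_ENDPOINT"] + "/v1",
    api_key=os.environ["CAPELLA_API_LLM_KEY"],
)
completion = client.chat.completions.create(
    model=os.environ["CAPELLA_API_LLM_MODEL"],
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
)
print(completion.choices[0].message.content)
```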
Make sure to follow the steps to set up the security root certificate. Secure connections to Couchbase Capella require a root certificate for TLS verification. You can find this in the ## 📜 Root Certificate Setup section of the Google Colab notebook.
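For reference, connecting to the cluster with the Python SDK and that root certificate looks roughly like the sketch below. The connection string and certificate path are placeholders for your own environment; the username and password come from the provisioning step above.

```python
# A minimal sketch (assumptions noted): connect to Capella over TLS using the
# downloaded root certificate. Replace the hostname and cert path with your own.
from datetime import timedelta

from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions

auth = PasswordAuthenticator(
    "agent_app_user",              # db_username from the config above
    db_password,                   # returned by create_database_user
    cert_path="capella-root.pem",  # path where you saved the root certificate
)
cluster = Cluster("couchbases://<your-cluster-hostname>", ClusterOptions(auth))
cluster.wait_until_ready(timedelta(seconds=10))
```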
Step 3: Integrating Agent Catalog
The Agent Catalog is a powerful tool for managing the lifecycle of your agent's capabilities. Instead of hardcoding prompts and tool definitions in your Python files, you manage them as versioned assets. You can centralize and reuse your tools across your development teams. You can also examine and monitor agent responses with the Agent Tracer.
Initialize and Download Assets
First, we initialize the catalog and download our pre-defined prompts and tools.
```
!git init
!agentc init

# Download example tools and prompts
!mkdir -p prompts tools
!wget -O prompts/hotel_search_assistant.yaml https://raw.githubusercontent.com/couchbase-examples/agent-catalog-quickstart/refs/heads/main/notebooks/hotel_search_agent_langchain/prompts/hotel_search_assistant.yaml
!wget -O tools/search_vector_database.py https://raw.githubusercontent.com/couchbase-examples/agent-catalog-quickstart/refs/heads/main/notebooks/hotel_search_agent_langchain/tools/search_vector_database.py
!wget -O agentcatalog_index.json https://raw.githubusercontent.com/couchbase-examples/agent-catalog-quickstart/refs/heads/main/notebooks/hotel_search_agent_langchain/agentcatalog_index.json
```
Index and Publish
We use agentc to index our local files and publish them to Couchbase. This stores the metadata in your database, making it searchable and discoverable by the agent at runtime.

```
# Create a local index of tools and prompts
!agentc index .

# Upload to Couchbase
!agentc publish
```
Step 4: Preparing the Vector Store
To enable our agent to search for hotels semantically (e.g., “cozy place near the beach”), we need to generate vector embeddings for our hotel data.
We define a helper to format our hotel data into a rich text representation, prioritizing location and amenities.
```python
from langchain_couchbase.vectorstores import CouchbaseVectorStore


def load_hotel_data_to_couchbase(cluster, bucket_name, scope_name, collection_name, embeddings, index_name):
    # Check if the data already exists
    # ... (omitted for brevity) ...

    # Generate rich text for each hotel
    # e.g., "Le Clos Fleuri in Giverny, France. Amenities: Free breakfast: Yes..."
    hotel_texts = get_hotel_texts()

    # Initialize the vector store connected to Capella
    vector_store = CouchbaseVectorStore(
        cluster=cluster,
        bucket_name=bucket_name,
        scope_name=scope_name,
        collection_name=collection_name,
        embedding=embeddings,
        index_name=index_name,
    )

    # Batch add the texts
    vector_store.add_texts(texts=hotel_texts)
    print(f"Successfully loaded {len(hotel_texts)} hotel embeddings")
```
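The notebook omits the body of get_hotel_texts. A minimal sketch of what such a helper could look like is shown below; it assumes the hotel documents follow the travel-sample schema (name, city, country, and boolean amenity fields) and, unlike the no-argument call above, takes the cluster handle explicitly for clarity.

```python
# A sketch only: build a rich text string per hotel from the travel-sample
# `inventory.hotel` collection. Field names assume the travel-sample schema.
def get_hotel_texts(cluster):
    query = """
        SELECT h.name, h.city, h.country,
               h.free_breakfast, h.free_internet, h.free_parking
        FROM `travel-sample`.inventory.hotel AS h
    """
    texts = []
    for row in cluster.query(query):
        amenities = (
            f"Free breakfast: {'Yes' if row.get('free_breakfast') else 'No'}, "
            f"Free internet: {'Yes' if row.get('free_internet') else 'No'}, "
            f"Free parking: {'Yes' if row.get('free_parking') else 'No'}"
        )
        texts.append(f"{row['name']} in {row['city']}, {row['country']}. Amenities: {amenities}")
    return texts
```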
Step 5: Building the LangChain Agent
We use the Agent Catalog to fetch our tool definitions and prompts dynamically. The code stays generic, while your capabilities (tools) and personality (prompts) are managed separately. We will also create our ReAct agent.
```python
import agentc
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.prompts import PromptTemplate
from langchain_core.tools import Tool


def create_langchain_agent(self, catalog, span):
    # 1. Set up AI services using the Capella endpoints
    embeddings, llm = setup_ai_services(framework="langchain")

    # 2. Discover tools from the catalog
    # The catalog.find() method searches your published catalog
    tool_search = catalog.find("tool", name="search_vector_database")

    tools = [
        Tool(
            name=tool_search.meta.name,
            description=tool_search.meta.description,
            func=tool_search.func,  # The actual Python function
        ),
    ]

    # 3. Discover the prompt from the catalog
    hotel_prompt = catalog.find("prompt", name="hotel_search_assistant")

    # 4. Assemble the prompt template
    custom_prompt = PromptTemplate(
        template=hotel_prompt.content.strip(),
        input_variables=["input", "agent_scratchpad"],
        partial_variables={
            "tools": "\n".join([f"{tool.name}: {tool.description}" for tool in tools]),
            "tool_names": ", ".join([tool.name for tool in tools]),
        },
    )

    # 5. Create the ReAct agent
    agent = create_react_agent(llm, tools, custom_prompt)

    agent_executor = AgentExecutor(
        agent=agent,
        tools=tools,
        verbose=True,
        handle_parsing_errors=True,  # Auto-correct formatting errors
        max_iterations=5,
        return_intermediate_steps=True,
    )

    return agent_executor
```
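The setup_ai_services helper isn't shown in the notebook excerpt above. Assuming it simply wraps the OpenAI-compatible Capella Model Services endpoints with LangChain's OpenAI integrations, a sketch could look like this (the embedding-related environment variable names are assumptions; the LLM ones match those used in the caching section later):

```python
# A sketch of the helper used above, under the assumption that it points
# LangChain's OpenAI classes at the Capella endpoints. The embedding variable
# names (CAPELLA_API_EMBEDDING_*) are hypothetical placeholders.
import os

from langchain_openai import ChatOpenAI, OpenAIEmbeddings


def setup_ai_services(framework="langchain"):
    embeddings = OpenAIEmbeddings(
        model=os.environ["CAPELLA_API_EMBEDDING_MODEL"],
        base_url=os.environ["CAPELLA_API_EMBEDDING_ENDPOINT"] + "/v1",
        api_key=os.environ["CAPELLA_API_LLM_KEY"],
    )
    llm = ChatOpenAI(
        model=os.environ["CAPELLA_API_LLM_MODEL"],
        base_url=os.environ["CAPELLA_API_LLM_ENDPOINT"] + "/v1",
        api_key=os.environ["CAPELLA_API_LLM_KEY"],
        temperature=0,
    )
    return embeddings, llm
```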
Step 6: Running the Agent
With the agent initialized, we can perform complex queries. The agent will:
- Receive the user input.
- Decide it needs to use the search_vector_database tool.
- Execute the search against Capella.
- Synthesize the results into a natural language response.
```python
# Initialize the Agent Catalog
catalog = agentc.catalog.Catalog()
span = catalog.Span(name="Hotel Support Agent", blacklist=set())

# Create the agent
agent_executor = couchbase_client.create_langchain_agent(catalog, span)

# Run a query
query = "Find hotels in Giverny with free breakfast"
response = agent_executor.invoke({"input": query})

print(f"User: {query}")
print(f"Agent: {response['output']}")
```
Example Output:
Agent: I found a hotel in Giverny that offers free breakfast called Le Clos Fleuri. It is located at 5 rue de la Dîme, 27620 Giverny. It offers free internet and parking as well.
Note: In Capella Model Services, model outputs can be cached (both semantic and standard caching). Caching improves the RAG pipeline's efficiency and speed, particularly for repeated or similar queries. When a query is first processed, the LLM generates a response and stores it in Couchbase; when similar queries come in later, the cached response is returned. The cache duration can be configured in Capella Model Services.
Adding Semantic Caching
Caching is particularly valuable in scenarios where users may submit similar queries multiple times or where certain pieces of information are frequently requested. By serving these from a cache, we can significantly reduce response times and improve the user experience.
```python
import time

from langchain_openai import ChatOpenAI

## Semantic Caching Demonstration

# This section demonstrates how to enable and use Semantic Caching with Capella Model Services.
# Semantic caching stores responses for queries and reuses them for semantically similar future
# queries, significantly reducing latency and cost.

# 1. Set up the LLM with Semantic Caching enabled
# We pass the "X-cb-cache": "semantic" header to enable the feature
print("Setting up LLM with Semantic Caching enabled...")
llm_with_cache = ChatOpenAI(
    model=os.environ["CAPELLA_API_LLM_MODEL"],
    base_url=os.environ["CAPELLA_API_LLM_ENDPOINT"] + "/v1"
    if not os.environ["CAPELLA_API_LLM_ENDPOINT"].endswith("/v1")
    else os.environ["CAPELLA_API_LLM_ENDPOINT"],
    api_key=os.environ["CAPELLA_API_LLM_KEY"],
    temperature=0,  # Deterministic for caching
    default_headers={"X-cb-cache": "semantic"},
)

# 2. Define a query and a semantically similar variation
query_1 = "What are the best hotels in Paris with a view of the Eiffel Tower?"
query_2 = "Recommend some hotels in Paris where I can see the Eiffel Tower."

print(f"\nQuery 1: {query_1}")
print(f"Query 2 (Semantically similar): {query_2}")

# 3. First execution (Cache Miss)
print("\nExecuting Query 1 (First run - Cache MISS)...")
start_time = time.time()
response_1 = llm_with_cache.invoke(query_1)
end_time = time.time()
time_1 = end_time - start_time
print(f"Time taken: {time_1:.4f} seconds")
print(f"Response: {response_1.content[:100]}...")

# 4. Second execution (Cache Hit)
# The system should recognize query_2 is semantically similar to query_1 and return the cached response
print("\nExecuting Query 2 (Semantically similar - Cache HIT)...")
start_time = time.time()
response_2 = llm_with_cache.invoke(query_2)
end_time = time.time()
time_2 = end_time - start_time
print(f"Time taken: {time_2:.4f} seconds")
print(f"Response: {response_2.content[:100]}...")
```
Step 7: Observability With Arize Phoenix
In production, you need to know why an agent gave a specific answer. We use Arize Phoenix to trace the agent's “thought process” (the ReAct chain).
We can also run evaluations to check for hallucinations or relevance.
```python
import phoenix as px
from phoenix.evals import llm_classify, LENIENT_QA_PROMPT_TEMPLATE

# 1. Start the Phoenix server
session = px.launch_app()

# 2. Instrument LangChain
from openinference.instrumentation.langchain import LangChainInstrumentor

LangChainInstrumentor().instrument()

# ... Run your agent queries ...

# 3. Evaluate the results
# We use an LLM-as-a-judge to grade our agent's responses
hotel_qa_results = llm_classify(
    data=hotel_eval_df[["input", "output", "reference"]],
    model=evaluator_llm,
    template=LENIENT_QA_PROMPT_TEMPLATE,
    rails=["correct", "incorrect"],
    provide_explanation=True,
)
```
By inspecting the Phoenix UI, you can visualize the exact sequence of tool calls and see the latency of each step in the chain.
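If you prefer to inspect traces programmatically rather than in the UI, Phoenix can also export the collected spans as a DataFrame. A small sketch, assuming the Phoenix server launched above is still running:

```python
# A small sketch (not part of the original notebook): pull the collected spans
# into a pandas DataFrame for offline inspection of tool calls and latencies.
import phoenix as px

spans_df = px.Client().get_spans_dataframe()
print(spans_df.head())
```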
Conclusion
We've successfully built a powerful Hotel Search Agent. This architecture leverages:
- Couchbase AI Services: For a unified, low-latency data and AI layer.
- Agent Catalog: For organized, versioned management of agent tools and prompts. Agent Catalog also provides tracing: you can query traces with SQL++, leverage the performance of Couchbase, and get insight into the details of prompts and tools on the same platform.
- LangChain: For flexible orchestration.
- Arize Phoenix: For observability.
This approach scales well for teams building complex, multi-agent systems where data management and tool discovery are critical challenges.
