How to Build a Local AI Agent with Python (Ollama, LangChain & RAG)

In this guide, I’ll walk you through how to build a fully local AI agent in minutes using:

  • 🐍 Python
  • 🦙 Ollama
  • 🔗 LangChain
  • 🧠 ChromaDB

The best part? No API keys, no cloud—everything runs 100% locally.


What We’re Building

A retrieval-augmented generation (RAG) agent that:

✅ Answers questions from a CSV (e.g. restaurant reviews)
✅ Uses local vector search to retrieve relevant data
✅ Generates natural language answers using a local LLM
✅ Runs entirely offline using open-source models


Demo Preview

Imagine you have a file like reviews.csv with pizza restaurant feedback.

You can ask:

  • “How’s the pizza quality?”
  • “Do they offer vegan options?”

The agent fetches the most relevant reviews and generates insightful answers—powered by a local LLM.


Tools & Setup

1. Install Dependencies

Create a virtual environment and install required packages:

python -m venv venv
source venv/bin/activate  # Mac/Linux
venv\Scripts\activate     # Windows

pip install langchain langchain-ollama langchain-chroma pandas

2. Download Ollama & Models

Install Ollama (https://ollama.com) if you haven’t already, then pull the two models we need:

ollama pull llama3:8b           # Main LLM
ollama pull mxbai-embed-large   # Embedding model
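
Once the pulls finish, you can optionally sanity-check that both models respond from Python using the langchain-ollama package installed above (the file name check_models.py is just a suggestion):

# check_models.py (optional): confirm the LLM and the embedding model both respond
from langchain_ollama import OllamaLLM, OllamaEmbeddings

print(OllamaLLM(model="llama3:8b").invoke("Reply with one word: ready?"))
print(len(OllamaEmbeddings(model="mxbai-embed-large").embed_query("test")))  # prints the embedding dimension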

3. Prepare Your Data

Add your reviews.csv file to the project folder.
Example structure:

| title | review | rating | date |
| --- | --- | --- | --- |
| Great Pizza! | The crust was crispy and cheese melted… | 5 | 2024-01-15 |
| Not so vegan… | Vegan cheese had a weird aftertaste | 2 | 2024-02-10 |
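
If you don’t have real data yet, a tiny script can generate a reviews.csv in this shape (the file name make_sample_csv.py and the example rows are just placeholders):

# make_sample_csv.py: write a small reviews.csv matching the structure above
import pandas as pd

rows = [
    {"title": "Great Pizza!", "review": "The crust was crispy and the cheese melted perfectly.",
     "rating": 5, "date": "2024-01-15"},
    {"title": "Not so vegan…", "review": "Vegan cheese had a weird aftertaste.",
     "rating": 2, "date": "2024-02-10"},
]
pd.DataFrame(rows).to_csv("reviews.csv", index=False)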

🧠 Step 1: Load & Vectorize Data

Create a file: vector.py

# vector.py
import os
import pandas as pd
from langchain_ollama import OllamaEmbeddings
from langchain_chroma import Chroma
from langchain_core.documents import Document

embeddings = OllamaEmbeddings(model="mxbai-embed-large")
db_path = "./chroma_db"

if os.path.exists(db_path):
    # Reuse the existing store so importing this module doesn't re-embed (and duplicate) the reviews
    vector_store = Chroma(embedding_function=embeddings, persist_directory=db_path)
else:
    # Load CSV
    df = pd.read_csv("reviews.csv")

    # Create one LangChain document per review row
    documents = [
        Document(
            page_content=f"{row['title']} {row['review']}",
            # Chroma metadata must be plain str/int/float/bool, so cast the pandas values
            metadata={"rating": int(row["rating"]), "date": str(row["date"])}
        )
        for _, row in df.iterrows()
    ]

    # Embed & store in ChromaDB (persisted automatically to ./chroma_db)
    vector_store = Chroma.from_documents(
        documents=documents,
        embedding=embeddings,
        persist_directory=db_path,
    )
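
Before wiring up the agent, you can check that retrieval works on its own. A quick optional test (the query text is arbitrary):

# Quick retrieval test: run from the project folder
from vector import vector_store

for doc in vector_store.similarity_search("vegan options", k=2):
    print(doc.metadata, "->", doc.page_content)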

🧠 Step 2: Build the AI Agent

Create a file: main.py

# main.py
from langchain_ollama import OllamaLLM
from langchain_core.prompts import ChatPromptTemplate
from vector import vector_store

# Load the local model
llm = OllamaLLM(model="llama3:8b")

# Set up prompt template
template = """You are a helpful assistant analyzing pizza restaurant reviews.
Relevant reviews:
{reviews}

Question:
{question}"""

prompt = ChatPromptTemplate.from_template(template)

# Vector retriever
retriever = vector_store.as_retriever(search_kwargs={"k": 5})

# Define the chain: prompt -> LLM
chain = prompt | llm

# CLI loop
while True:
    question = input("Ask a question (Q to quit): ")
    if question.lower() == "q":
        break

    # Retrieve relevant content
    docs = retriever.invoke(question)
    context = "\n".join(doc.page_content for doc in docs)

    # Get answer
    answer = chain.invoke({"reviews": context, "question": question})
    print("\n🧠 AI Response:\n", answer, "\n")
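
If you’d rather fold retrieval into the chain itself instead of calling the retriever manually inside the loop, here is one possible LCEL variant using the same retriever, prompt, and llm defined above (not required for the tutorial):

# Optional LCEL variant: the retriever fills {reviews}, the raw question fills {question}
from langchain_core.runnables import RunnableLambda, RunnablePassthrough

rag_chain = (
    {
        "reviews": retriever | RunnableLambda(lambda docs: "\n".join(d.page_content for d in docs)),
        "question": RunnablePassthrough(),
    }
    | prompt
    | llm
)

print(rag_chain.invoke("Are there vegan options?"))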

🧪 Example Output

Question: “Are there vegan options?”
AI Response:

“Several reviews mention vegan options. While one called the vegan pizza a ‘hidden gem’, another criticized the vegan cheese. Overall sentiment: mixed.”


🎯 Why This Matters

✅ Runs fully offline
✅ Customizable & private
✅ Fast prototyping with real local AI

You can even extend it to other sources (see the PDF sketch after this list):

  • PDFs
  • Webpages
  • Entire folders or databases
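
For PDFs, the only file that changes is vector.py: swap the CSV loading for a document loader. A minimal sketch, assuming you also install langchain-community and pypdf (neither is in the pip line above) and with menu.pdf as a placeholder file name:

# vector_pdf.py (sketch): index a PDF instead of the CSV
# Extra deps assumed: pip install langchain-community pypdf
from langchain_community.document_loaders import PyPDFLoader
from langchain_ollama import OllamaEmbeddings
from langchain_chroma import Chroma

docs = PyPDFLoader("menu.pdf").load()  # one Document per page

vector_store = Chroma.from_documents(
    documents=docs,
    embedding=OllamaEmbeddings(model="mxbai-embed-large"),
    persist_directory="./chroma_db_pdf",
)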

📦 Next Steps

  • 🔁 Swap out llama3 for mistral or gemma in Ollama.
  • 📄 Use langchain_community.document_loaders for PDFs or websites.
  • 💬 Build a UI with Streamlit or Gradio (a minimal Streamlit sketch is below).
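
As a starting point for the UI item above, here is a minimal Streamlit sketch that reuses vector.py (assumes pip install streamlit; save as app.py and run with streamlit run app.py):

# app.py (sketch): tiny Streamlit front end over the same local RAG pipeline
import streamlit as st
from langchain_ollama import OllamaLLM
from langchain_core.prompts import ChatPromptTemplate
from vector import vector_store

llm = OllamaLLM(model="llama3:8b")
prompt = ChatPromptTemplate.from_template(
    "You are a helpful assistant analyzing pizza restaurant reviews.\n"
    "Relevant reviews:\n{reviews}\n\nQuestion:\n{question}"
)
retriever = vector_store.as_retriever(search_kwargs={"k": 5})

st.title("Pizza Review Agent")
question = st.text_input("Ask a question about the reviews")
if question:
    docs = retriever.invoke(question)
    reviews = "\n".join(d.page_content for d in docs)
    st.write((prompt | llm).invoke({"reviews": reviews, "question": question}))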

Full Code on GitHub

🔗 Full Code on GitHub
