In this guide, I’ll walk you through how to build a fully local AI agent in minutes using:
- 🐍 Python
- 🦙 Ollama
- 🔗 LangChain
- 🧠 ChromaDB
The best part? No API keys, no cloud—everything runs 100% locally.
What We’re Building
A retrieval-augmented generation (RAG) agent that:
✅ Answers questions from a CSV (e.g. restaurant reviews)
✅ Uses local vector search to retrieve relevant data
✅ Generates natural language answers using a local LLM
✅ Runs entirely offline using open-source models
Demo Preview
Imagine you have a file like reviews.csv with pizza restaurant feedback. You can ask:
- “How’s the pizza quality?”
- “Do they offer vegan options?”
The agent fetches the most relevant reviews and generates insightful answers—powered by a local LLM.
Tools & Setup
1. Install Dependencies
Create a virtual environment and install required packages:
python -m venv venv
source venv/bin/activate # Mac/Linux
venv\Scripts\activate # Windows
pip install langchain langchain-ollama langchain-chroma pandas
2. Download Ollama & Models
Install Ollama (https://ollama.com) if you haven’t, then pull models:
ollama pull llama3:8b # Main LLM
ollama pull mxbai-embed-large # Embedding model
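Before writing any real code, you can sanity-check that both models respond. The short script below is just a throwaway check (the filename and prompt are arbitrary), assuming the packages from step 1 are installed and the pulls have finished:

# smoke_test.py — optional one-off sanity check
from langchain_ollama import OllamaLLM, OllamaEmbeddings

llm = OllamaLLM(model="llama3:8b")
print(llm.invoke("Reply with the single word: ready"))

embeddings = OllamaEmbeddings(model="mxbai-embed-large")
vector = embeddings.embed_query("test sentence")
print(f"Embedding length: {len(vector)}")  # mxbai-embed-large produces 1024-dimensional vectors

If both prints appear without errors, Ollama and the two models are ready to use.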
3. Prepare Your Data
Add your reviews.csv file to the project folder.
Example structure:
| title | review | rating | date |
|---|---|---|---|
| Great Pizza! | The crust was crispy and cheese melted… | 5 | 2024-01-15 |
| Not so vegan… | Vegan cheese had a weird aftertaste | 2 | 2024-02-10 |
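If you don't have a dataset handy, a tiny placeholder file in the same shape is enough to follow along. This sketch writes one with pandas; the rows are just the examples from this guide:

# make_sample_data.py — writes a small placeholder reviews.csv (sample rows only)
import pandas as pd

pd.DataFrame(
    [
        {"title": "Great Pizza!", "review": "The crust was crispy and the cheese melted perfectly.", "rating": 5, "date": "2024-01-15"},
        {"title": "Not so vegan...", "review": "Vegan cheese had a weird aftertaste.", "rating": 2, "date": "2024-02-10"},
        {"title": "Hidden gem", "review": "The vegan pizza here is a hidden gem.", "rating": 4, "date": "2024-03-02"},
    ]
).to_csv("reviews.csv", index=False)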
🧠 Step 1: Load & Vectorize Data
Create a file: vector.py
# vector.py
from langchain_ollama import OllamaEmbeddings
from langchain_chroma import Chroma
from langchain_core.documents import Document
import pandas as pd

# Load CSV
df = pd.read_csv("reviews.csv")

# Create LangChain documents: one per review, with rating and date as metadata
documents = [
    Document(
        page_content=f"{row['title']} {row['review']}",
        metadata={"rating": row["rating"], "date": row["date"]},
    )
    for _, row in df.iterrows()
]

# Embed & store in ChromaDB (persisted automatically to the given directory)
vector_store = Chroma.from_documents(
    documents=documents,
    embedding=OllamaEmbeddings(model="mxbai-embed-large"),
    persist_directory="./chroma_db",
)
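To confirm the embeddings actually landed in ChromaDB, a quick similarity search against the store works as a throwaway check (it isn't part of the final agent). You can append it temporarily to the bottom of vector.py and run `python vector.py`:

# Quick check: print the two reviews closest to a test query
results = vector_store.similarity_search("vegan options", k=2)
for doc in results:
    print(doc.metadata["rating"], "-", doc.page_content)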
🧠 Step 2: Build the AI Agent
Create a file: main.py
# main.py
from langchain_ollama import OllamaLLM
from langchain_core.prompts import ChatPromptTemplate
from vector import vector_store

# Load the local model
llm = OllamaLLM(model="llama3:8b")

# Set up prompt template
template = """You are a helpful assistant analyzing pizza restaurant reviews.

Relevant reviews:
{reviews}

Question:
{question}"""
prompt = ChatPromptTemplate.from_template(template)

# Vector retriever (top 5 most similar reviews)
retriever = vector_store.as_retriever(search_kwargs={"k": 5})

# Define the chain: prompt -> LLM
chain = prompt | llm

# CLI loop
while True:
    question = input("Ask a question (Q to quit): ")
    if question.lower() == "q":
        break

    # Retrieve relevant content
    docs = retriever.invoke(question)
    context = "\n".join(doc.page_content for doc in docs)

    # Get answer
    answer = chain.invoke({"reviews": context, "question": question})
    print("\n🧠 AI Response:\n", answer, "\n")
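One caveat: because main.py imports vector.py, the CSV is re-embedded on every run. Once ./chroma_db exists on disk, you can open the persisted store directly instead of rebuilding it. A minimal sketch, using the same directory and embedding model as above:

# Reuse the already-persisted ChromaDB instead of re-embedding on every run
from langchain_ollama import OllamaEmbeddings
from langchain_chroma import Chroma

vector_store = Chroma(
    persist_directory="./chroma_db",
    embedding_function=OllamaEmbeddings(model="mxbai-embed-large"),
)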
🧪 Example Output
Question: “Are there vegan options?”
AI Response:
“Several reviews mention vegan options. While one called the vegan pizza a ‘hidden gem’, another criticized the vegan cheese. Overall sentiment: mixed.”
🎯 Why This Matters
✅ Runs Fully Offline
✅ Customizable & Private
✅ Fast Prototyping with Real Local AI
You can even extend it to work with:
- PDFs (see the loader sketch after this list)
- Webpages
- Entire folders or databases
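As one example, here's a hedged sketch of feeding a PDF into the same pipeline. It assumes you additionally install langchain-community and pypdf (`pip install langchain-community pypdf`), and "menu.pdf" is a hypothetical filename:

# load_pdf.py — sketch: turn a PDF into documents for the same Chroma pipeline
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

pages = PyPDFLoader("menu.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_documents(pages)
# `chunks` can be passed to Chroma.from_documents() exactly like the CSV documents above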
📦 Next Steps
- 🔁 Swap out llama3 for mistral or gemma in Ollama.
- 📄 Use langchain_community.document_loaders for PDFs or websites.
- 💬 Build a UI with Streamlit or Gradio (see the sketch below).