*From local ChatGPT to automated workflows • Complete step-by-step guides • Updated Feb 2026*
If you’ve already installed Ollama or LM Studio and picked a model, you might be wondering: “What do I actually build with this thing?”
You’re not alone. Searches tell us exactly what users want next:
| Search Query | Trend | What It Means |
|---|---|---|
| ollama api | -20%* | “How do I connect this to my apps?” |
| ollama python | -10%* | Developers want code examples |
| docker ollama | +4% | “Can I deploy this to the cloud?” |
| ollama webui | -20%* | “Give me a ChatGPT-like interface” |
| open webui | -20%* | Same story, different spelling |
| n8n | +20% | “Automate my workflows with AI” |
*Note: Negative trends don’t mean low interest—they mean search volume normalized after earlier peaks. These are still highly active, high-intent queries.
This guide gives you 7 ready-to-build projects, from simplest (5 minutes) to most advanced (containerized deployment). Each includes:
- ✅ Exact code/commands
- ✅ Prerequisites
- ✅ Expected outcome
- ✅ Next steps to customize
🧭 Quick Start: Which Project Fits You?
```text
What's your goal?
│
├── 🖥️ "I want a ChatGPT-like interface"
│     → Project 1: Local ChatGPT (Open WebUI)
│
├── 💻 "I want AI in VS Code"
│     → Project 2: VS Code Autocomplete (Continue.dev)
│
├── 🔧 "I want to build apps with the API"
│     → Project 3: Python API + FastAPI
│
├── 🤖 "I want to automate workflows"
│     → Project 4: n8n AI Automation (+20% trend!)
│
├── 🐳 "I want to deploy to the cloud"
│     → Project 5: Docker Deployment (+4% trend)
│
├── 📄 "I want to chat with my documents"
│     → Project 6: RAG Document Q&A
│
└── 🎭 "I want roleplay/custom chatbots"
      → Project 7: Custom Chatbots with SillyTavern
```
🏗️ PROJECT 1: Local ChatGPT (Open WebUI)
Time: 10 minutes • Difficulty: Easy • Trend: ollama webui (steady high intent)
What You’ll Build
A ChatGPT-like web interface that runs completely locally—no internet required after setup. Multiple chats, model switching, and a beautiful UI.
Prerequisites
- ✅ Ollama installed (Setup Guide)
- ✅ At least one model pulled (e.g., `llama3.2`, `claude-code:7b`)
Step-by-Step
Option A: Docker (Easiest)
```bash
# Run Open WebUI with one command
docker run -d -p 3000:8080 \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
```
Option B: Native Install
```bash
# Clone the repository
git clone https://github.com/open-webui/open-webui.git
cd open-webui

# Install dependencies
pip install -r requirements.txt

# Run it
python app.py
```
Connect to Ollama
1. Open your browser to `http://localhost:3000`
2. Go to Settings → Connections
3. Set the Ollama URL to `http://localhost:11434`
4. Click “Save”
5. Start chatting! 🎉
What You’ll See
```text
[Model Selector] ▼
┌─────────────────────────────────┐
│                                 │
│  Hello! How can I help today?   │
│                                 │
├─────────────────────────────────┤
│  Write a Python function...     │
└─────────────────────────────────┘
```
Next Steps
- Enable multi-user support (Settings → Users)
- Add RAG for document uploads (Project 6)
- Customize with themes
🖥️ PROJECT 2: VS Code AI Autocomplete (Continue.dev)
Time: 5 minutes • Difficulty: Easy • Trend: ollama python (steady)
What You’ll Build
A free alternative to GitHub Copilot that uses your local models for code autocomplete, chat, and refactoring—all inside VS Code.
Prerequisites
- ✅ VS Code installed
- ✅ Ollama running
- ✅ A coding model (Claude Code 7B or Qwen 2.5 7B recommended)
Step-by-Step
1. Install the Continue extension
   In VS Code, open Extensions (`Ctrl+Shift+X`), search for “Continue”, and click Install.
2. Configure Continue
   Create or edit `~/.continue/config.json`:
   ```json
   {
     "models": [
       {
         "title": "Claude Code",
         "provider": "ollama",
         "model": "claude-code:7b"
       }
     ],
     "tabAutocompleteModel": {
       "title": "Qwen 2.5",
       "provider": "ollama",
       "model": "qwen2.5:7b"
     }
   }
   ```
3. Start coding!
   - Autocomplete: just start typing and suggestions appear
   - Chat: `Cmd+I` (Mac) or `Ctrl+I` (Windows)
   - Edit: highlight code → `Cmd+K` → “Make this faster”
Real Example
```python
# Type this:
def calculate_fibonacci

# Continue suggests automatically:
def calculate_fibonacci(n: int) -> list:
    """Return first n Fibonacci numbers."""
    if n <= 0:
        return []
    elif n == 1:
        return [0]
    fib = [0, 1]
    for i in range(2, n):
        fib.append(fib[i-1] + fib[i-2])
    return fib[:n]
```
Pro Tips
- Use Claude Code 7B for chat/editing
- Use Qwen 2.5 7B for autocomplete (faster)
- Disable autocomplete in `settings.json`: `"continue.enableTabAutocomplete": false`
🔧 PROJECT 3: Build a Local API (Python + FastAPI)
Time: 15 minutes • Difficulty: Intermediate • Trend: ollama api, ollama python
What You’ll Build
A REST API that serves your local models to any app—web apps, mobile apps, Slack bots, or internal tools.
Prerequisites
- ✅ Python 3.9+ installed
- ✅ Ollama running
- ✅ Basic Python knowledge
Step-by-Step
1. Set up the project
bash
mkdir local-llm-api cd local-llm-api python -m venv venv source venv/bin/activate # Windows: venv\Scripts\activate pip install fastapi uvicorn httpx python-dotenv
2. Create `main.py`
```python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import Optional
import httpx
import os
from dotenv import load_dotenv

load_dotenv()

app = FastAPI(title="Local LLM API")

OLLAMA_URL = os.getenv("OLLAMA_URL", "http://localhost:11434")
DEFAULT_MODEL = os.getenv("DEFAULT_MODEL", "llama3.2")


class ChatRequest(BaseModel):
    prompt: str
    model: str = DEFAULT_MODEL
    system: str = "You are a helpful assistant."
    temperature: float = 0.7


class ChatResponse(BaseModel):
    response: str
    model: str
    tokens: Optional[int] = None  # Optional, so a missing token count is fine


@app.get("/")
def root():
    return {"message": "Local LLM API", "docs": "/docs"}


@app.get("/models")
async def list_models():
    """List available models from Ollama."""
    async with httpx.AsyncClient() as client:
        response = await client.get(f"{OLLAMA_URL}/api/tags")
        return response.json()


@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    """Generate a response from the local LLM."""
    payload = {
        "model": request.model,
        "prompt": request.prompt,
        "system": request.system,
        "stream": False,
        "options": {"temperature": request.temperature},
    }
    try:
        async with httpx.AsyncClient(timeout=60.0) as client:
            response = await client.post(
                f"{OLLAMA_URL}/api/generate",
                json=payload,
            )
        result = response.json()
        return ChatResponse(
            response=result["response"],
            model=request.model,
            tokens=result.get("eval_count"),
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))


if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
```
3. Create `.env`
```text
OLLAMA_URL=http://localhost:11434
DEFAULT_MODEL=claude-code:7b
```
4. Run it
```bash
python main.py
```
Test Your API
List models:
```bash
curl http://localhost:8000/models
```
Chat:
```bash
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Write a haiku about coding", "model": "claude-code:7b"}'
```
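The same call from Python, as a minimal sketch (this assumes the FastAPI server above is running on port 8000; the `build_chat_request` and `ask` helper names are my own, not part of any library):

```python
def build_chat_request(prompt: str, model: str = "claude-code:7b",
                       temperature: float = 0.7) -> dict:
    """Build the JSON body that the /chat endpoint above expects."""
    return {"prompt": prompt, "model": model, "temperature": temperature}


def ask(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """POST a prompt to the local API and return the model's reply."""
    import httpx  # local import: the payload builder above works without it
    resp = httpx.post(f"{base_url}/chat", json=build_chat_request(prompt), timeout=60.0)
    resp.raise_for_status()
    return resp.json()["response"]
```

With the server running, `print(ask("Write a haiku about coding"))` mirrors the curl call above.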
Build Something With It
Slack Bot Example (simplified):
```python
from slack_bolt import App
from slack_bolt.adapter.socket_mode import SocketModeHandler
import httpx

app = App(token="YOUR_BOT_TOKEN")


@app.message(".*")
def handle_message(message, say):
    # Call your local API
    response = httpx.post("http://localhost:8000/chat", json={
        "prompt": message["text"]
    })
    say(response.json()["response"])


if __name__ == "__main__":
    handler = SocketModeHandler(app, "YOUR_APP_TOKEN")
    handler.start()
```
🤖 PROJECT 4: n8n AI Automation (+20% Trend)
Time: 20 minutes • Difficulty: Intermediate • Trend: n8n (+20%!)
What You’ll Build
Automated workflows connecting your local LLM to 300+ apps—Gmail, Slack, Telegram, Google Sheets, databases, and more. This is the hottest trend in the data (+20% growth).
Prerequisites
- ✅ Docker installed (or Node.js)
- ✅ Ollama running
- ✅ n8n account (free, self-hosted)
Step-by-Step
1. Run n8n with Docker
```bash
docker run -it --rm \
  --name n8n \
  -p 5678:5678 \
  -v n8n_data:/home/node/.n8n \
  n8nio/n8n
```
Access at `http://localhost:5678`
2. Add Ollama Credentials
1. Go to Settings → Credentials
2. Click “Add Credential”
3. Choose “Ollama API”
4. Set the Base URL:
   - Mac/Windows: `http://host.docker.internal:11434`
   - Linux: `http://172.17.0.1:11434`
5. Test the connection → Save
3. Build Your First Workflow
Example: Email Summarizer Bot
- Trigger: “On incoming email” (Gmail node)
- Process: Extract the email body
- AI: Ollama node with the prompt: `Summarize this email in 3 bullet points: {{email_body}}`
- Action: Send the summary to Slack/Telegram
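To see what the Ollama node does under the hood, the summarize step can be sketched in plain Python (the helper names are mine; the payload shape follows Ollama’s `/api/generate` endpoint):

```python
def build_summary_prompt(email_body: str) -> str:
    """The prompt template from the workflow, filled in with the email body."""
    return f"Summarize this email in 3 bullet points: {email_body}"


def build_ollama_payload(prompt: str, model: str = "claude-code:7b",
                         temperature: float = 0.3) -> dict:
    """JSON body the Ollama node POSTs to /api/generate."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": temperature},
    }


payload = build_ollama_payload(build_summary_prompt("Meeting moved to 3pm Friday."))
```

n8n builds exactly this kind of request for you; the workflow editor just wires the pieces together visually.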
Workflow JSON (import this):
```json
{
  "nodes": [
    {
      "name": "Gmail Trigger",
      "type": "n8n-nodes-base.gmailTrigger",
      "position": [250, 300]
    },
    {
      "name": "Ollama Summarize",
      "type": "n8n-nodes-base.ollama",
      "position": [450, 300],
      "parameters": {
        "model": "claude-code:7b",
        "prompt": "Summarize this email concisely:\n\n{{$json.body}}",
        "options": {
          "temperature": 0.3
        }
      }
    },
    {
      "name": "Slack",
      "type": "n8n-nodes-base.slack",
      "position": [650, 300],
      "parameters": {
        "channel": "#ai-summaries",
        "text": "📧 Summary:\n{{$node['Ollama Summarize'].json.response}}"
      }
    }
  ]
}
```
5 More n8n Ideas
| Workflow | What It Does |
|---|---|
| Meeting Note Taker | Transcribe Zoom/Meet recordings → Summarize → Send to Notion |
| Support Ticket Classifier | New Zendesk tickets → Categorize → Assign to right team |
| Social Media Responder | Monitor Twitter mentions → Generate replies → Queue for approval |
| Daily News Brief | Scrape RSS feeds → Summarize → Email digest |
| Database Q&A | Query PostgreSQL → Generate reports in plain English |
🐳 PROJECT 5: Docker Deployment (+4% Trend)
Time: 15 minutes • Difficulty: Intermediate • Trend: docker ollama (+4%)
What You’ll Build
A containerized Ollama that you can deploy anywhere—cloud VPS, home server, or scale horizontally.
Why Docker?
- ✅ Portable: Run anywhere (AWS, DigitalOcean, Raspberry Pi)
- ✅ Isolated: No dependency conflicts
- ✅ Scalable: Run multiple models, load balance
- ✅ Backup: Version your models
Step-by-Step
1. Basic Docker Setup
```bash
# Pull the official Ollama image
docker pull ollama/ollama

# Run with a volume for models
docker run -d \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama
```
2. Pull Models Inside Container
```bash
# Enter the container
docker exec -it ollama bash

# Pull models
ollama pull llama3.2
ollama pull claude-code:7b

# Exit
exit
```
3. Docker Compose (Recommended)
Create `docker-compose.yml`:
```yaml
version: '3.8'

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "11434:11434"
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "ollama", "list"]
      interval: 30s
      timeout: 10s
      retries: 3

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    depends_on:
      - ollama
    volumes:
      - open-webui_data:/app/backend/data
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    restart: unless-stopped

volumes:
  ollama_data:
  open-webui_data:
```
Run it:
```bash
docker-compose up -d
```
Access:
- Open WebUI: `http://localhost:3000`
- Ollama API: `http://localhost:11434`
Deploy to Cloud (DigitalOcean Example)
1. Create an Ubuntu droplet (minimum 4GB RAM, 8GB recommended)
2. Install Docker:
   ```bash
   curl -fsSL https://get.docker.com | sh
   sudo usermod -aG docker $USER
   ```
3. Copy `docker-compose.yml` to the server
4. Run:
   ```bash
   docker-compose up -d
   ```
5. Access via your server IP (port 3000)

Security note: add nginx with SSL in front for production.
📄 PROJECT 6: RAG Document Q&A
Time: 30 minutes • Difficulty: Advanced • Trend: Steady, high intent
What You’ll Build
A system that lets you chat with your documents—PDFs, Word files, websites, or codebases—using local LLMs and embeddings.
Prerequisites
- ✅ Ollama running
- ✅ Python 3.9+
- ✅ A model with good context (Claude Code or Qwen 2.5 recommended)
Step-by-Step
1. Install Dependencies
```bash
# Note: PyPDFLoader needs pypdf (not the older, deprecated PyPDF2)
pip install chromadb sentence-transformers pypdf langchain langchain-community
```
2. Create `rag-chat.py`
```python
import os

from langchain.chains import RetrievalQA
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyPDFLoader, TextLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.llms import Ollama
from langchain_community.vectorstores import Chroma

# Configuration
DOCS_FOLDER = "./docs"  # Put your documents here
PERSIST_DIR = "./chroma_db"
MODEL_NAME = "claude-code:7b"
EMBEDDING_MODEL = "all-MiniLM-L6-v2"

# 1. Load documents
print("📚 Loading documents...")
documents = []
for file in os.listdir(DOCS_FOLDER):
    path = os.path.join(DOCS_FOLDER, file)
    if file.endswith(".pdf"):
        documents.extend(PyPDFLoader(path).load())
    elif file.endswith(".txt"):
        documents.extend(TextLoader(path).load())

# 2. Split into chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
chunks = text_splitter.split_documents(documents)
print(f"✅ Created {len(chunks)} chunks")

# 3. Create embeddings
embeddings = HuggingFaceEmbeddings(model_name=EMBEDDING_MODEL)

# 4. Create (or reload) the vector store
if os.path.exists(PERSIST_DIR):
    vectordb = Chroma(persist_directory=PERSIST_DIR, embedding_function=embeddings)
else:
    vectordb = Chroma.from_documents(
        documents=chunks,
        embedding=embeddings,
        persist_directory=PERSIST_DIR  # recent chromadb persists automatically
    )

# 5. Create the QA chain
llm = Ollama(model=MODEL_NAME)
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectordb.as_retriever(search_kwargs={"k": 3}),
    return_source_documents=True
)

# 6. Chat loop
print("\n🤖 Ready! Ask questions about your documents (type 'quit' to exit)\n")
while True:
    query = input("Question: ")
    if query.lower() == "quit":
        break
    result = qa_chain({"query": query})
    print(f"\nAnswer: {result['result']}")
    print(f"Sources: {[doc.metadata['source'] for doc in result['source_documents']]}\n")
```
3. Use It
```bash
# Create a docs folder and add your files
mkdir docs
# Copy your PDFs/txt files into docs/
python rag-chat.py
```
Example
Question: “What were the main findings in chapter 3?”
Answer: “Chapter 3 found that…”
Sources: ['research-paper.pdf']
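Why does the splitter use `chunk_overlap=200`? Consecutive chunks share 200 characters, so a sentence that straddles a chunk boundary still appears intact in at least one chunk. Here is a toy character-level sketch of the idea (the real `RecursiveCharacterTextSplitter` is smarter, since it prefers to break on paragraphs and sentences):

```python
def split_with_overlap(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list:
    """Naive character-level splitter illustrating chunk_size / chunk_overlap."""
    step = chunk_size - chunk_overlap  # each new chunk starts `step` chars after the last
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks


doc = "".join(str(i % 10) for i in range(2500))  # stand-in for a 2,500-char document
chunks = split_with_overlap(doc)
# Chunks start at characters 0, 800, and 1600; each adjacent pair shares 200 chars.
```

A larger overlap preserves more cross-boundary context at the cost of more chunks (and more embedding work).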
🎭 PROJECT 7: Custom Chatbots with SillyTavern
Time: 15 minutes • Difficulty: Easy • Trend: Steady, passionate community
What You’ll Build
A feature-rich chatbot interface for roleplay, character creation, and storytelling—perfect for creative writing, game NPCs, or just having fun with AI.
Prerequisites
- ✅ Ollama or LM Studio running
- ✅ A model (Llama 3.2 or Claude Code works well)
Step-by-Step
1. Install SillyTavern
```bash
git clone https://github.com/SillyTavern/SillyTavern.git
cd SillyTavern
npm install
```
2. Configure for Ollama
Edit `config.yaml`:
```yaml
api:
  type: "ollama"
  url: "http://localhost:11434"
  model: "claude-code:7b"
```
3. Run It
```bash
node server.js
```
Open `http://localhost:8000`
Create Your First Character
- Click “Character Management”
- “Create New Character”
- Fill in:
- Name: “Sherlock Holmes”
- Context: “You are Sherlock Holmes, the world’s greatest detective. You speak in a formal, observant manner and notice tiny details others miss.”
- Example dialogue:
  ```text
  User: What do you see?
  Sherlock: Elementary, my dear user. The faint smudge of ink on your finger suggests you’ve been writing, while the slight wear on your keyboard tells me you’re a frequent typist.
  ```
- Save and start chatting!
Pro Features
- Group chats: Multiple characters talking
- World info: Define lore, settings, rules
- Extensions: Text-to-speech, image generation
- Mobile-friendly: Works on phones
📊 Project Comparison: Which Should You Build First?
| Project | Time | Difficulty | Best For | Trend |
|---|---|---|---|---|
| 1. Open WebUI | 10 min | Easy | ChatGPT-like interface | High intent |
| 2. VS Code AI | 5 min | Easy | Developers, coders | Steady |
| 3. Python API | 15 min | Intermediate | App builders | ollama python |
| 4. n8n Automation | 20 min | Intermediate | Automation lovers | +20% 🚀 |
| 5. Docker | 15 min | Intermediate | Deployers | +4% 📈 |
| 6. RAG Q&A | 30 min | Advanced | Document processors | Steady |
| 7. SillyTavern | 15 min | Easy | Creatives, roleplay | Passionate |
Frequently Asked Questions
Q: Do these projects work with LM Studio too?
A: Most do! For Open WebUI and n8n, point them to LM Studio’s local API (usually http://localhost:1234). For VS Code, Continue works with LM Studio’s OpenAI-compatible mode.
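For code, the main difference is the request shape: Ollama’s `/api/generate` takes a flat `prompt`, while LM Studio serves the OpenAI-compatible `/v1/chat/completions` format with a `messages` list. A translation sketch (the `ollama_to_openai` helper name is mine; the port is the default mentioned above):

```python
def ollama_to_openai(ollama_req: dict) -> dict:
    """Translate an Ollama /api/generate-style request into an
    OpenAI-compatible /v1/chat/completions body (what LM Studio serves)."""
    messages = []
    if ollama_req.get("system"):
        messages.append({"role": "system", "content": ollama_req["system"]})
    messages.append({"role": "user", "content": ollama_req["prompt"]})
    return {
        "model": ollama_req["model"],
        "messages": messages,
        "temperature": ollama_req.get("options", {}).get("temperature", 0.7),
    }


body = ollama_to_openai({
    "model": "llama3.2",
    "prompt": "Hello",
    "system": "You are helpful.",
    "options": {"temperature": 0.3},
})
# POST this body to http://localhost:1234/v1/chat/completions
```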
Q: I’m not a developer—which project should I start with?
A: Project 1 (Open WebUI) is the most user-friendly. It’s a beautiful interface that requires zero coding.
Q: Can I run multiple projects at once?
A: Yes! Ollama can serve models to many clients simultaneously. Your API, Open WebUI, and n8n can all connect to the same Ollama instance.
Q: How much RAM do I need for these projects?
A:
- Basic: 8GB RAM → Open WebUI + small model
- Advanced: 16GB+ RAM → Multiple projects + larger models
- Production: 32GB+ RAM → Docker + RAG + API
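As a back-of-envelope check (a rule of thumb, not an official figure): quantized model weights take roughly parameters × bits ÷ 8 bytes, plus some overhead for the runtime and KV cache, which varies with context length:

```python
def estimate_model_ram_gb(params_billion: float, quant_bits: int = 4,
                          overhead_gb: float = 1.5) -> float:
    """Rough RAM estimate: quantized weights plus a fixed runtime overhead."""
    weights_gb = params_billion * quant_bits / 8  # e.g. 7B at 4-bit ≈ 3.5 GB of weights
    return round(weights_gb + overhead_gb, 1)
```

For example, `estimate_model_ram_gb(7, 4)` gives about 5.0 GB, which is why a 7B model at 4-bit quantization fits the 8GB “Basic” tier, while `estimate_model_ram_gb(70, 4)` gives about 36.5 GB, pushing a 70B model beyond even 32GB.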
Q: I’m stuck on a project. Where can I get help?
A:
- Ollama: GitHub Discussions
- Open WebUI: Discord
- n8n: Community Forum
- Continue: Discord