What to Build with Local LLMs: 7 Practical Projects for Ollama & LM Studio

*From local ChatGPT to automated workflows • Complete step-by-step guides • Updated Feb 2026*


If you’ve already installed Ollama or LM Studio and picked a model, you might be wondering: “What do I actually build with this thing?”

You’re not alone. Searches tell us exactly what users want next:

| Search Query | Trend | What It Means |
| --- | --- | --- |
| ollama api | -20%* | “How do I connect this to my apps?” |
| ollama python | -10%* | Developers want code examples |
| docker ollama | +4% | “Can I deploy this to the cloud?” |
| ollama webui | -20%* | “Give me a ChatGPT-like interface” |
| open webui | -20%* | Same story, different spelling |
| n8n | +20% | “Automate my workflows with AI” |

*Note: Negative trends don’t mean low interest—they mean search volume normalized after earlier peaks. These are still highly active, high-intent queries.

This guide gives you 7 ready-to-build projects, from simplest (5 minutes) to most advanced (containerized deployment). Each includes:

  • ✅ Exact code/commands
  • ✅ Prerequisites
  • ✅ Expected outcome
  • ✅ Next steps to customize

🧭 Quick Start: Which Project Fits You?

text

What's your goal?
│
├── 🖥️ "I want a ChatGPT-like interface"
│   → Project 1: Local ChatGPT (Open WebUI)
│
├── 💻 "I want AI in VS Code"
│   → Project 2: VS Code Autocomplete (Continue.dev)
│
├── 🔧 "I want to build apps with the API"
│   → Project 3: Python API + FastAPI
│
├── 🤖 "I want to automate workflows"
│   → Project 4: n8n AI Automation (+20% trend!)
│
├── 🐳 "I want to deploy to the cloud"
│   → Project 5: Docker Deployment (+4% trend)
│
├── 📄 "I want to chat with my documents"
│   → Project 6: RAG Document Q&A
│
└── 🎭 "I want roleplay/custom chatbots"
    → Project 7: Custom Chatbots with SillyTavern

🏗️ PROJECT 1: Local ChatGPT (Open WebUI)

Time: 10 minutes • Difficulty: Easy • Trend: ollama webui (steady high intent)

What You’ll Build

A ChatGPT-like web interface that runs completely locally—no internet required after setup. Multiple chats, model switching, and a beautiful UI.

Prerequisites

  • ✅ Ollama installed (Setup Guide)
  • ✅ At least one model pulled (e.g., llama3.2, qwen2.5-coder:7b)

Step-by-Step

Option A: Docker (Easiest)

bash

# Run Open WebUI with one command
# (--add-host lets the container reach Ollama on the host machine)
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main

Option B: Native Install (pip)

bash

# Install from PyPI (requires Python 3.11)
pip install open-webui

# Run it
open-webui serve

Connect to Ollama

  1. Open browser to http://localhost:3000
  2. Go to Settings → Connections
  3. Set Ollama URL to: http://localhost:11434 (for the Docker install above, use http://host.docker.internal:11434)
  4. Click “Save”
  5. Start chatting! 🎉
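If the model list stays empty, it helps to check what Ollama is actually serving. A minimal Python sketch, standard library only; it assumes Ollama's default `/api/tags` endpoint on port 11434, and `list_local_models` is an illustrative helper name:

```python
import json
import urllib.request

def extract_model_names(tags_payload: dict) -> list[str]:
    """Pull model names out of Ollama's /api/tags response payload."""
    return [m["name"] for m in tags_payload.get("models", [])]

def list_local_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Ask a running Ollama server which models it has pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        return extract_model_names(json.load(resp))
```

With Ollama running, `list_local_models()` should return names like `['llama3.2:latest']`; an empty list means you still need to `ollama pull` a model.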

What You’ll See

text

[Model Selector] ▼
┌─────────────────────────────────┐
│                                 │
│  Hello! How can I help today?   │
│                                 │
├─────────────────────────────────┤
│ Write a Python function...      │
└─────────────────────────────────┘

Next Steps

  • Enable multi-user support (Settings → Users)
  • Add RAG for document uploads (Project 6)
  • Customize with themes

🖥️ PROJECT 2: VS Code AI Autocomplete (Continue.dev)

Time: 5 minutes • Difficulty: Easy • Trend: ollama python (steady)

What You’ll Build

A free alternative to GitHub Copilot that uses your local models for code autocomplete, chat, and refactoring—all inside VS Code.

Prerequisites

  • ✅ VS Code installed
  • ✅ Ollama running
  • ✅ A coding model (qwen2.5-coder:7b or qwen2.5:7b recommended)

Step-by-Step

  1. Install the Continue extension
    In VS Code, open Extensions (Ctrl+Shift+X), search for “Continue”, and click Install.
  2. Configure Continue
    Create or edit ~/.continue/config.json:

json

{
  "models": [
    {
      "title": "Qwen 2.5 Coder",
      "provider": "ollama",
      "model": "qwen2.5-coder:7b"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Qwen 2.5",
    "provider": "ollama",
    "model": "qwen2.5:7b"
  }
}

  3. Start coding!
    • Autocomplete: Just start typing—suggestions appear
    • Chat: Cmd+L (Mac) or Ctrl+L (Windows)
    • Edit: Highlight code → Cmd+I → “Make this faster”

Real Example

python

# Type this:
def calculate_fibonacci

# Continue suggests automatically:
def calculate_fibonacci(n: int) -> list:
    """Return first n Fibonacci numbers."""
    if n <= 0:
        return []
    elif n == 1:
        return [0]
    
    fib = [0, 1]
    for i in range(2, n):
        fib.append(fib[i-1] + fib[i-2])
    return fib[:n]

Pro Tips

  • Use Qwen 2.5 Coder 7B for chat/editing
  • Use Qwen 2.5 7B for autocomplete (faster)
  • Disable in settings.json: "continue.enableTabAutocomplete": false

🔧 PROJECT 3: Build a Local API (Python + FastAPI)

Time: 15 minutes • Difficulty: Intermediate • Trend: ollama api, ollama python

What You’ll Build

A REST API that serves your local models to any app—web apps, mobile apps, Slack bots, or internal tools.

Prerequisites

  • ✅ Python 3.9+ installed
  • ✅ Ollama running
  • ✅ Basic Python knowledge

Step-by-Step

1. Set up the project

bash

mkdir local-llm-api
cd local-llm-api
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install fastapi uvicorn httpx python-dotenv

2. Create main.py

python

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import Optional
import httpx
import os
from dotenv import load_dotenv

load_dotenv()

app = FastAPI(title="Local LLM API")

OLLAMA_URL = os.getenv("OLLAMA_URL", "http://localhost:11434")
DEFAULT_MODEL = os.getenv("DEFAULT_MODEL", "llama3.2")

class ChatRequest(BaseModel):
    prompt: str
    model: str = DEFAULT_MODEL
    system: str = "You are a helpful assistant."
    temperature: float = 0.7

class ChatResponse(BaseModel):
    response: str
    model: str
    tokens: Optional[int] = None  # eval_count may be absent from Ollama's reply

@app.get("/")
def root():
    return {"message": "Local LLM API", "docs": "/docs"}

@app.get("/models")
async def list_models():
    """List available models from Ollama"""
    async with httpx.AsyncClient() as client:
        response = await client.get(f"{OLLAMA_URL}/api/tags")
        return response.json()

@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    """Generate a response from the local LLM"""
    payload = {
        "model": request.model,
        "prompt": request.prompt,
        "system": request.system,
        "stream": False,
        "options": {
            "temperature": request.temperature
        }
    }
    
    try:
        async with httpx.AsyncClient(timeout=60.0) as client:
            response = await client.post(
                f"{OLLAMA_URL}/api/generate", 
                json=payload
            )
            result = response.json()
            
            return ChatResponse(
                response=result["response"],
                model=request.model,
                tokens=result.get("eval_count")
            )
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)

3. Create .env

text

OLLAMA_URL=http://localhost:11434
DEFAULT_MODEL=llama3.2

4. Run it

bash

python main.py

Test Your API

List models:

bash

curl http://localhost:8000/models

Chat:

bash

curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Write a haiku about coding", "model": "llama3.2"}'
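The same call from Python, using only the standard library so no extra installs are needed. This is a sketch against the main.py endpoint above; `build_chat_request` and `ask` are illustrative helper names, not part of the API:

```python
import json
import urllib.request

def build_chat_request(prompt: str, model: str = "llama3.2",
                       temperature: float = 0.7) -> dict:
    """Mirror the ChatRequest body that the /chat endpoint expects."""
    return {"prompt": prompt, "model": model, "temperature": temperature}

def ask(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """POST a prompt to the local API and return just the generated text."""
    req = urllib.request.Request(
        f"{base_url}/chat",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)["response"]
```

With the API running, `ask("Write a haiku about coding")` returns the model's reply as a plain string, which makes it easy to drop into any script.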

Build Something With It

Slack Bot Example (simplified):

python

from slack_bolt import App
from slack_bolt.adapter.socket_mode import SocketModeHandler
import httpx

app = App(token="YOUR_BOT_TOKEN")

@app.message(".*")
def handle_message(message, say):
    # Call your local API
    response = httpx.post("http://localhost:8000/chat", json={
        "prompt": message["text"]
    }, timeout=60.0)  # local models can take well over httpx's 5-second default
    say(response.json()["response"])

if __name__ == "__main__":
    handler = SocketModeHandler(app, "YOUR_APP_TOKEN")
    handler.start()

🤖 PROJECT 4: n8n AI Automation (+20% Trend)

Time: 20 minutes • Difficulty: Intermediate • Trend: n8n (+20%!)

What You’ll Build

Automated workflows connecting your local LLM to 300+ apps—Gmail, Slack, Telegram, Google Sheets, databases, and more. This is the hottest trend in the data (+20% growth).

Prerequisites

  • ✅ Docker installed (or Node.js)
  • ✅ Ollama running
  • ✅ n8n account (free, self-hosted)

Step-by-Step

1. Run n8n with Docker

bash

docker run -it --rm \
  --name n8n \
  -p 5678:5678 \
  -v n8n_data:/home/node/.n8n \
  n8nio/n8n

Access at http://localhost:5678

2. Add Ollama Credentials

  • Go to SettingsCredentials
  • Click “Add Credential”
  • Choose “Ollama API”
  • Set:
    • Base URL: http://host.docker.internal:11434 (Mac/Windows)
    • Base URL (Linux): http://172.17.0.1:11434
  • Test connection → Save

3. Build Your First Workflow

Example: Email Summarizer Bot

  1. Trigger: “On incoming email” (Gmail node)
  2. Process: Extract email body
  3. AI: Ollama node with this prompt:

text

Summarize this email in 3 bullet points:

{{ $json.body }}

  4. Action: Send summary to Slack/Telegram
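Under the hood, n8n resolves the {{ ... }} expressions before the prompt ever reaches Ollama. A rough Python analogy of that substitution step (not n8n's actual expression engine, just the idea):

```python
import re

def render_prompt(template: str, data: dict) -> str:
    """Replace {{ $json.field }} placeholders with values from a data dict."""
    def sub(match: re.Match) -> str:
        field = match.group(1)
        return str(data.get(field, ""))
    return re.sub(r"\{\{\s*\$json\.(\w+)\s*\}\}", sub, template)

template = "Summarize this email in 3 bullet points:\n\n{{ $json.body }}"
print(render_prompt(template, {"body": "Meeting moved to 3pm Friday."}))
```

Each node receives the previous node's output as `$json`, which is why the Ollama node can reference the Gmail trigger's email body directly.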

Workflow JSON (import this):

json

{
  "nodes": [
    {
      "name": "Gmail Trigger",
      "type": "n8n-nodes-base.gmailTrigger",
      "position": [250, 300]
    },
    {
      "name": "Ollama Summarize",
      "type": "n8n-nodes-base.ollama",
      "position": [450, 300],
      "parameters": {
        "model": "llama3.2",
        "prompt": "Summarize this email concisely:\n\n{{$json.body}}",
        "options": {
          "temperature": 0.3
        }
      }
    },
    {
      "name": "Slack",
      "type": "n8n-nodes-base.slack",
      "position": [650, 300],
      "parameters": {
        "channel": "#ai-summaries",
        "text": "📧 Summary:\n{{$node['Ollama Summarize'].json.response}}"
      }
    }
  ]
}

5 More n8n Ideas

| Workflow | What It Does |
| --- | --- |
| Meeting Note Taker | Transcribe Zoom/Meet recordings → Summarize → Send to Notion |
| Support Ticket Classifier | New Zendesk tickets → Categorize → Assign to the right team |
| Social Media Responder | Monitor Twitter mentions → Generate replies → Queue for approval |
| Daily News Brief | Scrape RSS feeds → Summarize → Email digest |
| Database Q&A | Query PostgreSQL → Generate reports in plain English |

🐳 PROJECT 5: Docker Deployment (+4% Trend)

Time: 15 minutes • Difficulty: Intermediate • Trend: docker ollama (+4%)

What You’ll Build

A containerized Ollama that you can deploy anywhere—cloud VPS, home server, or scale horizontally.

Why Docker?

  • Portable: Run anywhere (AWS, DigitalOcean, Raspberry Pi)
  • Isolated: No dependency conflicts
  • Scalable: Run multiple models, load balance
  • Backup: Version your models

Step-by-Step

1. Basic Docker Setup

bash

# Pull official Ollama image
docker pull ollama/ollama

# Run with volume for models
docker run -d \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama

2. Pull Models Inside Container

bash

# Enter container
docker exec -it ollama bash

# Pull models
ollama pull llama3.2
ollama pull qwen2.5:7b

# Exit
exit

3. Docker Compose (Recommended)

Create docker-compose.yml:

yaml


services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "11434:11434"
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "ollama", "list"]
      interval: 30s
      timeout: 10s
      retries: 3

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    depends_on:
      - ollama
    volumes:
      - open-webui_data:/app/backend/data
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    restart: unless-stopped

volumes:
  ollama_data:
  open-webui_data:

Run it:

bash

docker-compose up -d

Access:

  • Open WebUI: http://localhost:3000
  • Ollama API: http://localhost:11434

Deploy to Cloud (DigitalOcean Example)

  1. Create Ubuntu droplet (minimum 4GB RAM, 8GB recommended)
  2. Install Docker:

bash

curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER

  3. Copy docker-compose.yml to the server
  4. Run:

bash

docker-compose up -d

  5. Access via your server IP (port 3000)
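Once the droplet is up, a quick way to verify both ports are reachable before opening a browser. A small sketch; `YOUR_SERVER_IP` is a placeholder for your droplet's address:

```python
import socket

def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

server = "YOUR_SERVER_IP"  # placeholder: your droplet's address
for name, port in [("Open WebUI", 3000), ("Ollama API", 11434)]:
    status = "up" if is_port_open(server, port) else "unreachable"
    print(f"{name} ({port}): {status}")
```

If Open WebUI reports unreachable, check the droplet's firewall (ufw) and that `docker-compose up -d` actually started both containers (`docker ps`).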

Security note: Add SSL/nginx for production.


📄 PROJECT 6: RAG Document Q&A

Time: 30 minutes • Difficulty: Advanced • Trend: Steady, high intent

What You’ll Build

A system that lets you chat with your documents—PDFs, Word files, websites, or codebases—using local LLMs and embeddings.

Prerequisites

  • ✅ Ollama running
  • ✅ Python 3.9+
  • ✅ A model with good context (Qwen 2.5 or Llama 3.2 recommended)

Step-by-Step

1. Install Dependencies

bash

pip install chromadb sentence-transformers pypdf langchain langchain-community

2. Create rag-chat.py

python

import os
from langchain_community.document_loaders import PyPDFLoader, TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_community.llms import Ollama
from langchain.chains import RetrievalQA

# Configuration
DOCS_FOLDER = "./docs"  # Put your documents here
PERSIST_DIR = "./chroma_db"
MODEL_NAME = "qwen2.5:7b"
EMBEDDING_MODEL = "all-MiniLM-L6-v2"

# 1. Load documents
print("📚 Loading documents...")
documents = []
for file in os.listdir(DOCS_FOLDER):
    if file.endswith('.pdf'):
        loader = PyPDFLoader(os.path.join(DOCS_FOLDER, file))
        documents.extend(loader.load())
    elif file.endswith('.txt'):
        loader = TextLoader(os.path.join(DOCS_FOLDER, file))
        documents.extend(loader.load())

# 2. Split into chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
chunks = text_splitter.split_documents(documents)
print(f"✅ Created {len(chunks)} chunks")

# 3. Create embeddings
embeddings = HuggingFaceEmbeddings(model_name=EMBEDDING_MODEL)

# 4. Create (or reload) the vector store
if os.path.exists(PERSIST_DIR):
    vectordb = Chroma(persist_directory=PERSIST_DIR, embedding_function=embeddings)
else:
    vectordb = Chroma.from_documents(
        documents=chunks,
        embedding=embeddings,
        persist_directory=PERSIST_DIR
    )
    # Chroma 0.4+ persists automatically; no explicit persist() call is needed

# 5. Create QA chain
llm = Ollama(model=MODEL_NAME)
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectordb.as_retriever(search_kwargs={"k": 3}),
    return_source_documents=True
)

# 6. Chat loop
print("\n🤖 Ready! Ask questions about your documents (type 'quit' to exit)\n")
while True:
    query = input("Question: ")
    if query.lower() == 'quit':
        break
    
    result = qa_chain.invoke({"query": query})
    print(f"\nAnswer: {result['result']}")
    print(f"Sources: {[doc.metadata['source'] for doc in result['source_documents']]}\n")

3. Use It

bash

# Create docs folder and add your files
mkdir docs
# Copy your PDFs/txt files into docs/

python rag-chat.py

Example

Question: “What were the main findings in chapter 3?”
Answer: “Chapter 3 found that…”
Sources: ['research-paper.pdf']
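The quality of answers depends heavily on the chunking step. Stripped of LangChain, the sliding-window idea behind chunk_size and chunk_overlap looks roughly like this (a simplified sketch; the real RecursiveCharacterTextSplitter also prefers to break at paragraph and sentence boundaries):

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Naive sliding-window splitter: fixed-size chunks sharing `overlap` chars."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks

# A 2500-char document with 1000-char chunks stepping by 800
print(len(split_text("x" * 2500, chunk_size=1000, overlap=200)))
```

The overlap is what keeps a sentence that straddles a chunk boundary retrievable from both sides; too little overlap loses context at the seams, too much inflates the vector store.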


🎭 PROJECT 7: Custom Chatbots with SillyTavern

Time: 15 minutes • Difficulty: Easy • Trend: Steady, passionate community

What You’ll Build

A feature-rich chatbot interface for roleplay, character creation, and storytelling—perfect for creative writing, game NPCs, or just having fun with AI.

Prerequisites

  • ✅ Ollama or LM Studio running
  • ✅ A model (Llama 3.2 or Qwen 2.5 works well)

Step-by-Step

1. Install SillyTavern

bash

git clone https://github.com/SillyTavern/SillyTavern.git
cd SillyTavern
npm install

2. Run It

bash

node server.js

Open http://localhost:8000

3. Connect to Ollama

SillyTavern is configured from its web UI rather than a config file:

  1. Click the plug icon (API Connections) in the top bar
  2. Set API to Text Completion and API Type to Ollama
  3. Set the server URL to http://localhost:11434
  4. Pick your model and click “Connect”

Create Your First Character

  1. Click “Character Management”
  2. “Create New Character”
  3. Fill in:
    • Name: “Sherlock Holmes”
    • Context: “You are Sherlock Holmes, the world’s greatest detective. You speak in a formal, observant manner and notice tiny details others miss.”
    • Example dialogue:

text

User: What do you see?
Sherlock: Elementary, my dear user. The faint smudge of ink on your finger suggests you’ve been writing, while the slight wear on your keyboard tells me you’re a frequent typist.
  4. Save and start chatting!

Pro Features

  • Group chats: Multiple characters talking
  • World info: Define lore, settings, rules
  • Extensions: Text-to-speech, image generation
  • Mobile-friendly: Works on phones

📊 Project Comparison: Which Should You Build First?

| Project | Time | Difficulty | Best For | Trend |
| --- | --- | --- | --- | --- |
| 1. Open WebUI | 10 min | Easy | ChatGPT-like interface | High intent |
| 2. VS Code AI | 5 min | Easy | Developers, coders | Steady |
| 3. Python API | 15 min | Intermediate | App builders | ollama python |
| 4. n8n Automation | 20 min | Intermediate | Automation lovers | +20% 🚀 |
| 5. Docker | 15 min | Intermediate | Deployers | +4% 📈 |
| 6. RAG Q&A | 30 min | Advanced | Document processors | Steady |
| 7. SillyTavern | 15 min | Easy | Creatives, roleplay | Passionate |

Frequently Asked Questions

Q: Do these projects work with LM Studio too?
A: Most do! For Open WebUI and n8n, point them to LM Studio’s local API (usually http://localhost:1234). For VS Code, Continue works with LM Studio’s OpenAI-compatible mode.

Q: I’m not a developer—which project should I start with?
A: Project 1 (Open WebUI) is the most user-friendly. It’s a beautiful interface that requires zero coding.

Q: Can I run multiple projects at once?
A: Yes! Ollama can serve models to many clients simultaneously. Your API, Open WebUI, and n8n can all connect to the same Ollama instance.

Q: How much RAM do I need for these projects?
A:

  • Basic: 8GB RAM → Open WebUI + small model
  • Advanced: 16GB+ RAM → Multiple projects + larger models
  • Production: 32GB+ RAM → Docker + RAG + API

Q: I’m stuck on a project. Where can I get help?
A: