Local LLM Setup Guide

Run AI completely free on your own computer

Benefits

  • $0 Zero Costs - No per-request API fees
  • 🔒 Privacy - All processing stays on your machine
  • 📴 Offline Capable - No internet required
  • ♾️ No Rate Limits - Process as much as you want

Trade-offs

  • Slower than cloud APIs (5-10x longer processing)
  • Requires decent hardware (GPU with 8GB+ VRAM recommended)
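
Not sure how much VRAM you have? On NVIDIA GPUs the driver ships with nvidia-smi, which reports it from Command Prompt or PowerShell:

# Report GPU name and total VRAM (NVIDIA GPUs only)
nvidia-smi --query-gpu=name,memory.total --format=csv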

Option 1: Ollama (Recommended)

Ollama is the easiest way to run local LLMs. It just works, runs as a lightweight background service, and takes up almost no resources until you actually need it.

Step 1: Download and Install

  1. Visit ollama.com
  2. Download Ollama for Windows
  3. Install - Ollama starts automatically as a background service
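
To confirm the install, open Command Prompt or PowerShell and check the version. If the background service is running, the command completes without a connection warning:

# Verify Ollama is installed and the service is reachable
ollama --version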

Step 2: Download a Model

Option A: Install from Loki Studio (Easiest)

  1. Open Loki Studio
  2. Go to Settings > Application Settings > LLM Providers
  3. Select Ollama as provider
  4. Click Install next to any model (llama3, mistral, qwen, etc.)
  5. Wait for download to complete (progress shown)
  6. Click Refresh to see installed models

Option B: Install via Command Line

Open Command Prompt or PowerShell and run:

# Best balance of quality and speed (8GB VRAM)
ollama pull llama3:8b

# Faster, smaller footprint (4GB VRAM)
ollama pull llama3.2:3b

# Good alternative (8GB VRAM)
ollama pull mistral:7b
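
Once a pull finishes, a quick sanity check from the same terminal confirms the model works (swap in whichever model you pulled):

# Confirm the model appears in the installed list
ollama list

# Run a one-off prompt to make sure generation works
ollama run llama3:8b "Say hello in one sentence."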

Step 3: Configure Loki Studio

  1. Open Loki Studio
  2. Go to Settings > Application Settings > LLM Providers
  3. Set AI Provider to Ollama
  4. Select your model from the dropdown (auto-detected)
  5. Settings auto-save!

That's it! Ollama runs automatically in the background whenever you need it.
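
If you ever need to verify that Loki Studio can reach Ollama, you can query its local API directly. By default the service listens on port 11434, and /api/tags returns the installed models as JSON:

# List the models served by Ollama (default port 11434)
curl http://localhost:11434/api/tags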

Option 2: LM Studio

LM Studio is a desktop application with a nice UI and model browser. It gives you more control over model settings.

Important: LM Studio requires you to manually start its server each time you want to use it. If you want something that "just works," use Ollama instead.

Step 1: Download and Install

  1. Visit lmstudio.ai
  2. Download LM Studio for Windows
  3. Install and launch the application

Step 2: Download a Model

  1. Open LM Studio
  2. Go to the Search tab (magnifying glass icon)
  3. Search for and download a model:
    • Qwen2.5-7B-Instruct - Best balance (~8GB VRAM)
    • Llama-3.2-3B-Instruct - Faster (~4GB VRAM)
    • Mistral-7B-Instruct - Good alternative (~8GB VRAM)
    • Phi-3-mini - Very fast (~2GB VRAM)
  4. Wait for download to complete

Step 3: Start the Server (Every Time!)

  1. Go to the Local Server tab (leftmost icon)
  2. Select your downloaded model from the dropdown
  3. Click Start Server
  4. Keep LM Studio running while using Loki Studio

Server runs at: http://localhost:1234
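
Before configuring Loki Studio, you can confirm the server is actually reachable. LM Studio exposes an OpenAI-compatible API, so listing the loaded models is a one-liner:

# Should return the model(s) currently loaded in LM Studio
curl http://localhost:1234/v1/models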

Step 4: Configure Loki Studio

  1. Open Loki Studio
  2. Go to Settings > Application Settings > LLM Providers
  3. Set AI Provider to LM Studio
  4. Set LM Studio Endpoint to http://localhost:1234
  5. Select your model from the dropdown
  6. Click Save Settings
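
For an end-to-end check, you can send a test request to the server's OpenAI-compatible chat endpoint. This one-liner is written for Command Prompt (hence the escaped quotes), and the model name is a placeholder - use the name LM Studio shows for your model:

# Minimal chat test (replace the model name with your own)
curl http://localhost:1234/v1/chat/completions -H "Content-Type: application/json" -d "{\"model\": \"qwen2.5-7b-instruct\", \"messages\": [{\"role\": \"user\", \"content\": \"Say hello\"}]}"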

Model Recommendations

Model        | VRAM | Speed     | Quality   | Best For
Qwen2.5-7B   | 8GB  | Medium    | Excellent | Daily use, best balance
Llama-3.2-3B | 4GB  | Fast      | Good      | Budget GPUs
Mistral-7B   | 8GB  | Medium    | Very Good | Alternative to Qwen
Phi-3-mini   | 2GB  | Very Fast | Decent    | Older GPUs, testing

Troubleshooting

"Connection refused" or timeout errors

  • Ollama: Check service is running with ollama list
  • LM Studio: Make sure server is started (green indicator)
  • Verify endpoint URL matches what's in Loki Studio settings

"Model not found"

  • Ollama: Run ollama list to see installed models
  • LM Studio: Model must be loaded in Local Server tab
  • Model name is case-sensitive - check spelling

Very slow generation

  • Check GPU is being used (Task Manager > Performance > GPU)
  • Try a smaller model (3B instead of 7B)
  • Close other GPU-intensive applications
  • Update GPU drivers
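
If you use Ollama, recent versions can report this directly: ollama ps lists the loaded models and whether each one is running on the GPU, the CPU, or split between them:

# Show loaded models and their GPU/CPU placement
ollama ps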

Cost Comparison

Method             | Cost per Video | Notes
Local LLM          | $0.00          | Free after initial setup
OpenAI GPT-4o-mini | ~$0.01-0.05    | Fast, high quality
OpenAI GPT-4o      | ~$0.10-0.50    | Premium quality

Pro Tip

Process videos overnight with local LLMs. Queue a batch before bed and wake up to fully processed content, all at zero cost. At cloud rates of ~$0.10-0.50 per video (GPT-4o, see the table above), 50-100 videos saves roughly $5-50 in API fees, so the one-time setup effort pays for itself quickly.

See also: Remote LLM Setup for cloud-based AI options
