Grimnir - Built-in LLM

Zero-configuration AI that just works


What is Grimnir?

Grimnir (named after Odin's "Masked One" persona) is Loki Studio's built-in AI engine. It runs large language models directly on your computer using your GPU.

No API keys. No external services. Just download a model and go.

  • Zero Cost - no API fees, ever
  • Complete Privacy - all processing stays local
  • Offline Capable - works without an internet connection
  • GPU Accelerated - fast inference

Quick Start

Step 1: Download a Model

  1. Go to Application Settings > LLM Providers
  2. Select Grimnir (Built-in LLM) mode
  3. Click any Download button to get a model
  4. Wait for download to complete

Step 2: Select Your Model

The downloaded model will appear in the dropdown. Select it and you're ready to go!

Step 3: Use It

Go to the Metadata tab and generate titles, descriptions, tags - all powered by your local model. No API keys or external services needed.

Available Models

Model             Size    VRAM    Best For
llama3.2:3b       ~2GB    4GB     Fast, good for most tasks
qwen2.5-coder:7b  ~5GB    8GB     Code/technical content
mistral:7b        ~4GB    8GB     High-quality alternative
phi3:mini         ~2GB    4GB     Smallest, fastest
gpt-oss:20b       ~12GB   16GB+   Highest quality
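The sizes in the table follow a simple rule of thumb: a quantized model's file size is roughly its parameter count times the quantization's bits per weight, divided by eight. A minimal sketch (the bits-per-weight figures are approximate averages for illustration, not exact llama.cpp numbers):

```python
# Rough on-disk size of a quantized GGUF model.
# Bits-per-weight values are approximate averages for each quantization
# (illustrative assumption; real files vary slightly).
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q5_K_M": 5.69, "Q8_0": 8.5}

def gguf_size_gb(params_billions: float, quant: str) -> float:
    """Estimate model file size in GB from parameter count and quantization."""
    return params_billions * BITS_PER_WEIGHT[quant] / 8.0

# A 7B model at Q4_K_M comes out around 4 GB, in line with the table above.
print(f"{gguf_size_gb(7, 'Q4_K_M'):.1f} GB")
```

The same estimate explains why a 3B model lands near 2GB while a 7B model needs 4-5GB.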

VRAM Requirements

Your VRAM   Recommended Models
4GB         phi3:mini, llama3.2:3b
8GB         qwen2.5-coder:7b, mistral:7b
12GB        DeepSeek-R1-Distill-Qwen-7B
16GB+       gpt-oss:20b (best quality)
24GB        Room for the largest models, or several at once

Grimnir displays your available VRAM in the settings panel.
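The table above boils down to one check: the model file must fit in VRAM with headroom left for the KV cache and activations. A minimal sketch, assuming roughly 1.5 GB of runtime overhead (an illustrative figure; actual overhead depends on context length):

```python
def fits_in_vram(model_size_gb: float, vram_gb: float,
                 overhead_gb: float = 1.5) -> bool:
    """True if the model plus runtime overhead (KV cache, activations) fits.

    The 1.5 GB default overhead is a rough assumption, not a measured value.
    """
    return model_size_gb + overhead_gb <= vram_gb

print(fits_in_vram(2.0, 4.0))    # llama3.2:3b (~2GB) on a 4GB card
print(fits_in_vram(12.0, 8.0))   # gpt-oss:20b (~12GB) on an 8GB card
```

If the check fails, the fixes are the same ones listed under Troubleshooting: pick a smaller model or a lower-bit quantization.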

Using Custom Models (Advanced)

Warning: Use custom models at your own risk. Only the built-in download buttons have been tested; custom models may crash, produce poor results, or fail to load.

You can use any GGUF model file:

  1. Download a .gguf file from Hugging Face
  2. Click Browse... in the Grimnir settings
  3. Select your .gguf file
  4. The model path will be saved automatically

Recommended Quantizations:

  • DO use: Q4_K_M, Q5_K_M (stable, good quality), Q8_0 (highest quality)
  • AVOID: IQ2, IQ3 quantization (unstable, may crash)
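Before pointing Grimnir at a custom file, a cheap sanity check can catch truncated downloads and mislabeled files: every valid GGUF file starts with the 4-byte magic `GGUF`. A small validator sketch (the function name is ours for illustration, not part of Loki Studio):

```python
def looks_like_gguf(path: str) -> bool:
    """Check the 4-byte GGUF magic header before trying to load a model.

    Catches truncated downloads and wrong formats (e.g. a safetensors
    file renamed to .gguf); it does not validate the rest of the file.
    """
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"
```

A file that fails this check will never load, no matter how much VRAM you have; re-download it or convert it to GGUF first.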

Recommended Sources:

  • Hugging Face - search for GGUF conversions of the model you want

Grimnir vs. Other Options

Feature    Grimnir    Ollama       LM Studio    Cloud APIs
Setup      None       Install app  Install app  Get API key
Cost       Free       Free         Free         Pay per use
Speed      Fast       Fast         Fast         Fastest
Privacy    Complete   Local        Local        Data sent to cloud
Offline    Yes        Yes          Yes          No
Quality    Good       Good         Good         Best

Troubleshooting

Model fails to load

  • Insufficient VRAM - Try a smaller model
  • File corrupted - Re-download the model
  • Wrong format - Grimnir only supports .gguf files

Very slow generation

  • Check GPU is being used (not CPU fallback)
  • Close other GPU-intensive applications
  • Try a smaller/faster model
  • Update your GPU drivers

Out of memory

  • Use a smaller model (3B instead of 7B)
  • Close other applications
  • Try Q4 quantized versions instead of Q8

Technical Details

Grimnir is powered by llama.cpp, the industry-standard engine for running LLMs locally. It supports CUDA GPU acceleration (NVIDIA), Flash Attention for speed, automatic chat template detection, and temperature/sampling controls.

The engine (internally called "Skuld") is built into Loki Studio - no external dependencies required.
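For readers who want to experiment with the same engine outside Loki Studio, the third-party llama-cpp-python bindings expose the equivalent llama.cpp knobs (GPU offload, flash attention, temperature). This sketch illustrates llama.cpp's API surface, not Loki Studio's internal code; `model.gguf` is a placeholder path, and running it requires a downloaded model:

```python
# Sketch using the third-party llama-cpp-python bindings
# (pip install llama-cpp-python). Not Loki Studio code.
from llama_cpp import Llama

llm = Llama(
    model_path="model.gguf",  # placeholder: any downloaded GGUF model
    n_gpu_layers=-1,          # offload all layers to the GPU (CUDA build)
    flash_attn=True,          # Flash Attention, as Grimnir uses
)
out = llm(
    "Write a one-line title for a video about local LLMs:",
    max_tokens=32,
    temperature=0.7,          # sampling control
)
print(out["choices"][0]["text"])
```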

See also: Local LLM Setup for Ollama/LM Studio | Remote LLM Setup for cloud APIs
