What is Grimnir?
Grimnir (named after Odin's "Masked One" persona) is Loki Studio's built-in AI engine. It runs large language models directly on your computer using your GPU.
No API keys. No external services. Just download a model and go.
- $0 **Zero Cost**: no API fees, ever
- 🔒 **Complete Privacy**: all processing stays local
- 📴 **Offline Capable**: works without an internet connection
- ⚡ **GPU Accelerated**: fast inference
Quick Start
Step 1: Download a Model
- Go to Application Settings > LLM Providers
- Select Grimnir (Built-in LLM) mode
- Click any Download button to get a model
- Wait for download to complete
Step 2: Select Your Model
The downloaded model will appear in the dropdown. Select it and you're ready to go!
Step 3: Use It
Go to the Metadata tab and generate titles, descriptions, tags - all powered by your local model. No API keys or external services needed.
Available Models
| Model | Size | VRAM | Best For |
|---|---|---|---|
| llama3.2:3b | ~2GB | 4GB | Fast, good for most tasks |
| qwen2.5-coder:7b | ~5GB | 8GB | Code/technical content |
| mistral:7b | ~4GB | 8GB | High-quality alternative |
| phi3:mini | ~2GB | 4GB | Smallest, fastest |
| gpt-oss:20b | ~12GB | 16GB+ | Highest quality |
VRAM Requirements
| Your VRAM | Recommended Models |
|---|---|
| 4GB | phi3:mini, llama3.2:3b |
| 8GB | qwen2.5-coder:7b, mistral:7b |
| 12GB | DeepSeek-R1-Distill-Qwen-7B |
| 16GB+ | gpt-oss:20b (best quality) |
| 24GB | Multiple models, largest models |
Grimnir displays your available VRAM in the settings panel.
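As a rule of thumb, a model's VRAM footprint is its quantized weight size plus a buffer for the KV cache and scratch memory. The sketch below illustrates that arithmetic; the bits-per-weight figures and the fixed overhead are ballpark assumptions, not Grimnir's internal calculation.

```python
# Rough VRAM estimate: quantized weight bytes + a fixed overhead for
# the KV cache and scratch buffers. Ballpark figures, not Grimnir's
# actual accounting.

BITS_PER_WEIGHT = {
    "Q4_K_M": 4.8,   # approximate effective bits per weight
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
    "F16": 16.0,
}

def estimate_vram_gb(params_billion: float, quant: str,
                     overhead_gb: float = 1.5) -> float:
    """Ballpark VRAM need in GB for a model of the given size."""
    weight_gb = params_billion * 1e9 * BITS_PER_WEIGHT[quant] / 8 / 1e9
    return round(weight_gb + overhead_gb, 1)

# A 7B model at Q4_K_M lands near 6GB, matching the 8GB row above.
print(estimate_vram_gb(7, "Q4_K_M"))   # 5.7
print(estimate_vram_gb(3, "Q4_K_M"))   # 3.3
```

If the estimate exceeds the VRAM shown in the settings panel, drop to a smaller model or a lower quantization.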
Using Custom Models (Advanced)
Warning: Custom models are used at your own risk. Only the built-in download buttons have been tested; custom models may crash, produce poor results, or fail to load.
You can use any GGUF model file:
- Download a .gguf file from Hugging Face
- Click Browse... in the Grimnir settings
- Select your .gguf file
- The model path will be saved automatically
Recommended Quantizations:
- DO use: Q4_K_M, Q5_K_M (stable, good quality), Q8_0 (highest quality)
- AVOID: IQ2, IQ3 quantization (unstable, may crash)
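Before pointing Grimnir at a downloaded file, you can sanity-check that it really is a GGUF model. A GGUF file opens with a fixed little-endian header (magic `GGUF`, version, tensor count, metadata key/value count) per the GGUF specification; this sketch parses it. The synthetic header at the bottom is only for illustration:

```python
import struct

def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header: 4-byte magic, uint32 version,
    uint64 tensor count, uint64 metadata k/v count (all little-endian)."""
    if len(data) < 24 or data[:4] != b"GGUF":
        raise ValueError("not a GGUF file")
    version, n_tensors, n_kv = struct.unpack_from("<IQQ", data, 4)
    return {"version": version, "tensors": n_tensors, "metadata_kv": n_kv}

# Synthetic header for illustration; a real file would be checked with
# read_gguf_header(open(path, "rb").read(24)).
fake = b"GGUF" + struct.pack("<IQQ", 3, 2, 5)
print(read_gguf_header(fake))  # {'version': 3, 'tensors': 2, 'metadata_kv': 5}
```

A file that fails this check (for example, an incomplete download) will also fail to load in Grimnir.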
Recommended Sources:
Grimnir vs. Other Options
| Feature | Grimnir | Ollama | LM Studio | Cloud APIs |
|---|---|---|---|---|
| Setup | None | Install app | Install app | Get API key |
| Cost | Free | Free | Free | Pay per use |
| Speed | Fast | Fast | Fast | Fastest |
| Privacy | Complete | Local | Local | Data sent to cloud |
| Offline | Yes | Yes | Yes | No |
| Quality | Good | Good | Good | Best |
Troubleshooting
Model fails to load
- Insufficient VRAM - Try a smaller model
- File corrupted - Re-download the model
- Wrong format - Grimnir only supports .gguf files
Very slow generation
- Check GPU is being used (not CPU fallback)
- Close other GPU-intensive applications
- Try a smaller/faster model
- Update your GPU drivers
Out of memory
- Use a smaller model (3B instead of 7B)
- Close other applications
- Try Q4 quantized versions instead of Q8
Technical Details
Grimnir is powered by llama.cpp, the industry-standard engine for running LLMs locally. It supports CUDA GPU acceleration (NVIDIA), Flash Attention for speed, automatic chat template detection, and temperature/sampling controls.
The engine (internally called "Skuld") is built into Loki Studio - no external dependencies required.
See also: Local LLM Setup for Ollama/LM Studio | Remote LLM Setup for cloud APIs