What is Grimnir?
Grimnir (named after Odin's "Masked One" persona) is Loki Studio's built-in AI engine. It runs large language models directly on your computer using your GPU.
No API keys. No external services. Just download a model and go.
- $0 **Zero Cost**: no API fees, ever
- 🔒 **Complete Privacy**: all processing stays local
- 📴 **Offline Capable**: works without an internet connection
- ⚡ **GPU Accelerated**: fast inference
Quick Start
Step 1: Download a Model
- Go to Application Settings > LLM Providers
- Select Grimnir (Built-in LLM) mode
- Click any Download button to get a model
- Wait for download to complete
Step 2: Select Your Model
The downloaded model will appear in the dropdown. Select it and you're ready to go!
Step 3: Use It
Go to the Metadata tab and generate titles, descriptions, tags - all powered by your local model. No API keys or external services needed.
Available Models
| Model | Size | VRAM | Best For |
|---|---|---|---|
| llama3.2:3b | ~2GB | 4GB | Fast, good for most tasks |
| qwen2.5-coder:7b | ~5GB | 8GB | Code/technical content |
| mistral:7b | ~4GB | 8GB | High-quality alternative |
| phi3:mini | ~2GB | 4GB | Smallest, fastest |
| gpt-oss:20b | ~12GB | 16GB+ | Highest quality |
VRAM Requirements
| Your VRAM | Recommended Models |
|---|---|
| 4GB | phi3:mini, llama3.2:3b |
| 8GB | qwen2.5-coder:7b, mistral:7b |
| 12GB | DeepSeek-R1-Distill-Qwen-7B |
| 16GB+ | gpt-oss:20b (best quality) |
| 24GB | Multiple models, largest models |
Grimnir displays your available VRAM in the settings panel.
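As a rule of thumb, a model's VRAM footprint is its quantized weight size plus a buffer for the KV cache and scratch memory. The sketch below illustrates that arithmetic; the bits-per-weight figures and the fixed overhead are ballpark assumptions, not Grimnir's internal calculation.

```python
# Rough VRAM estimate: quantized weight bytes + a fixed overhead for
# the KV cache and scratch buffers. Ballpark figures, not Grimnir's
# actual accounting.

BITS_PER_WEIGHT = {
    "Q4_K_M": 4.8,   # approximate effective bits per weight
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
    "F16": 16.0,
}

def estimate_vram_gb(params_billion: float, quant: str,
                     overhead_gb: float = 1.5) -> float:
    """Ballpark VRAM need in GB for a model of the given size."""
    weight_gb = params_billion * 1e9 * BITS_PER_WEIGHT[quant] / 8 / 1e9
    return round(weight_gb + overhead_gb, 1)

# A 7B model at Q4_K_M lands near 6GB, matching the 8GB row above.
print(estimate_vram_gb(7, "Q4_K_M"))   # 5.7
print(estimate_vram_gb(3, "Q4_K_M"))   # 3.3
```

If the estimate exceeds the VRAM shown in the settings panel, drop to a smaller model or a lower quantization.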
Using Custom Models (Advanced)
Warning: Custom models are used at your own risk. Only the built-in download buttons have been tested; custom models may crash, produce poor results, or fail to load.
You can use any GGUF model file:
- Download a .gguf file from Hugging Face
- Click Browse... in the Grimnir settings
- Select your .gguf file
- The model path will be saved automatically
Recommended Quantizations:
- DO use: Q4_K_M, Q5_K_M (stable, good quality), Q8_0 (highest quality)
- AVOID: IQ2, IQ3 quantization (unstable, may crash)
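Before pointing Grimnir at a downloaded file, you can sanity-check that it really is a GGUF model. A GGUF file opens with a fixed little-endian header (magic `GGUF`, version, tensor count, metadata key/value count) per the GGUF specification; this sketch parses it. The synthetic header at the bottom is only for illustration:

```python
import struct

def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header: 4-byte magic, uint32 version,
    uint64 tensor count, uint64 metadata k/v count (all little-endian)."""
    if len(data) < 24 or data[:4] != b"GGUF":
        raise ValueError("not a GGUF file")
    version, n_tensors, n_kv = struct.unpack_from("<IQQ", data, 4)
    return {"version": version, "tensors": n_tensors, "metadata_kv": n_kv}

# Synthetic header for illustration; a real file would be checked with
# read_gguf_header(open(path, "rb").read(24)).
fake = b"GGUF" + struct.pack("<IQQ", 3, 2, 5)
print(read_gguf_header(fake))  # {'version': 3, 'tensors': 2, 'metadata_kv': 5}
```

A file that fails this check (for example, an incomplete download) will also fail to load in Grimnir.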
Recommended Sources:
Grimnir vs. Other Options
| Feature | Grimnir | Ollama | LM Studio | Cloud APIs |
|---|---|---|---|---|
| Setup | None | Install app | Install app | Get API key |
| Cost | Free | Free | Free | Pay per use |
| Speed | Fast | Fast | Fast | Fastest |
| Privacy | Complete | Local | Local | Data sent to cloud |
| Offline | Yes | Yes | Yes | No |
| Quality | Good | Good | Good | Best |
Troubleshooting
Model fails to load
- Insufficient VRAM - Try a smaller model
- File corrupted - Re-download the model
- Wrong format - Grimnir only supports .gguf files
Very slow generation
- Check GPU is being used (not CPU fallback)
- Close other GPU-intensive applications
- Try a smaller/faster model
- Update your GPU drivers
Out of memory
- Use a smaller model (3B instead of 7B)
- Close other applications
- Try Q4 quantized versions instead of Q8
Technical Details
Grimnir is powered by llama.cpp, the industry-standard engine for running LLMs locally. It supports CUDA GPU acceleration (NVIDIA), Flash Attention for speed, automatic chat template detection, and temperature/sampling controls.
The engine (internally called "Skuld") is built into Loki Studio - no external dependencies required.
See also: Local LLM Setup for Ollama/LM Studio | Remote LLM Setup for cloud APIs