Strix Halo HomeLab
Commit e19d4b (2025-06-18 21:44:39) by deseven: added llama 4 scout results

Guides/AI-Capabilities.md
@@ -15,12 +15,13 @@
 Some real-life examples (KoboldCPP, Vulkan, full GPU offloading, [example config file](./gemma-3-27b.kcpps)):
-| Model             | Quantization | Prompt Processing | Generation Speed |
-| ----------------- | ------------ | ----------------- | ---------------- |
-| Llama 3.3 70B     | Q4_K_M       | 51.1 t/s          | 4.1 t/s          |
-| Gemma 3 27B       | Q5_K_M       | 94.4 t/s          | 6.2 t/s          |
-| Qwen3 30B A3B     | Q5_K_M       | 94.5 t/s          | **27.8 t/s**     |
-| GLM 4 9B          | Q5_K_M       | **273.7 t/s**     | 15.0 t/s         |
+| Model                 | Quantization | Prompt Processing | Generation Speed |
+| --------------------- | ------------ | ----------------- | ---------------- |
+| Llama 4 Scout 17B 16E | Q4_K_XL      | 106.0 t/s         | 12.7 t/s         |
+| Llama 3.3 70B         | Q4_K_M       | 51.1 t/s          | 4.1 t/s          |
+| Gemma 3 27B           | Q5_K_M       | 94.4 t/s          | 6.2 t/s          |
+| Qwen3 30B A3B         | Q5_K_M       | 94.5 t/s          | **27.8 t/s**     |
+| GLM 4 9B              | Q5_K_M       | **273.7 t/s**     | 15.0 t/s         |
Much more info here: https://llm-tracker.info/_TOORG/Strix-Halo
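For context, the benchmark setup described in the diff (KoboldCPP with the Vulkan backend and full GPU offloading) corresponds roughly to an invocation like the sketch below. The model path, layer count, and context size are placeholders, not values from the linked config file, and flag names can vary between KoboldCPP releases:

```shell
# Hypothetical KoboldCPP launch approximating the benchmark setup:
# Vulkan backend, all layers offloaded to the iGPU.
# Model path and context size are placeholders -- adjust for your system.
python koboldcpp.py \
  --model ./models/gemma-3-27b-Q5_K_M.gguf \
  --usevulkan \
  --gpulayers 999 \
  --contextsize 8192
```

Passing a deliberately high `--gpulayers` value is a common way to request that every layer be offloaded; the linked `.kcpps` config file captures the same settings in JSON form.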