Commit b196cc

2025-08-28 11:22:45 lhl: link to performance page
AI/AI-Capabilities-Overview.md
@@ -144,9 +144,8 @@
If you are using the llama.cpp ROCm backend, you may also want to try the hipBLASLt kernels by setting the `ROCBLAS_USE_HIPBLASLT=1` environment variable, as they are sometimes faster than the default rocBLAS kernels.
-
- TODO: add a list of some models that work well with pp512/tg128, memory usage, model architecture, weight sizes?
-
+ For more on testing llama.cpp performance, see:
+ - [[AI/llamacpp-performance]]
### Additional Resources
- Deep dive into LLM usage on Strix Halo: https://llm-tracker.info/_TOORG/Strix-Halo
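For context, the hipBLASLt toggle and the performance testing mentioned in the hunk above combine into a single command. A minimal sketch using `llama-bench` (the benchmarking tool that ships with llama.cpp); the model path is a placeholder, and `-p 512 -n 128` runs the pp512 (prompt processing) and tg128 (token generation) tests the removed TODO refers to:

```bash
# Enable the hipBLASLt kernels for this run only (rocBLAS reads the
# variable at startup), then benchmark prompt processing (pp512) and
# token generation (tg128).
# /path/to/model.gguf is a placeholder for your local GGUF weights.
ROCBLAS_USE_HIPBLASLT=1 llama-bench -m /path/to/model.gguf -p 512 -n 128
```

Comparing a run with and without `ROCBLAS_USE_HIPBLASLT=1` on the same model shows whether the hipBLASLt path is actually faster on your hardware.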