Evolutionary process to find the best quant tensor weights for building optimal GGUF options for an AI model.
Updated May 13, 2026
CLI/GUI tool for efficient and easy safetensors and GGUF model conversion
Advanced Data Analysis with Causality and Reinforcement Learning
Convert and quantize LLMs
Unified Local AI Interface & LLM Runtime (supports GGUF, Ollama, OpenAI, Gemini, etc.). In search of building a sovereign AI system ✨
Auto GGUF Converter for HuggingFace Hub Models with Multiple Quantizations (GGUF Format)
Convert Hugging Face models to GGUF with xet support.
AI Toolchain in Pure Zig, No Python, No C++
go-mlx native inference, training and research tooling.
Quantize LLMs automatically.
Gemma-4-It fine-tuned on PubMedQA using SFT & RLVR
Deploying LoRAfrica on consumer CPU devices
Ready-to-run Colab notebook to run GLM-4.7-Flash Finetuned on Claude Opus 4.5 xHigh-Reasoning (GGUF) with llama.cpp, featuring GPU/CPU split loading, streaming chat, multi-chat manager, and a Gradio web UI — optimized for free T4 environments.
The initial release of Production Mixed-Rate Allocation. More models coming soon!
Create optimized GGUF quantizations by cloning from any GGUF of the same architecture.