TinyWeights.dev

clevis@tinyweights:/home$ cat ./welcome.txt

 ████████╗██╗███╗   ██╗██╗   ██╗
 ╚══██╔══╝██║████╗  ██║╚██╗ ██╔╝
    ██║   ██║██╔██╗ ██║ ╚████╔╝
    ██║   ██║██║╚██╗██║  ╚██╔╝
    ██║   ██║██║ ╚████║   ██║
    ╚═╝   ╚═╝╚═╝  ╚═══╝   ╚═╝

 ██╗    ██╗███████╗██╗ ██████╗ ██╗  ██╗████████╗███████╗
 ██║    ██║██╔════╝██║██╔════╝ ██║  ██║╚══██╔══╝██╔════╝
 ██║ █╗ ██║█████╗  ██║██║  ███╗███████║   ██║   ███████╗
 ██║███╗██║██╔══╝  ██║██║   ██║██╔══██║   ██║   ╚════██║
 ╚███╔███╔╝███████╗██║╚██████╔╝██║  ██║   ██║   ███████║
  ╚══╝╚══╝ ╚══════╝╚═╝ ╚═════╝ ╚═╝  ╚═╝   ╚═╝   ╚══════╝

> Small language models like Gemma, Phi, SmolLM, and Qwen, run locally. Benchmarks, quantization, and hands-on deployment guides for small LLMs on real hardware.
> New releases tested the week they drop. Straight talk on what is worth running and what is not.

46 posts · last updated 2026-07-19 · all writing CC BY 4.0

clevis@tinyweights:/home$ ls -lh --sort=time

07-09   9min  comparisons  The Best Local Vision Language Models in 2026: Small VLMs You Can Actually Run
07-08   8min  guides       How to Run North Mini Code Locally: Cohere's 30B-A3B Coding Model
07-07   9min  guides       Local LLM for Private Document Q&A: Build an Offline RAG Pipeline
07-02   7min  guides       How to Quantize a Model with llama.cpp: From Safetensors to GGUF
07-01   9min  guides       DiffusionGemma: Google's Text-Diffusion Model That Generates 4x Faster, and How to Run It Locally
06-30   8min  guides       How to Run LFM2.5-8B-A1B Locally: Liquid AI's On-Device Tool-Calling MoE
06-08   6min  guides       How to Run Phi-4-mini-reasoning Locally: Microsoft's 3.8B Math Model
06-07   7min  guides       How to Run Qwen3.6-35B-A3B Locally: A 35B MoE Model on One GPU
06-04   6min  guides       Run IBM Granite 4.1 Locally: 3B, 8B, and 30B Setup Guide
06-01  12min  guides       1-bit LLMs Explained: How BitNet's Ternary Weights Actually Work