Tagged articles
2 articles
AI Engineer Programming
Apr 25, 2026 · Artificial Intelligence

Quantization Across Signal Processing, AI Inference, and RAG Vector Search

This article explains how quantization, which originated in signal processing, reduces numerical precision to save memory and compute. It details how the technique is applied to neural network weights and activations via PTQ, QAT, GPTQ, AWQ, and SmoothQuant, and shows how vector quantization enables fast, memory-efficient retrieval in large-scale RAG systems.

AWQ · GPTQ · LLM
0 likes · 19 min read
AI Algorithm Path
Apr 22, 2025 · Artificial Intelligence

Understanding LLM Quantization: GPTQ, QAT, AWQ, GGUF, and GGML Explained

The article walks through the fundamentals of large-language-model quantization, presenting a concrete int8 example and detailed explanations of the GPTQ, GGUF/GGML, QAT, and AWQ methods. It provides step-by-step code snippets, formulas, calibration procedures, and performance observations for each technique.

AWQ · GGML · GGUF
0 likes · 15 min read