Tag: CPU inference


DataFunTalk
Apr 19, 2025 · Artificial Intelligence

Microsoft Research's Open‑Source Native 1‑Bit LLM BitNet b1.58 2B4T: Design, Performance, and Deployment

Microsoft Research released BitNet b1.58 2B4T, the first open‑source, native 1‑bit large language model at the 2‑billion‑parameter scale. With 1.58‑bit effective weight precision (ternary weights) and a memory footprint of roughly 0.4 GB, it achieves performance comparable to full‑precision models of similar size while enabling efficient CPU and GPU inference for edge AI applications.
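The "1.58‑bit" figure comes from ternary weights: each weight takes one of three values {-1, 0, +1}, which needs log2(3) ≈ 1.58 bits of information. A minimal sketch of the absmean ternary quantization scheme described for BitNet b1.58 (function name and example values are illustrative, not from the article):

```python
import numpy as np

def absmean_ternary_quantize(w: np.ndarray, eps: float = 1e-8):
    """Quantize a weight matrix to ternary values {-1, 0, +1}.

    Absmean scheme: scale by the mean absolute weight, then
    round and clip each entry into [-1, 1].
    """
    gamma = np.abs(w).mean() + eps                    # absmean scale
    w_q = np.clip(np.round(w / gamma), -1, 1)
    return w_q.astype(np.int8), gamma                 # ternary weights + scale

# Illustrative example: large weights saturate to ±1, small ones snap to 0
w = np.array([[0.9, -0.05, -1.2],
              [0.3,  0.0,  -0.4]])
w_q, gamma = absmean_ternary_quantize(w)
```

Storing only the int8 ternary codes plus one scale per matrix is what drives the sub‑gigabyte footprint and makes CPU inference cheap (matrix multiplies reduce to additions and subtractions).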

1-bit quantization · CPU inference · LLM
ByteDance Cloud Native
Feb 21, 2025 · Artificial Intelligence

Deploy DeepSeek‑R1‑Distill on Volcengine CPU Cloud for Low‑Cost AI Inference

This guide walks through deploying the DeepSeek‑R1‑Distill model on Volcengine CPU ECS instances, covering use‑case scenarios, recommended instance types, Docker setup, environment configuration, and verification steps for cost‑effective, broadly compatible AI inference.
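The verification step typically means sending one test completion to the deployed endpoint. A minimal sketch, assuming the container exposes an OpenAI‑compatible chat API; the port, path, and model name are assumptions, not the guide's exact values:

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a single test request for an OpenAI-compatible chat endpoint."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }).encode()
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

def verify_deployment(req: urllib.request.Request) -> str:
    """Send the request and return the generated text (raises on HTTP errors)."""
    with urllib.request.urlopen(req, timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Hypothetical endpoint and model tag — adapt to your ECS instance:
req = build_chat_request("http://localhost:8000",
                         "deepseek-r1-distill-qwen-7b", "Hello")
```

Calling `verify_deployment(req)` from the ECS host (or any machine with network access to it) and getting a non-empty reply confirms the container is serving the model.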

AI Model Deployment · CPU inference · DeepSeek