Tagged articles

13 articles


May 24, 2026 · Artificial Intelligence

LM Studio Adds MTP Support, Boosting Qwen3.6‑35B to ~130 Tokens/s

LM Studio 0.4.14 and later implement Multi‑Token Prediction (MTP) speculative decoding, which removes the need for a separate draft model and roughly doubles token throughput; Qwen3.6‑35B, for example, reaches about 130 tokens/s on an RTX 3090. The article includes a six‑step activation guide and a list of known pitfalls.

LM Studio · MTP · Qwen3.6

0 likes · 6 min read


AI Algorithm Path

Apr 21, 2026 · Artificial Intelligence

Run Claude Code Locally or in the Cloud in 5 Minutes with Ollama, LM Studio, llama.cpp, and OpenRouter

This guide shows how to configure Claude Code to run on local or cloud models within five minutes, covering hardware requirements, recommended models, step‑by‑step installation for Ollama, llama.cpp, LM Studio, and cloud‑based options, plus performance and cost comparisons.

AI model deployment · Claude Code · LM Studio

0 likes · 12 min read


Old Zhang's AI Learning

Apr 18, 2026 · Artificial Intelligence

How to Run MiniMax‑M2.7 on Mac: Comparing Two Quantization Paths

This article explains why standard uniform quantization fails for the 228‑billion‑parameter MiniMax‑M2.7 MoE model on macOS, and compares two practical solutions: JANGTQ + MLX Studio, whose 2‑bit mixed‑precision quantization achieves 91.5% MMLU using 56.5 GB, and LM Studio + GGUF, which is easier to set up but requires at least 138 GB of RAM and yields lower accuracy.

JANGTQ · LM Studio · MLX Studio

0 likes · 8 min read


Old Zhang's AI Learning

Mar 16, 2026 · Artificial Intelligence

Testing Claude‑Opus‑4.6 Distilled Qwen3.5 9B Model Locally via LM Studio and Claude Code

The article evaluates the GGUF‑quantized Claude‑Opus‑4.6 distilled Qwen3.5 9B model on a 16 GB Mac Mini M4 using LM Studio. It details model sizes, performance metrics, deployment steps, and API integration with Claude Code, and concludes that the 9B version is usable but remains limited compared to larger models.

Claude Opus · GGUF · LM Studio

0 likes · 12 min read


Old Zhang's AI Learning

Mar 4, 2026 · Artificial Intelligence

How to Turn Thinking Mode On or Off for Qwen3.5 Models in Ollama, LM Studio, llama.cpp, and vLLM

This guide shows step‑by‑step how to enable or disable the thinking mode of Qwen3.5 series large language models across Ollama, LM Studio (GGUF and MLX), llama.cpp, and vLLM/SGLang using command‑line flags, custom model YAML files, and API parameters.

LM Studio · Ollama · Thinking mode

0 likes · 4 min read


Old Zhang's AI Learning

Mar 4, 2026 · Artificial Intelligence

Unlock the Full Power of LM Studio for Local LLM Deployment

This article explores LM Studio’s evolution into a complete local AI development platform, detailing version 0.4’s architectural overhaul, headless daemon, parallel request handling, stateful REST API, and UI refresh, along with a suite of lesser‑known developer features such as OpenAI‑compatible and Anthropic‑compatible APIs, CLI tools, native SDKs, and the LM Link remote‑model solution.

Anthropic API · CLI · LM Link

0 likes · 12 min read


Old Zhang's AI Learning

Feb 26, 2026 · Artificial Intelligence

How to Disable Thinking Output in Qwen3.5 Models Using LM Studio

This guide explains how to turn off the reasoning (thinking) output of Qwen3.5 series large language models in LM Studio by creating a virtual “-no‑thinking” model directory, editing a model.yaml file, and handling common pitfalls and error messages.

AI model configuration · LM Studio · disable thinking

0 likes · 8 min read


Eric Tech Circle

Aug 3, 2025 · Artificial Intelligence

How to Deploy Qwen3‑Coder Locally and Boost Front‑End Development

This article explains the key improvements of Qwen3‑Coder, walks through two local deployment methods (LM Studio and Ollama), showcases front‑end coding examples, compares performance and hardware requirements, and offers practical recommendations for developers seeking an on‑premise AI coding assistant.

AI Code Generation · LM Studio · Ollama

0 likes · 7 min read


Eric Tech Circle

May 6, 2025 · Artificial Intelligence

How to Deploy Qwen3-30B-A3B Locally and Unlock Its Full AI Potential

This article walks through the complete process of installing the Qwen3-30B-A3B large language model on a personal computer using LM Studio, evaluates its reasoning, creative, multilingual, and coding abilities with detailed prompts, and shares practical tips for optimizing local deployment and prompt design.

AI evaluation · LM Studio · Qwen3

0 likes · 12 min read


JavaEdge

Apr 26, 2025 · Artificial Intelligence

Turn LM Studio into a Local OpenAI‑Compatible API Server

This guide shows how to select a model in LM Studio, expose a local port, start the HTTP server, and interact with it via curl commands, covering quick model listing, chat requests, and the difference between streaming and full‑response modes.

AI · API · LM Studio

0 likes · 5 min read


21CTO

Feb 6, 2025 · Artificial Intelligence

Run DeepSeek R1 Locally for Free – Integrate AI into VSCode with LM Studio, Ollama, Jan

This guide shows how to set up the free, open‑source DeepSeek R1 large language model locally using LM Studio, Ollama, or Jan, choose the appropriate model size for your hardware, and integrate it into Visual Studio Code as a code assistant at no cost.

Artificial Intelligence · DeepSeek-R1 · Jan

0 likes · 8 min read


21CTO

Apr 22, 2024 · Artificial Intelligence

Run Llama 3 Locally on PC/Mac: Ollama, LM Studio & GPT4All Guide

This guide walks you through three practical methods—using Ollama, LM Studio, and GPT4All—to install and run the open‑source Llama 3 model locally on Windows, macOS, or Ubuntu, including command‑line usage, Python integration, and prompt‑engineering techniques for formatted outputs.

GPT4All · LM Studio · Llama3

0 likes · 5 min read


Eric Tech Circle

Apr 18, 2024 · Artificial Intelligence

Hands‑On Review of LM Studio: Install, Run, and Evaluate Open‑Source LLMs on Windows

This article walks through installing LM Studio on a Windows PC, downloading models from Hugging Face, using the AI Chat interface (including a Codellama‑generated Snake game), measuring resource usage, exploring the built‑in OpenAI‑compatible API, and summarizing its strengths and limitations.

AI chat · Hugging Face · LM Studio

0 likes · 5 min read
