Tag: Model Deployment

2 views collected around this technical thread.

DaTaobao Tech
Jun 4, 2025 · Artificial Intelligence

Understanding Large Language Model Architecture, Parameters, Memory, Storage, and Fine‑Tuning Techniques

This article provides a comprehensive overview of large language models (LLMs), covering their transformer architecture, parameter counts, GPU memory and storage requirements, and detailed fine‑tuning methods such as prompt engineering, data construction, LoRA, PEFT, RLHF, and DPO, along with practical deployment and inference acceleration strategies.

DPO · Fine-tuning · LLM
0 likes · 17 min read
Architect
May 31, 2025 · Artificial Intelligence

Edge Intelligence Implementation in the Vivo Official App: Architecture, Feature Engineering, and Model Deployment

The article details how edge intelligence is applied to the Vivo official app to improve product recommendation on the smart‑hardware floor by abstracting the problem, designing feature engineering pipelines, training TensorFlow models, converting them to TFLite, and deploying inference on mobile devices, while also covering monitoring and performance considerations.

Feature Engineering · Model Deployment · TensorFlow Lite
0 likes · 19 min read
Top Architect
Mar 22, 2025 · Artificial Intelligence

Spring AI: Intelligent Development Trend for Java Developers

The article introduces Spring AI as an emerging tool for Java developers, explains its background, goals, and core components such as data processing, model training, deployment and monitoring, showcases application scenarios like NLP, image processing, recommendation systems and predictive analytics, and also includes promotional offers for AI resources and community groups.

Artificial Intelligence · Java · Model Deployment
0 likes · 17 min read
Top Architecture Tech Stack
Mar 22, 2025 · Artificial Intelligence

Spring AI: An Overview of Intelligent Development Trends

This article introduces Spring AI, a Spring ecosystem module that simplifies building, training, and deploying AI applications for Java developers, covering its background, goals, core components such as data processing, model training, deployment, practical code examples, use cases, advantages, challenges, and future outlook.

Artificial Intelligence · Java · Model Deployment
0 likes · 12 min read
Architecture Digest
Mar 21, 2025 · Artificial Intelligence

Spring AI: Emerging Trends in Intelligent Development

This article introduces Spring AI, explains its background, goals, core components such as data processing, model training, deployment and monitoring, showcases practical use cases like NLP, image processing and recommendation systems, and discusses its advantages, challenges, and future outlook for Java developers.

Artificial Intelligence · Data Processing · Java
0 likes · 16 min read
Efficient Ops
Mar 9, 2025 · Artificial Intelligence

Essential LLMOps Tools: Build, Deploy, Monitor, and Manage Large Language Models

LLMOps, the end-to-end methodology for managing large language models, encompasses a curated set of development, deployment, monitoring, and local management tools—such as LangChain, vLLM, LangSmith, and Ollama—enabling practitioners to efficiently build, scale, and maintain AI applications.

AI Development · LLMOps · Large Language Models
0 likes · 6 min read
Architecture & Thinking
Feb 18, 2025 · Artificial Intelligence

Why Is DeepSeek Server Overloaded? Causes and Practical Workarounds

The article investigates why DeepSeek frequently returns a “server busy” message, analyzing factors such as sudden traffic spikes, compute and bandwidth limitations, security attacks, and maintenance policies, and then offers actionable solutions including query optimization, off‑peak usage, third‑party cloud platforms, and local deployment.

AI · DeepSeek · Model Deployment
0 likes · 10 min read
ByteDance Cloud Native
Feb 13, 2025 · Cloud Computing

Deploy the Full‑Size DeepSeek‑R1 Model on Volcengine Cloud with Terraform and Kubernetes

This guide walks you through two practical solutions for deploying the massive DeepSeek‑R1 model on Volcengine Cloud—one using Terraform for a quick two‑node GPU setup and another leveraging cloud‑native multi‑node distributed inference with Kubernetes, covering resource sizing, environment preparation, model download, monitoring, autoscaling, and storage acceleration.

AI · Kubernetes · Model Deployment
0 likes · 22 min read
DeWu Technology
Feb 12, 2025 · Artificial Intelligence

Edge Intelligence for Intelligent Video Cover Recommendation

The article describes an edge‑based video‑cover recommendation system for DeWu that uses the MNN SDK and a lightweight MobileNetV3 model. It runs quantized, parallelized inference on the device to select high‑quality covers automatically, achieving sub‑second latency and raising click‑through rates by up to 18%.

Model Deployment · Video Cover · edge AI
0 likes · 12 min read
Tencent Tech
Feb 4, 2025 · Artificial Intelligence

Deploy and Test DeepSeek Large Language Models on Tencent Cloud TI in Minutes

This guide walks you through quickly deploying DeepSeek series models on the Tencent Cloud TI platform, covering model selection, resource planning, step‑by‑step service creation, free online trial, API testing via built‑in tools or curl, and managing inference services for both large and compact models.

AI inference · DeepSeek · Model Deployment
0 likes · 13 min read
DevOps
Jan 6, 2025 · Artificial Intelligence

Ten Popular Large Language Model Deployment Engines and Tools: Features, Advantages, and Limitations

This article reviews ten mainstream LLM deployment solutions (WebLLM, LM Studio, Ollama, vLLM, LightLLM, OpenLLM, HuggingFace TGI, GPT4All, llama.cpp, and Triton Inference Server), detailing their technical characteristics, strengths, drawbacks, and example deployment workflows for both personal and enterprise environments.

AI inference · GPU Acceleration · LLM
0 likes · 16 min read
DeWu Technology
Dec 11, 2024 · Artificial Intelligence

MLOps Practices for Improving Order Fulfillment Timeliness

The supply‑chain team applied core MLOps practices (versioning, testing, automated reproducible pipelines, deployment monitoring, and documentation) to eliminate data leakage, keep offline and online behavior consistent, and accelerate model upgrades. Using traffic replay, FaaS‑based decoupling, and approval workflows, the team cut order‑fulfillment times, reduced costs, and enabled business teams to adopt reliable AI models at scale.

Data Versioning · Model Deployment · automation
0 likes · 18 min read
Test Development Learning Exchange
Dec 5, 2024 · Artificial Intelligence

End-to-End House Prices Prediction Project: Data Collection, Preprocessing, Modeling, Evaluation, and Deployment with Python

This tutorial walks through a complete house price prediction project, covering data collection from Kaggle, preprocessing with pandas and scikit‑learn, model training using RandomForestRegressor, evaluation, and deployment of a Flask API for real‑time predictions, providing full code examples.

Flask · Model Deployment · Python
0 likes · 9 min read
Test Development Learning Exchange
Nov 25, 2024 · Artificial Intelligence

Complete Machine Learning Project: Data Collection, Cleaning, Feature Engineering, Model Training, Evaluation, and Deployment

This tutorial walks through a complete machine learning project in Python, covering data collection, cleaning, feature engineering, training linear regression, decision tree, and random forest models, evaluating them with cross‑validation, and finally deploying the best model using joblib.

Model Deployment · Python
0 likes · 8 min read
Baidu Geek Talk
Nov 25, 2024 · Artificial Intelligence

PP-ShiTuV2: A General Image Recognition Pipeline in PaddleX

PP‑ShiTuV2, a PaddleX pipeline that integrates subject detection, deep feature encoding, and vector retrieval, delivers 91% recall@1 on AliProducts, surpasses earlier models by over 20 points, runs efficiently on GPU and CPU, and offers simple installation, quick‑start code, and full fine‑tuning support.

Image Recognition · Model Deployment · PP-ShiTuV2
0 likes · 8 min read
Baidu Geek Talk
Sep 23, 2024 · Artificial Intelligence

Intelligent Early Screening System for Malignant Skin Tumors Based on PaddleX Low‑Code AI

The Meikel Studio team created an intelligent early‑screening system for malignant skin tumors on the PaddleX low‑code AI platform, which automatically captures dermatoscopic images, segments lesions with the PP‑LiteSeg model, achieves high accuracy (mIoU 0.868) and rapid inference, and offers one‑click deployment via RESTful API to improve diagnosis efficiency and support future medical‑imaging applications.

AI Segmentation · Model Deployment · PaddleX
0 likes · 9 min read
DataFunSummit
Sep 22, 2024 · Artificial Intelligence

Large Language Models for Intelligent Financial Report Writing: Applications, Implementation, and Future Outlook

This article examines how large language models are currently applied to financial report creation, outlines their technical implementation and challenges, and explores future directions such as multimodal data fusion, personalization, and lightweight deployment on consumer devices.

AI · Document Automation · Large Language Models
0 likes · 12 min read
DeWu Technology
Aug 19, 2024 · Artificial Intelligence

Multi‑LoRA Deployment for Large Language Models: Concepts, Fine‑tuning, and Cost‑Effective Strategies

The article introduces a multi‑LoRA strategy that lets many scenario‑specific adapters share a single base LLM, dramatically cutting GPU usage and cost while preserving performance, and explains how to fine‑tune with LoRA, merge adapters, and serve them efficiently using vLLM.

Fine-tuning · Large Language Models · LoRA
0 likes · 10 min read
58 Tech
Aug 7, 2024 · Artificial Intelligence

Bridging Compute and Applications: 58.com AI Lab’s Large‑Model Platform and AI Agent Solutions

In this article, 58.com AI Lab senior director Zhan Kunlin explains how the company built a multi‑layer AI platform, created a vertical large‑language model called LingXi, and developed an AI Agent system with RAG capabilities to accelerate practical AI applications across various business scenarios.

AI Platform · AI agents · Model Deployment
0 likes · 10 min read