Tag: MLA

Views collected around this technical thread.

IT Services Circle
Feb 27, 2025 · Artificial Intelligence

DeepSeek Announces FlashMLA: An Efficient Multi-Head Latent Attention Decoding Kernel for Hopper GPUs

DeepSeek’s OpenSourceWeek introduced FlashMLA, a GPU-optimized Multi-Head Latent Attention (MLA) decoding kernel for Hopper GPUs that draws on FlashAttention and CUTLASS to dramatically improve large-model inference performance; early adopters report up to 30% higher compute utilization and doubled speed in some scenarios.

DeepSeek · FlashMLA · GPU
3 min read
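
To make the summary concrete, here is a minimal NumPy sketch of the MLA decoding idea that a kernel like FlashMLA accelerates: the cache holds one small latent vector per past token, and keys/values are up-projected from it at attention time. All dimensions, weight names, and shapes below are illustrative assumptions, not DeepSeek's actual configuration or kernel code.

```python
import numpy as np

rng = np.random.default_rng(0)

d_latent, n_heads, d_head = 64, 8, 64
seq_len = 16  # tokens already decoded

# The entire KV cache under MLA: one d_latent vector per past token.
latent_cache = rng.standard_normal((seq_len, d_latent))

# Per-head up-projection matrices, shared across the whole cache.
W_uk = rng.standard_normal((n_heads, d_latent, d_head)) / np.sqrt(d_latent)
W_uv = rng.standard_normal((n_heads, d_latent, d_head)) / np.sqrt(d_latent)

# Query for the current decoding step, one vector per head.
q = rng.standard_normal((n_heads, d_head))

# Reconstruct keys/values from the latent cache: (heads, seq, d_head).
K = np.einsum("sl,hld->hsd", latent_cache, W_uk)
V = np.einsum("sl,hld->hsd", latent_cache, W_uv)

# Standard scaled dot-product attention over the reconstructed K/V.
scores = np.einsum("hd,hsd->hs", q, K) / np.sqrt(d_head)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
out = np.einsum("hs,hsd->hd", weights, V)

# Cache cost per token: d_latent floats, versus 2 * n_heads * d_head
# for conventional multi-head attention's full K and V.
full_kv = 2 * n_heads * d_head
print(f"cache per token: {d_latent} vs {full_kv} floats "
      f"({full_kv / d_latent:.0f}x smaller)")
```

The fused reconstruct-and-attend step is exactly what a hand-tuned Hopper kernel can do without ever materializing the full K/V tensors in memory.
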
IT Architects Alliance
Feb 26, 2025 · Artificial Intelligence

DeepSeek Large Model: Core Architecture, Key Technologies, and Training Strategies

The article provides an in‑depth overview of DeepSeek’s large language model, detailing its mixture‑of‑experts and Transformer foundations, novel attention mechanisms, load‑balancing, multi‑token prediction, FP8 mixed‑precision training, and various training regimes such as knowledge distillation and reinforcement learning.

DeepSeek · FP8 · Large Language Model
18 min read
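
The mixture-of-experts routing mentioned in the abstract can be sketched in a few lines: a gating network scores experts per token, only the top-k run, and their outputs are combined by renormalized gate weights. Expert count, k, and the linear "experts" below are toy assumptions for illustration, not DeepSeek's published setup.

```python
import numpy as np

rng = np.random.default_rng(0)

n_experts, top_k, d_model = 8, 2, 32
tokens = rng.standard_normal((4, d_model))          # small batch of tokens
W_gate = rng.standard_normal((d_model, n_experts))  # gating network
experts = [rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
           for _ in range(n_experts)]               # one linear map per expert

def moe_forward(x):
    logits = x @ W_gate                              # (tokens, experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]    # top-k expert indices
    out = np.zeros_like(x)
    for i, tok in enumerate(x):
        sel = top[i]
        gate = np.exp(logits[i, sel])
        gate /= gate.sum()                           # renormalize over top-k
        # Only the selected experts do any work for this token.
        out[i] = sum(g * (tok @ experts[e]) for g, e in zip(gate, sel))
    return out, top

out, chosen = moe_forward(tokens)
print("experts chosen per token:", chosen.tolist())
```

The load-balancing the article discusses exists precisely because nothing in this routing stops the gate from sending every token to the same few experts.
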
Data Thinking Notes
Feb 11, 2025 · Artificial Intelligence

Why DeepSeek V3 and R1 Are Redefining LLM Efficiency and Power

This article analyzes DeepSeek’s V3 and R1 large language models, detailing their low‑cost Mixture‑of‑Experts architecture, Multi‑Head Latent Attention redesign, distributed training optimizations, and reasoning‑focused innovations that together challenge traditional GPU/NPU compute demands.

AI inference · DeepSeek · Large Language Model
15 min read
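
A quick back-of-envelope calculation shows why the Multi-Head Latent Attention redesign reshapes inference compute demands: the KV cache dominates long-context decoding memory. The layer, head, and latent sizes below are illustrative assumptions, not V3's published configuration.

```python
# Hypothetical model dimensions (assumptions for illustration only).
n_layers, n_heads, d_head = 60, 128, 128
d_latent = 512                   # per-token latent cached under MLA
seq_len, bytes_per = 32_768, 2   # context length, fp16/bf16 bytes

# Conventional multi-head attention caches K and V for every head.
mha_bytes = n_layers * seq_len * 2 * n_heads * d_head * bytes_per
# MLA caches one compressed latent per token per layer.
mla_bytes = n_layers * seq_len * d_latent * bytes_per

gib = 1024 ** 3
print(f"MHA cache: {mha_bytes / gib:.1f} GiB")
print(f"MLA cache: {mla_bytes / gib:.1f} GiB "
      f"({mha_bytes // mla_bytes}x smaller)")
```

Under these assumptions the full-head cache would not even fit on a single accelerator, while the latent cache does comfortably.
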
Cognitive Technology Team
Feb 9, 2025 · Artificial Intelligence

A Beginner’s Guide to the History and Key Concepts of Deep Learning

From the perceptron’s inception in 1958 to modern Transformer-based models like GPT, this article traces the evolution of deep learning. It explains foundational architectures such as DNNs, CNNs, RNNs, and LSTMs, along with attention mechanisms and recent innovations like DeepSeek’s MLA, highlighting their principles and impact.

Attention · GPT · MLA
19 min read