Tag

model simplification

AntTech
Dec 5, 2024 · Artificial Intelligence

Simplifying Deep Learning: Research Overview by Prof. Yao Quanming

Prof. Yao Quanming gives an overview of his research on simplifying deep learning. He discusses scaling laws and the bottlenecks in data, compute, and trust, and proposes minimalist approaches to model design, training, and interpretability, with a focus on drug interaction prediction using graph neural networks.

AI research · Graph Neural Networks · deep learning
0 likes · 17 min read
Model Perspective
Oct 21, 2024 · Fundamentals

Why “Good Enough” Models Beat Perfect Ones: Insights from Economics & Weather

Mathematical modeling thrives on useful approximations rather than flawless precision. Keynes’s economic insights, Box’s famous quote, and real-world examples such as weather forecasting and epidemic models show that simplified, “good enough” models often give more actionable guidance amid complexity and uncertainty.

complexity · mathematical modeling · model simplification
0 likes · 5 min read
Rare Earth Juejin Tech Community
Dec 8, 2023 · Artificial Intelligence

Simplifying Transformer Blocks: Removing Residual Connections, LayerNorm, and Other Components without Losing Performance

A recent ETH Zurich paper shows that standard Transformer blocks can be drastically simplified by removing residual connections, LayerNorm, projection and value parameters, and even MLP sub-block components. The simplified blocks use up to 16% fewer parameters while matching the training speed and downstream performance of standard blocks on both GPT-style decoders and BERT models.

AI · LLM · deep learning
0 likes · 11 min read