Author

Machine Learning Algorithms & Natural Language Processing

Focused on frontier AI technologies, empowering AI researchers' progress.


Latest from Machine Learning Algorithms & Natural Language Processing


Machine Learning Algorithms & Natural Language Processing

May 22, 2026 · Artificial Intelligence

Co‑authored the Transformer Paper at 20, Now Open‑Sources a 218‑Billion‑Parameter Model

Cohere’s Command A+ model, built by Transformer co‑author Aidan Gomez and backed by Nick Frosst, has 218 billion parameters but activates only 25 billion at inference. It uses a lossless 4‑bit quantization scheme, offers native citation support, runs on a single B200 or two H100 GPUs, and is released under an Apache 2.0 license, marking a shift toward open‑source, enterprise‑ready large language models.

AI · Apache-2.0 · Cohere

0 likes · 12 min read


Machine Learning Algorithms & Natural Language Processing

May 22, 2026 · Artificial Intelligence

How a 10M‑Parameter Model Beats Large Models on Sudoku and ARC with Multi‑Trajectory Reasoning

The GRAM model, introduced by Yoshua Bengio’s team, replaces deterministic recursive updates with probabilistic multi‑trajectory sampling. This lets a 10M‑parameter network reach 97% accuracy on Sudoku‑Extreme, 52%/11% on ARC‑AGI, and near‑perfect results on N‑Queens and graph coloring, while also supporting unconditional generation tasks.

ARC‑AGI · GRAM · Sudoku

0 likes · 9 min read


Machine Learning Algorithms & Natural Language Processing

May 22, 2026 · Artificial Intelligence

Li Mu Returns to Bilibili with a Real-Time AI Avatar

Li Mu (沐神) returns to Bilibili after a year away to showcase Higgs Avatar v1, a fully AI‑generated real‑time digital human that can listen, speak, lip‑sync, and display facial expressions. Reported performance is 16 ms per frame on a single H100 GPU. Potential applications range from customer service to training, and the release also raises ethical questions about identity and trust.

AI Avatar · Boson AI · Higgs Avatar

0 likes · 7 min read

Machine Learning Algorithms & Natural Language Processing

May 21, 2026 · Artificial Intelligence

Visual Generation Meets Slow Thinking: Decoding New Multimodal Reasoning Paradigms from CVPR 2026

This article curates ten standout CVPR 2026 papers that introduce novel multimodal interaction frameworks, active video avatars, unified image customization, artistic poster generation, information‑theoretic video compression, all‑purpose visual reasoning models, 3D‑grounded spatial reasoning, interleaved text‑visual generation, and unified fine‑grained video understanding, each achieving state‑of‑the‑art performance.

AI research · CVPR · Multimodal

0 likes · 13 min read


Machine Learning Algorithms & Natural Language Processing

May 21, 2026 · Artificial Intelligence

Can a New Training Objective Make LLMs See Further and Reason Better?

The paper introduces Next‑ToBE, a training‑objective modification that replaces the one‑hot next‑token label with a soft distribution covering a future token window, thereby activating latent anticipatory capacity in large language models and yielding significant gains in token‑hit rates, reasoning accuracy, and training efficiency.

Anticipatory Capacity · Next-ToBE · Token Prediction

0 likes · 11 min read
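The soft-label idea summarized in the Next‑ToBE entry above can be illustrated with a minimal sketch. The listing does not give the paper's actual weighting, so the exponential decay, the `window` and `decay` parameters, and both function names are assumptions made for illustration only, not the authors' formulation.

```python
import math

def soft_future_target(tokens, position, vocab_size, window=3, decay=0.5):
    """Soft label for the prediction made at `position`.

    ASSUMPTION: the token k+1 steps ahead gets weight decay**k, and the
    weights are normalized to sum to 1. The real Next-ToBE weighting may differ.
    """
    target = [0.0] * vocab_size
    future = tokens[position + 1 : position + 1 + window]
    for k, tok in enumerate(future):
        target[tok] += decay ** k
    total = sum(target)
    return [w / total for w in target] if total > 0 else target

def cross_entropy(target, logits):
    """Cross-entropy between a soft target and unnormalized logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(t * (x - log_z) for t, x in zip(target, logits) if t > 0)
```

With `window=1` the target collapses to the usual one-hot next-token label, so under these assumptions the standard objective is a special case of the sketch.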


Machine Learning Algorithms & Natural Language Processing

May 21, 2026 · Artificial Intelligence

Breaking the UED Bottleneck: PACE Locates the Reinforcement‑Learning Zone of Proximal Development

The paper introduces PACE, a Parameter‑Change based Unsupervised Environment Design method that evaluates training levels by the magnitude of induced policy‑parameter updates, offering a low‑variance, computationally cheap signal that consistently outperforms prior UED approaches on MiniGrid and Craftax benchmarks.

Craftax · Curriculum Learning · ICML 2026

0 likes · 11 min read
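The scoring signal described in the PACE entry above, the magnitude of the policy-parameter update a level induces, can be sketched roughly as follows. The L2 norm, the rule of ranking the largest change first, and the names `parameter_change_score` and `rank_levels` are assumptions for illustration; the listing does not specify how PACE computes or uses the score.

```python
import math

def parameter_change_score(before, after):
    """L2 norm of the parameter change (ASSUMPTION: the paper may use another norm)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(after, before)))

def rank_levels(theta, updated_by_level):
    """Rank levels by how far one update on each level moves the parameters.

    `updated_by_level` maps a hypothetical level name to the flat parameter
    vector obtained after a trial update on that level.
    """
    scores = {
        level: parameter_change_score(theta, new_theta)
        for level, new_theta in updated_by_level.items()
    }
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores
```

In this toy view, a level that barely moves the parameters teaches the policy little, whether because it is already solved or currently out of reach.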


Machine Learning Algorithms & Natural Language Processing

May 21, 2026 · Industry Insights

SpaceX Files Historic $2 Trillion IPO: Musk Poised to Become First Trillionaire

SpaceX has filed an S‑1 seeking $750 billion in financing at a $2 trillion valuation. The filing describes a $28.5 trillion total addressable market dominated by AI and details 2025 revenue of $186.7 billion, massive AI losses, an Anthropic contract worth $12.5 billion per month, and a stock‑award plan that could make Musk the world’s first trillionaire.

Anthropic · Artificial Intelligence · Elon Musk

0 likes · 9 min read


Machine Learning Algorithms & Natural Language Processing

May 20, 2026 · Artificial Intelligence

MLNLP 2026 Symposium: Top AI Scholars from Qiyuan Lab, BIT, Tsinghua & Alibaba Reveal New Agent and Table Research

The MLNLP 2026 academic symposium on May 31 will feature leading AI researchers from Qiyuan Lab, Beijing Institute of Technology, Tsinghua University and Alibaba presenting cutting‑edge work on autonomous agents, table intelligence, multi‑agent learning environments, and the future of general agents.

AI Conference · China · MLNLP

0 likes · 8 min read


Machine Learning Algorithms & Natural Language Processing

May 20, 2026 · Artificial Intelligence

How 800 Data Points Halve LLM Chain‑of‑Thought Length and Boost Accuracy

The ICLR 2026 paper introduces LCPO, a lightweight preference‑optimization technique that uses only 800 curated examples and 50 training steps to cut chain‑of‑thought length in large models by about 50% while maintaining or even improving answer accuracy, reducing both training and inference costs.

Efficient Inference · LCPO · Low-Resource Training

0 likes · 8 min read


Machine Learning Algorithms & Natural Language Processing

May 20, 2026 · Artificial Intelligence

Can 99% Sparse Transformers Run Faster? Insights from the ‘Attention Is All You Need’ Authors

The paper shows that lightweight L1 regularization can drive over 99% of FFN activations to zero. Combined with a new tile‑wise ELLPACK (TwELL) format and a hybrid routing scheme, this improves inference speed by up to 30%, cuts memory usage by over 24%, and reduces energy consumption, all with negligible impact on downstream task performance.

CUDA · GPU optimization · Hybrid Routing

0 likes · 8 min read
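For readers unfamiliar with the storage format named in the sparse‑Transformer entry above, here is a minimal sketch of plain ELLPACK, in which every row is padded to the same number of stored nonzeros. It does not reproduce the paper's tile‑wise TwELL variant or its hybrid routing; the function names and the `-1` padding marker are illustrative assumptions.

```python
def to_ellpack(dense, pad_col=-1):
    """Convert a dense matrix (list of rows) to ELLPACK values/column arrays."""
    # Every row is padded to the widest row's nonzero count.
    width = max((sum(1 for v in row if v != 0) for row in dense), default=0)
    values, cols = [], []
    for row in dense:
        nonzeros = [(j, v) for j, v in enumerate(row) if v != 0]
        pad = width - len(nonzeros)
        cols.append([j for j, _ in nonzeros] + [pad_col] * pad)
        values.append([v for _, v in nonzeros] + [0.0] * pad)
    return values, cols

def ell_matvec(values, cols, x):
    """Multiply an ELLPACK matrix by vector x, skipping padded slots."""
    return [
        sum(v * x[c] for v, c in zip(vrow, crow) if c >= 0)
        for vrow, crow in zip(values, cols)
    ]
```

The fixed row width gives regular memory access, which suits GPUs, but a single dense row inflates the padding for every other row. That cost is presumably one motivation for a tile‑wise layout, though the listing does not say so.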
