
Ant Group Presents Four AI Research Papers at ICLR 2025 Live Showcase

At the ICLR 2025 live session in Singapore, Ant Group showcased four cutting‑edge papers—CodePlan, Animate‑X, Group Position Embedding, and OmniKV—demonstrating advances in large‑language‑model reasoning, universal character animation, layout‑aware document understanding, and efficient long‑context inference.


The International Conference on Learning Representations (ICLR) 2025 will be held in Singapore from April 24 to 28. Ant Group has 17 accepted papers at the conference, covering topics such as large-model inference optimization and generative AI; this live paper showcase focuses on four standout contributions.

CodePlan: Unlocking Reasoning Potential in Large Language Models by Scaling Code‑form Planning proposes a framework in which the model first generates a code-style plan (pseudo-code) and then follows it, improving performance on complex tasks such as mathematical and symbolic reasoning. It achieves a 25.1% performance boost across multiple benchmarks without requiring task-specific data.
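To make the plan-then-execute idea concrete, here is a minimal sketch. The prompt wording, function names, and the toy plan "interpreter" are illustrative assumptions, not the paper's exact format; the point is only the two-stage structure: emit a code-form plan, then carry it out step by step.

```python
def build_codeplan_prompt(question: str) -> str:
    """Two-stage prompt: ask the model for a pseudo-code plan before the answer.
    (Wording is a hypothetical example, not CodePlan's actual template.)"""
    return (
        f"Question: {question}\n"
        "First write a code-form plan (pseudo-code) decomposing the problem.\n"
        "Then execute the plan step by step and state the final answer.\n"
        "Plan:\n"
    )

def execute_plan(steps):
    """Toy interpreter: run a plan given as (description, update_fn) steps."""
    state = {}
    for desc, fn in steps:
        state = fn(state)
        print(f"{desc}: {state}")
    return state

# Hand-written plan for: "Alice has 3 apples and buys 2 bags of 4 apples each."
plan = [
    ("start with 3 apples", lambda s: {**s, "apples": 3}),
    ("add 2 bags * 4 apples", lambda s: {**s, "apples": s["apples"] + 2 * 4}),
]
final_state = execute_plan(plan)
print(final_state["apples"])  # 11
```

In the actual method the plan itself comes from the model; the structured, code-like intermediate representation is what drives the reported reasoning gains.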

Animate‑X: Universal Character Image Animation with Enhanced Motion Representation, a joint effort by Alibaba and Ant Group, converts static character images into dynamic videos. By combining implicit and explicit pose indicators, it preserves character identity and motion consistency, outperforming existing methods in applications such as games, entertainment, and the metaverse.

Group Position Embedding (GPE): Enhancing Document Understanding with Layout Information introduces a lightweight method that partitions attention heads into groups and supplies each group with an independent positional encoding to capture document layout. It requires no architectural changes or extra pre-training, and it demonstrates significant gains on five document-understanding tasks and the new BLADE benchmark.
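A rough sketch of the grouping idea described above: each head group sees position ids derived from a different ordering of the tokens on the page. The three orderings below (reading order, left-to-right, top-to-bottom) and the grouping scheme are illustrative assumptions, not GPE's published configuration.

```python
def layout_position_ids(boxes, num_head_groups=3):
    """boxes: list of (x0, y0) top-left corners, one per token.
    Returns one position-id list per head group, each from a different
    layout-derived ordering (assumed orderings for illustration)."""
    n = len(boxes)

    def ranks(key):
        # Rank of each token under the given sort key.
        order = sorted(range(n), key=key)
        r = [0] * n
        for pos, idx in enumerate(order):
            r[idx] = pos
        return r

    orderings = [
        ranks(lambda i: i),                           # group 0: reading order
        ranks(lambda i: (boxes[i][0], boxes[i][1])),  # group 1: left-to-right
        ranks(lambda i: (boxes[i][1], boxes[i][0])),  # group 2: top-to-bottom
    ]
    return orderings[:num_head_groups]

# Three tokens on a page: one lower-left, one upper-left, one upper-right.
boxes = [(10, 40), (10, 10), (50, 10)]
groups = layout_position_ids(boxes)
print(groups)  # one position-id list per head group
```

Because only the position ids fed to each head group change, such a scheme slots into a standard transformer without architectural changes, which matches the paper's stated selling point.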

OmniKV: Dynamic Context Selection for Efficient Long‑Context LLMs offers a token‑preserving, training‑free inference technique that reduces KV‑cache memory usage by up to 75% and accelerates inference by 1.68×, extending the maximum context length of Llama‑3‑8B on a single A100 from 128K to 450K tokens.
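The dynamic-selection step can be sketched in a few lines: score the cached context tokens for the current query, keep only the top-k, and let subsequent layers attend to that subset. The scoring values and the 25% keep ratio below are toy assumptions chosen to mirror the ~75% memory-reduction figure, not OmniKV's actual selection rule.

```python
def select_context(scores, k):
    """Return indices of the k highest-scoring cached tokens, ascending."""
    top = sorted(range(len(scores)), key=lambda i: scores[i])[-k:]
    return sorted(top)

# Assumed per-token importance for 8 cached tokens
# (e.g., attention mass observed at a "filter" layer).
scores = [0.10, 0.90, 0.30, 0.80, 0.05, 0.70, 0.20, 0.60]
kv_cache = [[float(i)] * 4 for i in range(len(scores))]  # toy KV entries

keep = select_context(scores, k=len(scores) // 4)  # keep 25% of tokens
sparse_kv = [kv_cache[i] for i in keep]            # later layers see only these
print(keep, len(sparse_kv))
```

Since no weights are updated, a selection rule like this is training-free, and because dropped tokens remain in the full cache at the filter stage, different queries can recover different subsets later.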

The showcase will be broadcast on April 10, 2025, across multiple Ant Tech video platforms and will include introductions from the papers' authors.

Large Language Models · reasoning · long context · multimodal · AI research · Document Understanding
Written by

AntTech

Technology is the core driver of Ant's future creation.
