
Ant Group Presents Four AI Research Papers at ICLR 2025 Live Showcase

At the ICLR 2025 live session in Singapore, Ant Group showcased four cutting‑edge papers—CodePlan, Animate‑X, Group Position Embedding, and OmniKV—demonstrating advances in large‑language‑model reasoning, universal character animation, layout‑aware document understanding, and efficient long‑context inference.


The International Conference on Learning Representations (ICLR) 2025 will be held in Singapore from April 24 to 28. Ant Group has 17 accepted papers at the conference, covering topics such as large-model inference optimization and generative AI; this live paper showcase focuses on four standout contributions.

CodePlan: Unlocking Reasoning Potential in Large Language Models by Scaling Code‑form Planning proposes a framework in which the model first generates a code-style plan (pseudo-code) and then follows it, improving performance on complex tasks such as mathematical and symbolic reasoning. It achieves a 25.1% performance boost across multiple benchmarks without requiring task-specific data.
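To make the plan-then-execute idea concrete, here is a minimal sketch. The prompt wording, function names, and the toy plan "interpreter" are illustrative assumptions, not the paper's exact format; the point is only the two-stage structure: emit a code-form plan, then carry it out step by step.

```python
def build_codeplan_prompt(question: str) -> str:
    """Two-stage prompt: ask the model for a pseudo-code plan before the answer.
    (Wording is a hypothetical example, not CodePlan's actual template.)"""
    return (
        f"Question: {question}\n"
        "First write a code-form plan (pseudo-code) decomposing the problem.\n"
        "Then execute the plan step by step and state the final answer.\n"
        "Plan:\n"
    )

def execute_plan(steps):
    """Toy interpreter: run a plan given as (description, update_fn) steps."""
    state = {}
    for desc, fn in steps:
        state = fn(state)
        print(f"{desc}: {state}")
    return state

# Hand-written plan for: "Alice has 3 apples and buys 2 bags of 4 apples each."
plan = [
    ("start with 3 apples", lambda s: {**s, "apples": 3}),
    ("add 2 bags * 4 apples", lambda s: {**s, "apples": s["apples"] + 2 * 4}),
]
final_state = execute_plan(plan)
print(final_state["apples"])  # 11
```

In the actual method the plan itself comes from the model; the structured, code-like intermediate representation is what drives the reported reasoning gains.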

Animate‑X: Universal Character Image Animation with Enhanced Motion Representation, a joint effort by Alibaba and Ant Group, converts static character images into dynamic videos. By combining implicit and explicit pose indicators, it preserves character identity and motion consistency, outperforming existing methods in applications such as games, entertainment, and the metaverse.

Group Position Embedding (GPE): Enhancing Document Understanding with Layout Information introduces a lightweight method that partitions attention heads into groups and supplies each group with an independent positional encoding to capture document layout. It requires no architectural changes or extra pre-training, and it demonstrates significant gains on five document-understanding tasks and the new BLADE benchmark.
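A rough sketch of the grouping idea described above: each head group sees position ids derived from a different ordering of the tokens on the page. The three orderings below (reading order, left-to-right, top-to-bottom) and the grouping scheme are illustrative assumptions, not GPE's published configuration.

```python
def layout_position_ids(boxes, num_head_groups=3):
    """boxes: list of (x0, y0) top-left corners, one per token.
    Returns one position-id list per head group, each from a different
    layout-derived ordering (assumed orderings for illustration)."""
    n = len(boxes)

    def ranks(key):
        # Rank of each token under the given sort key.
        order = sorted(range(n), key=key)
        r = [0] * n
        for pos, idx in enumerate(order):
            r[idx] = pos
        return r

    orderings = [
        ranks(lambda i: i),                           # group 0: reading order
        ranks(lambda i: (boxes[i][0], boxes[i][1])),  # group 1: left-to-right
        ranks(lambda i: (boxes[i][1], boxes[i][0])),  # group 2: top-to-bottom
    ]
    return orderings[:num_head_groups]

# Three tokens on a page: one lower-left, one upper-left, one upper-right.
boxes = [(10, 40), (10, 10), (50, 10)]
groups = layout_position_ids(boxes)
print(groups)  # one position-id list per head group
```

Because only the position ids fed to each head group change, such a scheme slots into a standard transformer without architectural changes, which matches the paper's stated selling point.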

OmniKV: Dynamic Context Selection for Efficient Long‑Context LLMs offers a token‑preserving, training‑free inference technique that reduces KV‑cache memory usage by up to 75% and accelerates inference by 1.68×, extending the maximum context length of Llama‑3‑8B on a single A100 from 128K to 450K tokens.
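The dynamic-selection step can be sketched in a few lines: score the cached context tokens for the current query, keep only the top-k, and let subsequent layers attend to that subset. The scoring values and the 25% keep ratio below are toy assumptions chosen to mirror the ~75% memory-reduction figure, not OmniKV's actual selection rule.

```python
def select_context(scores, k):
    """Return indices of the k highest-scoring cached tokens, ascending."""
    top = sorted(range(len(scores)), key=lambda i: scores[i])[-k:]
    return sorted(top)

# Assumed per-token importance for 8 cached tokens
# (e.g., attention mass observed at a "filter" layer).
scores = [0.10, 0.90, 0.30, 0.80, 0.05, 0.70, 0.20, 0.60]
kv_cache = [[float(i)] * 4 for i in range(len(scores))]  # toy KV entries

keep = select_context(scores, k=len(scores) // 4)  # keep 25% of tokens
sparse_kv = [kv_cache[i] for i in keep]            # later layers see only these
print(keep, len(sparse_kv))
```

Since no weights are updated, a selection rule like this is training-free, and because dropped tokens remain in the full cache at the filter stage, different queries can recover different subsets later.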

The showcase will be broadcast on April 10, 2025, across multiple Ant Tech video platforms and will include introductions from the papers' authors.

Large Language Models · reasoning · long context · multimodal · AI research · Document Understanding
Written by

AntTech

Technology is the core driver of Ant's future creation.
