Highlights of Meituan's ACL 2024 Papers: Speculative Decoding, Graph‑Structured Decoding, DolphCoder, and Instruction Fine‑tuning
This article reviews four ACL 2024 papers from Meituan's research team, covering training-cost reduction, speculative decoding, code generation, and instruction fine-tuning. It also announces a live paper-sharing session and a booth at the conference.
Overview
The Meituan technology team selected four papers accepted at ACL 2024 and analyzes each one in detail. The topics span training-cost optimization, speculative decoding techniques, code-generation improvements, and a deeper investigation of instruction fine-tuning (IFT). Readers are also invited to a live-streamed session on August 12 at 17:00 and to booth No. 11 at the conference.
1. Speculative Decoding via Early‑exiting for Faster LLM Inference with Thompson Sampling Control Mechanism
Problem: Large language model (LLM) inference incurs high computational cost, limiting practical deployment.
Method: The authors introduce Early-Exiting Speculative Decoding (EESD). An early-exit structure placed after the first N layers of the LLM generates draft tokens, so only a segment of the model runs during drafting, and self-distillation is used to improve the quality of those drafts. A Thompson-sampling-based controller automatically decides how many draft tokens to produce in each round. The original LLM then validates the draft tokens in a single forward pass, guaranteeing that the final output matches standard autoregressive decoding.
Results: Experiments on 13-billion and 70-billion-parameter models show faster token generation than prior methods with no loss in output quality, confirming the effectiveness of EESD.
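The draft-then-verify loop described in the Method paragraph can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: `target_next` and `draft_next` are toy deterministic functions standing in for the full LLM and the early-exit draft, and the stopping rule (sample an acceptance rate from a Beta posterior and stop drafting when it falls below 0.5) is a simplified Thompson-sampling-style controller, not the mechanism from the paper.

```python
import random

VOCAB = 7  # toy vocabulary size (assumption)

def target_next(prefix):
    """Toy stand-in for the full LLM: deterministic greedy next token."""
    return (sum(prefix) * 31 + len(prefix)) % VOCAB

def draft_next(prefix):
    """Toy stand-in for the early-exit draft: it disagrees with the
    target whenever the prefix length is a multiple of 4."""
    t = target_next(prefix)
    return t if len(prefix) % 4 else (t + 1) % VOCAB

def autoregressive(prompt, n):
    """Baseline: one target call per generated token."""
    seq = list(prompt)
    for _ in range(n):
        seq.append(target_next(seq))
    return seq

def speculative(prompt, n, max_draft=6, seed=0):
    """Draft several tokens, verify them against the target, repeat."""
    rng = random.Random(seed)
    a, b = 1.0, 1.0  # Beta posterior over "a draft token is accepted"
    seq, rounds = list(prompt), 0
    goal = len(prompt) + n
    while len(seq) < goal:
        rounds += 1
        # Draft phase: keep drafting while the sampled acceptance rate is high.
        draft = []
        while len(draft) < max_draft:
            draft.append(draft_next(seq + draft))
            if rng.betavariate(a, b) < 0.5:
                break
        # Verify phase: a real system scores all draft positions in one
        # batched forward pass; this loop only mimics the result.
        kept, rejected = [], False
        for tok in draft:
            expected = target_next(seq + kept)
            kept.append(expected)  # the target's token is always what is kept
            if expected != tok:
                rejected = True  # first mismatch: keep the correction, stop
                break
        if not rejected:
            kept.append(target_next(seq + kept))  # bonus token from the target
        # Update the posterior with this round's accept/reject evidence.
        a += len(kept) - 1
        b += 1.0 if rejected else 0.0
        seq = (seq + kept)[:goal]
    return seq, rounds

out, rounds = speculative([1, 2, 3], 20)
assert out == autoregressive([1, 2, 3], 20)  # identical output
```

Because every kept token is the one the target itself would have produced for that prefix, the output matches plain autoregressive decoding regardless of how good the draft is. Draft quality and the controller only affect how many verification rounds are needed.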
2. Graph‑Structured Speculative Decoding
Problem: Conventional speculative decoding relies on a single hypothesis from a draft model, limiting the potential speedup.
Method: The authors generate multiple draft hypotheses instead of one, giving the LLM more candidates from which to accept the longest valid sequence. Because these hypotheses share many common token sequences, they are organized in a directed acyclic graph (DAG) in which recurring sequences are predicted once and merged. This approach, called Graph-Structured Speculative Decoding (GSD), reduces the computational burden of the draft model.
Results: Applied to several LLMs, including a 70-billion-parameter LLaMA-2, GSD achieves a 1.73× to 1.96× speedup and clearly outperforms standard speculative decoding.
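To make the idea of merging recurring token sequences concrete, here is a minimal Python sketch. It is not the paper's algorithm: the merge rule (two positions share a node when they sit at the same depth and end in the same last `k` tokens) and the function name `build_dag` are assumptions chosen to keep the example short. Including the depth in the node key means every edge goes from depth d to depth d+1, so the graph cannot contain a cycle.

```python
def build_dag(hypotheses, k=2):
    """Merge draft hypotheses into a DAG.

    Node key (assumption): (depth, last k tokens up to that depth).
    Returns the key -> node id mapping and the set of (parent, child) edges.
    """
    root_key = (0, ())
    nodes = {root_key: 0}
    edges = set()
    for hyp in hypotheses:
        prev = nodes[root_key]
        for depth in range(1, len(hyp) + 1):
            key = (depth, tuple(hyp[max(0, depth - k):depth]))
            node = nodes.setdefault(key, len(nodes))  # reuse a node if the key exists
            edges.add((prev, node))
            prev = node
    return nodes, edges

hypotheses = [
    ["the", "cat", "sat", "down"],
    ["the", "cat", "sat", "up"],
    ["a", "cat", "sat", "down"],
]
nodes, edges = build_dag(hypotheses, k=2)
drafted = sum(len(h) for h in hypotheses)  # 12 tokens across all hypotheses
merged = len(nodes) - 1                    # 7 nodes, excluding the root
```

In this toy case the third hypothesis rejoins the first at "cat sat", which a plain prefix tree would not allow, so 12 drafted tokens collapse into 7 nodes. Whether a suffix-based merge keeps the draft model's predictions valid is a design question this sketch does not address.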
3. DolphCoder: Echo‑Locating Code Large Language Models with Diverse and Multi‑Objective Instruction Tuning
Problem: Existing Code LLMs achieve strong performance, yet further gains are needed for code generation tasks.
Method: The paper proposes DolphCoder, a diverse-instruction model with self-evaluation for code generation. It learns from diverse instruction targets and adds a code-evaluation objective, encouraging the model to produce varied yet correct solutions. Training therefore combines multiple distinct responses per instruction with the task of judging whether a solution is correct.
Results: DolphCoder outperforms baselines on the HumanEval and MBPP benchmarks. The authors highlight two findings: (1) training on more diverse responses that follow distinct reasoning paths improves the model’s coding ability, and (2) improving the model's ability to evaluate the correctness of code solutions also improves its ability to write them.
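One way to picture the two objectives is as two kinds of training samples built from the same pool of candidate solutions. The Python sketch below is hypothetical: the sample schema, the helper names `passes` and `build_samples`, and the use of unit tests to label correctness are all assumptions for illustration, since this article does not describe how the paper constructs its data.

```python
def passes(code, tests):
    """Run a candidate solution and check it against test expressions.
    Note: exec/eval on untrusted code is unsafe; a real pipeline would sandbox this."""
    env = {}
    try:
        exec(code, env)
        return all(eval(t, env) for t in tests)
    except Exception:
        return False

def build_samples(instruction, candidates, tests):
    """Emit 'generate' samples for correct solutions (several per instruction
    gives diversity) and 'evaluate' samples for every candidate."""
    samples = []
    for code in candidates:
        ok = passes(code, tests)
        if ok:
            samples.append({"task": "generate", "prompt": instruction, "target": code})
        samples.append({
            "task": "evaluate",
            "prompt": f"{instruction}\n\n{code}\n\nIs this solution correct?",
            "target": "yes" if ok else "no",
        })
    return samples

candidates = [
    "def add(a, b):\n    return a + b",
    "def add(a, b):\n    return a - b",          # wrong on purpose
    "def add(a, b):\n    return sum([a, b])",
]
tests = ["add(1, 2) == 3", "add(0, 0) == 0"]
samples = build_samples("Write add(a, b) that returns the sum.", candidates, tests)
```

Here the two correct candidates become distinct generation targets for the same instruction, while all three candidates, including the wrong one, become evaluation samples with a yes/no label.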
4. Learning or Self‑aligning? Rethinking Instruction Fine‑tuning
Problem: Instruction fine‑tuning (IFT) is a core step for adapting LLMs, but the underlying mechanism—whether it injects new knowledge or merely aligns existing knowledge—is unclear.
Method: The authors design a knowledge‑perturbation analysis framework that separates behavior‑pattern changes from additional knowledge injection. By perturbing internal knowledge representations before and after IFT, they assess the contribution of each factor.
Findings: Experiments reveal that attempting to learn extra knowledge via IFT often yields no benefit or even harms performance. Maintaining internal knowledge consistency before and after fine‑tuning is crucial for successful IFT. The study concludes that IFT primarily works by self‑aligning the model’s existing knowledge rather than by learning new information.
Event Announcement
Meituan will host a booth (No. 11) at the ACL 2024 venue and stream a live paper‑reading session on August 12 at 17:00. Attendees are encouraged to reserve a spot and engage with the authors and technical experts.
Meituan Technology Team
Over 10,000 engineers power China’s leading lifestyle services e-commerce platform, supporting hundreds of millions of consumers and millions of merchants across 2,000+ industries. This is the public channel for the tech teams behind Meituan, Dianping, Meituan Waimai, Meituan Select, and related services.
