Meituan Vision AI Research Highlights and Open‑Source Releases

This article compiles Meituan's computer‑vision research and engineering work, including CVPR award‑winning segmentation, the YOLOv6 releases, GPU inference optimization, the Food2K dataset, and digests of selected papers, as a practical reference for visual AI practitioners.

Meituan Technology Team

Oct 11, 2023

Overview

The article aggregates Meituan's visual AI research and engineering practices, offering direct links and concise overviews of each work to help readers grasp the technical innovations behind Meituan's visual products.

CVPR 2023 Street‑View Segmentation

Meituan's street‑view team built a segmentation system that balances accuracy and efficiency. It has been deployed in production, and it won two first‑place awards and one third‑place award at CVPR 2023.

CVPR 2023 Selected Papers

Eight papers authored by Meituan were highlighted, covering self‑supervised learning, domain adaptation, federated learning, object detection, tracking, segmentation, and low‑level vision, demonstrating innovations in both generic and vertical visual tasks.

YOLOv6 3.0 Release

The new YOLOv6 3.0 version introduces several technical innovations and optimizations, surpassing YOLOv7‑E6E in both detection accuracy and inference speed, and setting a new state‑of‑the‑art benchmark for real‑time object detection.

Large‑Scale Food Image Recognition (Food2K)

In collaboration with the Chinese Academy of Sciences, Meituan constructed the Food2K dataset and proposed a progressive region‑enhancement network for food image recognition. The work was published in T‑PAMI 2023 and includes dataset characteristics, method design, performance comparisons, and transfer‑learning experiments.

GPU Inference Service Architecture Optimization

To address low GPU utilization in online inference services, Meituan split the model structure into separate parts and deployed them as microservices. For a combined image detection and classification service, GPU utilization rose from 40% to 100% and QPS increased more than threefold.
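As a rough illustration of why decoupling stages can raise GPU utilization, the sketch below models a service in which CPU work (for example pre‑ and post‑processing) and GPU work either run back to back in one process or run as independently scaled stages. The stage timings (6 ms CPU, 4 ms GPU), the worker count, and the function names are assumptions chosen only so that the coupled case lands at the 40% utilization quoted above; they are not measurements from Meituan's service.

```python
def coupled_service(t_cpu_ms, t_gpu_ms):
    """One process handles a request end to end: CPU work, then GPU work.

    The GPU sits idle while the CPU stage runs, so utilization is the
    GPU share of the total request time.
    """
    total_ms = t_cpu_ms + t_gpu_ms
    qps = 1000.0 / total_ms
    gpu_util = t_gpu_ms / total_ms
    return qps, gpu_util


def split_service(t_cpu_ms, t_gpu_ms, cpu_workers):
    """CPU and GPU stages run as separate services (hypothetical model).

    The CPU stage is scaled out with `cpu_workers` parallel workers, and
    overall throughput is limited by the slower of the two stages.
    """
    cpu_qps = cpu_workers * 1000.0 / t_cpu_ms
    gpu_qps = 1000.0 / t_gpu_ms
    qps = min(cpu_qps, gpu_qps)
    gpu_util = qps / gpu_qps
    return qps, gpu_util


if __name__ == "__main__":
    # Assumed timings: 6 ms of CPU work and 4 ms of GPU work per request.
    before_qps, before_util = coupled_service(6.0, 4.0)
    after_qps, after_util = split_service(6.0, 4.0, cpu_workers=2)
    print(f"coupled: {before_qps:.0f} QPS, GPU utilization {before_util:.0%}")
    print(f"split:   {after_qps:.0f} QPS, GPU utilization {after_util:.0%}")
    print(f"throughput gain: {after_qps / before_qps:.1f}x")
```

Under these assumed timings, saturating the GPU alone yields a 2.5x throughput gain. A gain of more than three times would therefore have to include further improvements, such as faster model execution, that this toy model does not capture.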

YOLOv6 Quantization Deployment

A generic quantization pipeline is described that preserves detection accuracy while dramatically boosting inference speed, enabling large‑scale industrial deployment of YOLOv6.
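The summary above does not spell out the quantization scheme, so the following is only a generic sketch of symmetric per‑tensor INT8 quantization, the basic operation that many such pipelines build on. The function names and sample weights are illustrative assumptions and are not taken from the YOLOv6 codebase.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integers and the scale."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.0, 0.333]  # made-up example weights
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print("quantized:", q)
    print("scale:", scale)
    print("worst-case reconstruction error:", worst)
```

The rounding error per value is bounded by half the scale, which is why the accuracy loss stays small when the value range of a tensor is narrow. A production pipeline would add steps such as calibration and per‑layer sensitivity analysis, which this sketch does not attempt to show.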

YOLOv6 2.0 Release

The lightweight YOLOv6‑S model reaches 869 FPS in quantized form. The medium and large models (YOLOv6‑M and YOLOv6‑L) achieve 49.5% and 52.5% AP on COCO and run at 233 FPS and 121 FPS, respectively, on a T4 GPU with a batch size of 32.
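Because the YOLOv6‑M/L figures are measured at a batch size of 32, they describe throughput rather than single‑image latency. The short calculation below converts the quoted FPS values into average time per image and time per batch; the helper names are hypothetical, and the only inputs are the numbers quoted above.

```python
def per_image_ms(fps):
    """Average processing time per image implied by a throughput figure."""
    return 1000.0 / fps


def per_batch_ms(fps, batch_size):
    """Time to finish one full batch at the given throughput."""
    return batch_size / fps * 1000.0


if __name__ == "__main__":
    for name, fps in [("YOLOv6-M", 233), ("YOLOv6-L", 121)]:
        print(
            f"{name}: {per_image_ms(fps):.2f} ms per image on average, "
            f"{per_batch_ms(fps, 32):.1f} ms per batch of 32"
        )
```

A request that has to wait for a full batch therefore sees roughly the per‑batch time, not the per‑image average, which matters when comparing these figures against latency budgets.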

CVPR 2022 Selected Papers

Six Meituan papers were highlighted, covering model compression, video object segmentation, 3D visual grounding, image captioning, model security, and cross‑modal video retrieval.

Short‑Video Content Understanding and Generation

The article outlines how computer‑vision techniques are applied to short‑video data to improve services for users and merchants.

Twins Visual Attention Model (NeurIPS 2021)

Developed jointly with the University of Adelaide, Twins targets the efficiency challenges of visual attention models. The article details its design, implementation, and deployment in Meituan scenarios.

ICCV 2021 LargeFineFoodAI Workshop

The workshop focused on large‑scale fine‑grained food analysis using computer vision, aligning with Meituan's Food2K research.

CVPR 2021 Preview Talks

Five Meituan papers were presented, covering instance segmentation, expression recognition, fast image segmentation, and feature selection and alignment. Video recordings and slides accompany the talks.

ICLR 2021 AutoML Paper (DARTS‑)

The paper, a collaboration with Shanghai Jiao Tong University, presents DARTS‑, an approach to making neural architecture search more robust for production AI use.

CVPR 2019 Trajectory Prediction Competition

Meituan's autonomous delivery and vision team won first place in the trajectory prediction challenge.

ICDAR 2019 Scene Text Detection

The article presents a method that combines a feature‑pyramid network with 8‑neighbor connections for robust text detection in natural scenes. The work was published at ICDAR 2019.
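To make the idea of 8‑neighbor connections concrete, the sketch below groups foreground pixels of a binary mask into regions, treating each pixel as connected to its eight surrounding pixels (including diagonals). This is a generic connected‑component labeling routine written for illustration; it assumes a precomputed binary text mask and is not the paper's actual linking method.

```python
from collections import deque

# Offsets to the eight surrounding pixels, diagonals included.
NEIGHBORS_8 = [(-1, -1), (-1, 0), (-1, 1),
               (0, -1),           (0, 1),
               (1, -1),  (1, 0),  (1, 1)]


def label_components(mask):
    """Label 8-connected foreground regions of a binary mask.

    Returns a label grid (0 = background, 1..n = region id) and n.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or labels[r][c]:
                continue
            # Start a new region and flood-fill it breadth first.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in NEIGHBORS_8:
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


if __name__ == "__main__":
    # Made-up mask: two pixels touching only diagonally, plus a vertical pair.
    mask = [
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 1],
    ]
    labels, count = label_components(mask)
    print("regions found:", count)
    for row in labels:
        print(row)
```

With 8‑connectivity the two diagonal pixels in the top‑left corner merge into one region, whereas 4‑connectivity would split them. Diagonal links of this kind help keep slanted or curved text strokes together.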

Collectively, these entries illustrate Meituan's end‑to‑end visual AI pipeline, from foundational research and dataset creation to model optimization, deployment, and real‑world impact.
