
Didi’s Achievements and Innovations at CVPR 2019 AI City Challenge

At CVPR 2019, Didi’s technology team co‑hosted an autonomous‑driving workshop, showcased the D²‑City dataset, and secured second place in the AI City Challenge by introducing a modular multi‑camera tracking framework, a CNN‑based single‑camera tracker, and a staged aggregation strategy, while outlining its hybrid dispatch commercial plan.

Didi Tech

At the 32nd Conference on Computer Vision and Pattern Recognition (CVPR 2019) in Long Beach, California, Didi's technology team co‑hosted an autonomous‑driving workshop with the Berkeley DeepDrive (BDD) consortium at the University of California, Berkeley. The event showcased Didi's research and practice in autonomous driving, and the team earned the second‑place award in the CVPR AI City Challenge.

The CVPR AI City Challenge, organized alongside the conference, focuses on three tasks using the CityFlow dataset released by NVIDIA: cross‑camera multi‑object vehicle tracking, image‑based vehicle re‑identification, and traffic anomaly detection. The dataset contains video streams from 40 cameras covering intersections, residential areas, and highways, enabling large‑scale cross‑camera tracking and re‑identification.

Among these tasks, cross‑camera multi‑object tracking is the most complex, requiring image‑based re‑identification, single‑camera tracking, and spatio‑temporal analysis across cameras. Vehicle re‑identification is especially challenging due to high intra‑class variation (different viewpoints) and high inter‑class similarity (similar vehicle models).
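In practice, re‑identification systems compare learned appearance embeddings: a query vehicle is matched against a gallery of candidates by feature similarity. The sketch below illustrates that ranking step with toy vectors; the embeddings, identifiers, and dimensionality are invented for illustration (real re‑id features are CNN outputs with hundreds of dimensions), and this is not Didi's actual model.

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two feature vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_gallery(query, gallery):
    """Rank gallery embeddings by similarity to the query (most similar first)."""
    scores = [(gid, cosine_similarity(query, emb)) for gid, emb in gallery.items()]
    return sorted(scores, key=lambda kv: kv[1], reverse=True)

# Toy 4-d "appearance embeddings" -- purely illustrative values.
query = np.array([1.0, 0.0, 0.2, 0.1])
gallery = {
    "car_A_cam2": np.array([0.9, 0.1, 0.3, 0.1]),   # same car, new viewpoint
    "car_B_cam2": np.array([0.0, 1.0, 0.1, 0.0]),   # different car
}
ranking = rank_gallery(query, gallery)
print(ranking[0][0])  # prints "car_A_cam2": the closest gallery identity
```

The viewpoint problem described above shows up here directly: a good embedding must keep "car_A_cam2" close to the query despite the camera change, while pushing visually similar but distinct vehicles further away.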

Didi's on‑board technology team introduced several innovations. First, they decoupled the tracking framework into modular components, allowing each module to be researched independently and new modules to be integrated easily. Second, they proposed a novel single‑camera tracking algorithm that combines CNN appearance features with spatio‑temporal information to reduce target loss, outperforming Deep SORT, MOANA, and TC. Finally, they designed a staged aggregation strategy that orders aggregation policies according to spatio‑temporal conditions and uses automatic hyper‑parameter search to reach a globally optimal configuration. Together, these contributions secured the team's second‑place finish.
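The article does not publish Didi's code, but the decoupling idea can be sketched abstractly: each stage of the pipeline is an independent, swappable module that consumes the previous stage's output. Everything below (class name, stage names, stand‑in functions) is hypothetical and only illustrates the modular structure, not the team's implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple

@dataclass
class TrackingPipeline:
    """Hypothetical sketch of a decoupled multi-camera tracking framework:
    each stage is an independent module that can be swapped out."""
    stages: List[Tuple[str, Callable[[Any], Any]]] = field(default_factory=list)

    def add_stage(self, name: str, fn: Callable[[Any], Any]) -> "TrackingPipeline":
        self.stages.append((name, fn))
        return self  # allow chaining

    def run(self, frames: Any) -> Any:
        data = frames
        for _name, fn in self.stages:
            data = fn(data)  # each module consumes the previous module's output
        return data

# Stand-in modules (real ones would wrap a detector, a CNN tracker, etc.)
detect = lambda frames: [f"det:{f}" for f in frames]
track_single_cam = lambda dets: [d.replace("det", "trk") for d in dets]
aggregate_cross_cam = lambda trks: {"tracks": trks}

pipeline = (TrackingPipeline()
            .add_stage("detection", detect)
            .add_stage("single_camera_tracking", track_single_cam)
            .add_stage("cross_camera_aggregation", aggregate_cross_cam))
result = pipeline.run(["cam1_f0", "cam2_f0"])
print(result)  # prints {'tracks': ['trk:cam1_f0', 'trk:cam2_f0']}
```

The benefit of this shape is exactly what the article describes: a new single‑camera tracker or a different aggregation policy can be dropped in by replacing one stage, without touching the rest of the framework.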

Beyond the competition, Didi participated in a joint workshop with BDD on autonomous‑driving perception, presenting the D²‑City dataset—a large‑scale, high‑quality real‑world driving video collection covering 12 categories of traffic and road objects. Compared with existing datasets, D²‑City includes more challenging weather, traffic, and capture conditions. The dataset was used in a transfer‑learning detection challenge, where teams such as Megvii, University of Electronic Science and Technology of China, ETH Zurich, and DeepBlue achieved top rankings.

Didi continues to collaborate with leading academic institutions, including BDD and the Montreal Institute for Learning Algorithms (Mila) led by Yoshua Bengio. At the workshop, Didi’s chief autonomous‑driving engineer, Jia Zhaoyin, detailed the company’s testing progress, noting a team of over 100 engineers conducting road tests in China and the United States, and leveraging extensive ride‑hailing data for model training.

Jia also discussed Didi's near‑term commercial strategy: a "hybrid dispatch" model that deploys driverless vehicles on simple routes while assigning human drivers to complex segments, aiming to accelerate technology maturation while maintaining user experience.
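At its core, hybrid dispatch is a routing decision. A minimal sketch of such a policy is shown below; the complexity score and threshold are invented for illustration, and the article gives no detail on how Didi would actually score routes.

```python
def dispatch(route_complexity: float, threshold: float = 0.5) -> str:
    """Toy 'hybrid dispatch' policy (the scoring and threshold are invented):
    simple routes go to driverless vehicles, complex ones to human drivers."""
    return "driverless" if route_complexity < threshold else "human_driver"

print(dispatch(0.2))  # simple highway segment -> prints "driverless"
print(dispatch(0.8))  # dense urban segment   -> prints "human_driver"
```

A real system would fold in factors such as weather, traffic density, and mapped‑area coverage, but the principle is the same: route only the well‑understood cases to autonomous vehicles while the technology matures.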

Tags: computer vision, CVPR, dataset, autonomous driving, AI City Challenge, vehicle tracking