ICDAR 2023-DSText Video Text Reading Competition Overview

The ICDAR 2023-DSText competition, which opened on February 15, 2023, targets the detection, tracking, and recognition of dense and small text in video. This overview covers its YouTube‑sourced dataset of 100 videos, the two challenge tasks, the timeline, the eligibility rules, and the international co‑organizing institutions.

Kuaishou Tech

ICDAR 2023-DSText (Video Text Reading Competition for Dense and Small Text) is an international academic competition that officially started on February 15, 2023. It is one of the competitions of ICDAR 2023. ICDAR (International Conference on Document Analysis and Recognition) is the premier conference in document image analysis and recognition and has been held biennially since 1991.

The competition emphasizes the challenges of detecting and recognizing dense and small text in video frames, aiming to advance research in natural scene video text recognition. The dataset consists of 50 training and 50 testing videos sourced from YouTube, each 10–30 seconds long. Frames contain an average of 23 text instances each, a text density far higher than in previous video text datasets.

Two tasks are offered: (1) Video Text Tracking, requiring models to detect rotated bounding boxes and assign consistent IDs to the same text across frames; (2) End‑to‑End Video Text Recognition, which builds on the tracking task and also demands the recognition output for each text instance.
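To make the tracking requirement concrete, here is a minimal Python sketch of what "consistent IDs across frames" means. It is not the competition's baseline or its evaluation protocol. The `TextInstance` class, the `assign_ids` function, and the 0.5 overlap threshold are all hypothetical names and values chosen for illustration. The sketch also simplifies rotated boxes to their axis-aligned extents when measuring overlap, which a real submission would likely replace with polygon IoU.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class TextInstance:
    """One text region in one frame (hypothetical structure, not the official format)."""
    quad: List[Point]        # four corners of the rotated bounding box
    track_id: int = -1       # -1 until an ID is assigned (Task 1)
    transcription: str = ""  # filled in only for end-to-end recognition (Task 2)


def _extent(quad: List[Point]) -> Tuple[float, float, float, float]:
    """Axis-aligned extent of a quadrilateral (a simplification of the rotated box)."""
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    return min(xs), min(ys), max(xs), max(ys)


def iou(a: List[Point], b: List[Point]) -> float:
    """Intersection over union of the axis-aligned extents of two quads."""
    ax0, ay0, ax1, ay1 = _extent(a)
    bx0, by0, bx1, by1 = _extent(b)
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def assign_ids(frames: List[List[TextInstance]],
               iou_threshold: float = 0.5) -> List[List[TextInstance]]:
    """Greedy frame-to-frame matching: reuse the ID of the best-overlapping
    instance from the previous frame, otherwise start a new track."""
    next_id = 0
    previous: List[TextInstance] = []
    for detections in frames:
        used = set()  # IDs already claimed in the current frame
        for det in detections:
            best_id, best_score = -1, iou_threshold
            for prev in previous:
                if prev.track_id in used:
                    continue
                score = iou(det.quad, prev.quad)
                if score >= best_score:
                    best_id, best_score = prev.track_id, score
            if best_id == -1:  # no sufficiently overlapping predecessor
                best_id = next_id
                next_id += 1
            det.track_id = best_id
            used.add(best_id)
        previous = detections
    return frames
```

Under these assumptions, a text region that shifts slightly between two frames keeps its ID, while a region appearing for the first time receives a fresh one. For the end‑to‑end task, the `transcription` field would additionally carry the recognized string for each tracked instance.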

The competition timeline is as follows: the website launched on December 30, 2022; the dataset was released and the competition started on February 15, 2023; the test set was released on March 15, 2023; submissions were due on March 20, 2023; method description reports were due on March 25, 2023; and results were announced on March 31, 2023.

Eligibility is open worldwide, except for personnel directly involved with the organizers. Participants may register individually or as a team, and any cheating such as manual labeling of test data or using prohibited external datasets will lead to disqualification. Apart from BOVText and RoadText‑1k, all other public datasets are allowed.

Registration details are available on the competition website (https://rrc.cvc.uab.es/?ch=22&com=introduction) and via a QR code. The competition is co‑organized by Zhejiang University, Kuaishou Technology, the Chinese Academy of Sciences, the Computer Vision Center at the Autonomous University of Barcelona, Nanyang Technological University, the Indian Statistical Institute, and Huazhong University of Science and Technology.

Reference materials include demo videos on Bilibili and YouTube, as well as related papers such as Wu et al., "A bilingual, OpenWorld video text dataset and end‑to‑end video text spotter with transformer" (arXiv:2112.04888) and Reddy et al., "Roadtext‑1k: Text detection & recognition dataset for driving videos" (ICRA 2020).
