AFAC2023 Financial Intelligence Challenge Highlights and the Release of the Fin‑Eval Dataset
The inaugural AFAC2023 Financial Intelligence Challenge, co‑organized by the China Computer Federation and Ant Group, attracted over 4,700 teams, showcased cutting‑edge AI solutions for finance such as market opinion generation, compliance detection, and pet‑age recognition, and culminated in the public launch of the Fin‑Eval benchmark dataset for financial large‑model evaluation.
Recently, under the guidance of the China Computer Federation and Ant Group, the first "AFAC2023 Financial Intelligence Challenge" was held jointly by Ant Wealth, Ant Insurance, MyBank, Zhejiang University, Shanghai Jiao Tong University, Xi'an Jiaotong University, Central University of Finance and Economics, Ant Technology Research Institute, and the Tianchi platform. After intense competition, six champion teams from Peking University, Tongji University, South China University of Technology and other institutions were announced.
On September 8, the six champion teams were invited to the 2023 INCLUSION·Bund Conference forum "Intelligent Emergence, the Path of FinTech Evolution in the Era of Large Models" and received their awards on stage.
Ant Group designed the competition around three core directions: financial data verification, financial data understanding, and financial scenario understanding. The six tasks emphasized the practical deployment of AI in real financial scenarios, which raised both task complexity and the rigor of model evaluation.
After opening in June, the competition attracted 4,728 teams from top universities such as Tsinghua, Peking, Shanghai Jiao Tong, Zhejiang, Huazhong, Fudan, Renmin, Xi'an Jiaotong, Wuhan, Sun Yat‑sen, Tianjin, Central University of Finance and Economics, East China Normal, Tongji, and South China University of Technology, as well as industry participants from China Merchants Bank, Shanghai Pudong Development Bank, ZheShang Bank, Meituan, Huawei, China Unicom, China Mobile, and others.
After nearly three months of fierce competition, 36 teams emerged from the 4,728 entrants. Their members ranged from new graduate students to doctoral researchers, corporate engineers, and even newlywed couples. The finalists presented their solutions at the review forum and received targeted feedback from expert judges.
Ant Group Vice President and competition chair Wang Xiaohang said: "For years Ant Group has built national‑level products such as Yu‑E‑Bao, Huabei, and Small‑Micro Finance, continuously investing in AI technologies like knowledge graphs, optimization, graph learning, trustworthy AI, and large models. This competition aims to tackle real‑world AI challenges in the financial industry and bring top academic talent close to industry problems."
What made the winners stand out?
Taking the task "Financial Market Opinion Generation and Compliance Detection" as an example, participants had to generate accurate, diverse, and compliant communication scripts that align with market data while remaining understandable to novice investors.
The difficulty lies in simultaneously satisfying accuracy (matching fund indicators and events), diversity (producing varied scripts for the same input), and compliance (meeting regulatory review standards), all while preserving a professional tone.
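The article does not say how diversity was scored. One common proxy is distinct‑n, the share of n‑grams across a set of generated scripts that are unique. The sketch below assumes whitespace tokenization, which would need to be swapped for a proper word segmenter on Chinese text; the function name `distinct_n` is illustrative and not taken from the competition.

```python
def distinct_n(scripts, n=2):
    """Ratio of unique n-grams to total n-grams across all scripts.

    Assumption: tokens are separated by whitespace. Values near 1.0 mean
    the scripts rarely repeat phrasing; values near 0.0 mean heavy reuse.
    """
    total = 0
    unique = set()
    for text in scripts:
        tokens = text.split()
        # Slide a window of length n over the token list.
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0


if __name__ == "__main__":
    repeated = ["the fund rose today", "the fund rose today"]
    varied = ["the fund rose today", "markets were calm this week"]
    print(distinct_n(repeated), distinct_n(varied))
```

A metric like this only captures surface variety; accuracy and compliance would still need separate checks.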
The champion team from Tongji University, "WeLearnNLP," built its solution on the ChatGLM2‑6B base model, fine‑tuned it with QLoRA, and crafted diverse prompts using multiple large‑model dialogue services to achieve high accuracy and variety, earning first place and strong praise from the judges.
Figure caption: Partial solution of the "WeLearnNLP" team.
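QLoRA keeps the base model frozen in quantized form and trains only small low‑rank adapter matrices, which is why a 6B‑parameter model can be fine‑tuned on modest hardware. The arithmetic below is a minimal sketch of how few parameters one adapter adds. The 4096 x 4096 layer size and rank 8 are illustrative assumptions, not settings reported by the team.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Parameters added by one LoRA adapter on a (d_out x d_in) weight.

    The adapter is two matrices: A with shape (rank x d_in) and
    B with shape (d_out x rank), so the count is rank * (d_in + d_out).
    """
    return rank * (d_in + d_out)


def trainable_fraction(d_in, d_out, rank):
    """Adapter parameters as a fraction of the frozen full weight matrix."""
    return lora_trainable_params(d_in, d_out, rank) / (d_in * d_out)


if __name__ == "__main__":
    # Hypothetical projection layer and rank, chosen only for illustration.
    d_in, d_out, rank = 4096, 4096, 8
    print(lora_trainable_params(d_in, d_out, rank))  # adapter parameter count
    print(trainable_fraction(d_in, d_out, rank))     # share of the full matrix
```

Under these assumed sizes the adapter is well under one percent of the layer it modifies, which is the property that makes this style of fine‑tuning cheap.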
Beyond generative tasks, the competition also featured innovative challenges such as "Pet Age Recognition" for Ant Insurance's pet‑insurance product. The Zhejiang University team "VIPA" tackled this vision task by applying TrivialAugment for data augmentation, selecting ConvNeXt‑V2‑Huge as the backbone, and adding a convolution‑enhanced regression head trained with the Adam optimizer and a CosineAnnealingLR scheduler. Weight averaging, using both Model Soup and SWA (stochastic weight averaging), further boosted robustness, earning the team a second‑place finish.
Figure caption: Partial solution of the "VIPA" team.
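Two of the ingredients named above reduce to short formulas. The sketch below shows the closed‑form cosine annealing schedule and a uniform weight average of several checkpoints, which is the simplest Model Soup variant and the same arithmetic SWA applies to checkpoints from one training run. Checkpoints are represented as plain dicts of float lists purely for illustration; the team's actual recipe and hyperparameters are not described in the article.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max, lr_min=0.0):
    """Closed-form cosine annealing: decays from lr_max to lr_min over total_steps."""
    cos_term = 1 + math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * cos_term


def uniform_soup(checkpoints):
    """Element-wise mean of parameters across checkpoints.

    Assumption: every checkpoint is a dict with the same keys, and each
    value is a flat list of floats of the same length.
    """
    n = len(checkpoints)
    soup = {}
    for key in checkpoints[0]:
        columns = zip(*(ckpt[key] for ckpt in checkpoints))
        soup[key] = [sum(values) / n for values in columns]
    return soup


if __name__ == "__main__":
    # Learning rate at the start, midpoint, and end of a 100-step schedule.
    for step in (0, 50, 100):
        print(step, cosine_annealing_lr(step, 100, lr_max=1e-3))

    # Two toy checkpoints averaged into one set of weights.
    ckpt_a = {"head.weight": [1.0, 2.0]}
    ckpt_b = {"head.weight": [3.0, 4.0]}
    print(uniform_soup([ckpt_a, ckpt_b]))
```

Averaging weights rather than predictions keeps inference cost at that of a single model, which is a common reason to prefer it over a conventional ensemble.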
The Scoring Judge: "Fin‑Eval," a Finance‑Specific Task Evaluation Set
This competition marks a small step for AI in finance. As new open‑source models, fine‑tuning methods, and development tools emerge, the application prospects broaden. However, large‑scale evaluation tools for finance remain under‑developed. To address this, Ant Group's fintech team released the "Fin‑Eval" benchmark, a simulation test suite for financial large models containing over 20,000 evaluation items across 28 sub‑tasks in cognition, generation, domain knowledge, financial logic, and safety/compliance.
Ant Group Financial Large‑Model Algorithm Leader Yu Fei explained: "Fin‑Eval comprises five categories—cognition, generation, domain knowledge, financial logic, and safety/compliance—with 28 sub‑tasks, providing a high‑quality, comprehensive evaluation set that fills a critical industry gap."
Fin‑Eval was designed with financial model characteristics in mind, including in‑context learning, tool calling, and chain‑of‑thought reasoning. It proved valuable during the competition for fair, objective comparison of team results. Future versions will be opened early to AFAC participants and aim to become the industry’s gold standard for financial AI evaluation.
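The article does not describe how Fin‑Eval scores are aggregated. As a rough illustration of how a multi‑category benchmark can be summarized, the sketch below computes per‑category accuracy and an unweighted macro average, so that a category with many items does not dominate the headline number. The input format and the function name `score_by_category` are assumptions, and generation tasks would likely need metrics other than exact correctness.

```python
from collections import defaultdict


def score_by_category(results):
    """Per-category accuracy plus an unweighted macro average.

    results: iterable of (category, is_correct) pairs, a hypothetical
    format chosen for this sketch.
    """
    totals = defaultdict(lambda: [0, 0])  # category -> [correct, count]
    for category, is_correct in results:
        totals[category][0] += int(is_correct)
        totals[category][1] += 1
    per_category = {c: correct / count for c, (correct, count) in totals.items()}
    macro = sum(per_category.values()) / len(per_category) if per_category else 0.0
    return per_category, macro


if __name__ == "__main__":
    sample = [
        ("cognition", True),
        ("cognition", False),
        ("financial logic", True),
        ("safety/compliance", True),
    ]
    per_category, macro = score_by_category(sample)
    print(per_category)
    print(macro)
```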
The dataset is now publicly available at:
https://github.com/alipay/financial_evaluation_dataset/
https://huggingface.co/datasets/Fin-Eval/Fin-Eval
Professor Bao Hujun of Zhejiang University remarked: "Events like this promote academic‑industry dialogue in financial AI, cultivating talent with innovative thinking and practical ability."
Ant Group reaffirmed its commitment to advancing frontier fintech and nurturing talent, emphasizing future work on data‑system construction, multimodal large‑model development, and robust evaluation standards to create a safe, healthy AI ecosystem that ultimately enhances user experience.
AntTech
Technology is the core driver of Ant's future creation.