
T-LEAF: A Taxonomy Learning and Evaluation Framework for Airbnb Community Support Classification System

The T‑LEAF framework introduces quantitative metrics for coverage, usefulness, and consistency to iteratively develop Airbnb’s unified Contact‑Reason taxonomy, enabling faster feedback loops, reducing “Other” classifications, and improving both human annotation agreement and machine‑learning prediction accuracy in production.

Airbnb Technology Team

Introduction

This article describes how qualitative learning, human annotation, and machine learning are applied to iteratively develop Airbnb's community support taxonomy.

Background

A taxonomy is a knowledge-organization system that uses textual labels and hierarchical structures to classify information. Airbnb uses taxonomies in both front-end products (to help guests and hosts find content) and back-end tools (to structure data and support ML applications). The need for a unified “Contact-Reason” taxonomy arose because hosts, guests, community-support ambassadors, and ML models each used a separate taxonomy, and these fragmented taxonomies had to be mapped to one another manually.

Challenges in Evaluating New Taxonomies

A new taxonomy created without real user data or downstream applications is hard to evaluate for quality. Existing processes rely on output metrics, which leads to long experiment cycles and leaves minor changes without evaluation coverage.

T-LEAF Framework

T-LEAF (Taxonomy Learning and Evaluation Framework) quantifies taxonomy quality along three dimensions: coverage, usefulness, and consistency.

Coverage

Coverage measures whether the taxonomy captures the full range of real data objects. The coverage score is defined as 1 - (proportion of items classified as “Other” or “Undefined”). For example, if 5% of items fall into those two buckets, the coverage score is 0.95.
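A minimal sketch of the coverage score as defined above, assuming each item carries a single string label and that the catch-all buckets are literally named "Other" and "Undefined". The function name `coverage_score` and the sample labels are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Labels treated as "not covered"; the exact strings are an assumption.
UNCOVERED_LABELS = {"Other", "Undefined"}


def coverage_score(labels):
    """Return 1 - (share of labels that are 'Other' or 'Undefined')."""
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    # Counter returns 0 for labels that never occur.
    uncovered = sum(counts[name] for name in UNCOVERED_LABELS)
    return 1.0 - uncovered / len(labels)


# Made-up sample: 2 of 10 items land in a catch-all bucket.
sample = [
    "Cancellation", "Payment", "Other", "Payment", "Undefined",
    "Cancellation", "Listing", "Payment", "Cancellation", "Listing",
]
print(coverage_score(sample))  # 0.8
```

Tracking this number across taxonomy drafts shows whether a revision actually moves items out of the catch-all buckets.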

Usefulness

Usefulness assesses whether objects are evenly distributed across meaningful categories. For a dataset of n samples, a taxonomy with about √n nodes is considered a good balance between overly coarse and overly fine granularity. For any given number of nodes, a split score in (0, 1] is computed.
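The article does not give the formula for the split score, so the sketch below is only one hypothetical way to obtain a value in (0, 1] that peaks when the node count equals √n: the ratio of the smaller to the larger of the two quantities. The name `split_score` and this ratio form are assumptions for illustration, not Airbnb's definition.

```python
import math


def split_score(num_nodes, num_samples):
    """Hypothetical split score in (0, 1]; equals 1 when num_nodes == sqrt(num_samples)."""
    if num_nodes < 1 or num_samples < 1:
        raise ValueError("num_nodes and num_samples must be positive")
    target = math.sqrt(num_samples)  # the sqrt(n) heuristic from the text
    # Ratio of smaller to larger, so too few and too many nodes are both penalized.
    return min(num_nodes, target) / max(num_nodes, target)


# With 10,000 samples the heuristic target is 100 nodes.
print(split_score(100, 10_000))  # 1.0
print(split_score(50, 10_000))   # 0.5
print(split_score(200, 10_000))  # 0.5
```

Under this assumed form, halving or doubling the node count relative to √n gives the same penalty. A real implementation might weight the two directions differently.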

Consistency

Consistency reflects inter-rater reliability. Two evaluation methods are used: (1) agreement between human annotators, measured by Cohen's Kappa, and (2) the training accuracy of a machine-learning model trained on single-annotator data. A more consistent taxonomy should yield higher model training accuracy.
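For the first method, Cohen's Kappa compares the observed agreement between two annotators with the agreement expected by chance from each annotator's label frequencies. A minimal pure-Python sketch, assuming two equal-length lists of labels for the same items; the function name and the sample labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)
    # Observed agreement: share of items where both raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Made-up annotations for six support tickets.
annotator_1 = ["Cancel", "Cancel", "Pay", "Pay", "List", "List"]
annotator_2 = ["Cancel", "Pay", "Pay", "Pay", "List", "Cancel"]
print(round(cohens_kappa(annotator_1, annotator_2), 3))  # 0.5
```

Here the raters agree on 4 of 6 items (0.667) while chance agreement is 0.333, so Kappa is 0.5. Raw agreement alone would overstate how reliable the labels are.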

Experiments

The authors compared the two consistency evaluation methods. Results (Table 1) show similar accuracy and confusion patterns for both approaches, indicating that a clear taxonomy improves both human agreement and model performance.
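One way to compare "accuracy and confusion patterns" between the two approaches is to tally which label pairs get mixed up by a human annotator and by a model against the same reference labels, then look at the overlap. The sketch below assumes such reference labels exist. All names and data are made up for illustration and do not reproduce the article's Table 1.

```python
from collections import Counter


def accuracy(reference, predicted):
    """Share of items where the predicted label matches the reference label."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("need two non-empty label lists of equal length")
    return sum(r == p for r, p in zip(reference, predicted)) / len(reference)


def confusion_pairs(reference, predicted):
    """Count (reference, predicted) pairs where the two labels disagree."""
    if len(reference) != len(predicted):
        raise ValueError("label lists must have equal length")
    return Counter((r, p) for r, p in zip(reference, predicted) if r != p)


# Hypothetical labels for six tickets.
reference = ["Cancel", "Cancel", "Refund", "Refund", "Listing", "Listing"]
human = ["Cancel", "Refund", "Refund", "Cancel", "Listing", "Listing"]
model = ["Cancel", "Refund", "Refund", "Refund", "Listing", "Cancel"]

human_conf = confusion_pairs(reference, human)
model_conf = confusion_pairs(reference, model)

# Confusions that both the human annotator and the model make.
shared = sorted(set(human_conf) & set(model_conf))
print(shared)  # [('Cancel', 'Refund')]
```

Pairs that appear in both tallies point at taxonomy nodes whose definitions overlap, which is where a revision is most likely to help both annotators and the model.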

Impact of T-LEAF on the Contact-Reason Taxonomy

The new taxonomy contains about 200 nodes in a three-level hierarchy (L1 to L3). Using T-LEAF during development accelerated feedback loops and enabled quantitative quality control before production rollout.

Production Results

After deployment, the “Other” label rate dropped from 5.8% to 1.45% (a 5.3% improvement in coverage). New fine-grained nodes allowed the chatbot to guide users to specific cancellation workflows, increasing self-service resolution. The ML model built on the new taxonomy achieved 9% higher prediction accuracy than the model built on the old taxonomy.

Conclusion

T-LEAF provides a quantitative framework that speeds taxonomy iteration, reduces release risk, and benefits all stakeholders (guests, hosts, support agents, and business teams). It evaluates coverage, usefulness, and consistency, and has been successfully applied in Airbnb's production environment.

References

[1] Szopinski et al., 2019. “Because Your Taxonomy is Worth IT…”
[2] Carlis & Bruso, 2012. “RSQRT: An Heuristic for Estimating the Number of Clusters to Report.”
[3] Airbnb Engineering, Intelligent Automation Platform.
[4] Airbnb Engineering, Task-Oriented Conversational AI in Airbnb Customer Support.
