An Adaptive Framework for Confidence-Constraint Rule Set Learning in Large Datasets
The paper introduces a Constraint-Adaptive Rule-Set Learning (CRSL) framework that combines a constraint-aware decision-tree miner (CARM), a rule-sorting filter, and a constraint-adaptive Bayesian rule-subset selector (CBRS). The framework achieves strong performance and interpretability on both benchmark datasets and massive industrial fraud-detection data, and has been deployed in Alipay's risk-analysis platform.
In response to the growing need for interpretable and fair machine‑learning models in industry, the authors propose a Constraint‑Adaptive Rule‑Set Learning (CRSL) framework that jointly optimizes confidence, coverage, interpretability, and performance under business constraints.
The overall architecture consists of three modules: a rule miner (CARM) that builds constraint‑aware decision trees, a rule‑sorting component that filters candidates using weighted relative accuracy, redundancy, and stability metrics, and a rule‑subset selection module that employs a constraint‑adaptive Bayesian predictive model (CBRS) with a custom likelihood and penalty to satisfy confidence constraints.
CARM extends traditional decision‑tree splitting by incorporating both purity and a constraint‑adaptation fitness term, allowing the tree to respect business‑specified thresholds such as minimum accuracy or loss rates.
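The exact splitting criterion is not spelled out above, but the idea of mixing purity with a constraint-adaptation fitness term can be sketched as follows. This is a minimal illustration, assuming Gini gain as the purity term and a fitness term that rewards child nodes already meeting a business precision threshold; `alpha` and `min_precision` are illustrative parameters, not the paper's.

```python
def gini(pos, neg):
    """Gini impurity of a node with pos positive and neg negative examples."""
    n = pos + neg
    if n == 0:
        return 0.0
    p = pos / n
    return 2 * p * (1 - p)

def split_score(left, right, min_precision=0.9, alpha=0.5):
    """Score a candidate split as a weighted sum of purity gain and a
    constraint-adaptation fitness term (illustrative formulation).
    left/right are (pos, neg) counts of the two children."""
    n = sum(left) + sum(right)
    parent = (left[0] + right[0], left[1] + right[1])
    # standard impurity reduction (purity term)
    gain = (gini(*parent)
            - (sum(left) / n) * gini(*left)
            - (sum(right) / n) * gini(*right))
    # fitness term: fraction of children whose precision already
    # satisfies the business-specified threshold
    def precision(c):
        return c[0] / (c[0] + c[1]) if (c[0] + c[1]) else 0.0
    fitness = sum(precision(c) >= min_precision for c in (left, right)) / 2
    return (1 - alpha) * gain + alpha * fitness
```

Under this scoring, a split that isolates a high-precision child can outrank a split with slightly better purity alone, which is the behavior the constraint-aware miner is after.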
The rule‑sorting stage uses three filters: a performance filter based on WRAcc, a redundancy filter based on average redundancy scores, and a stability filter that compares rule performance on training and validation sets.
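The three filters can be made concrete with small helpers. WRAcc (weighted relative accuracy) is the standard coverage-times-lift measure; the redundancy and stability definitions below are plausible readings of "average redundancy score" and "train/validation comparison", with thresholds chosen for illustration.

```python
def wracc(tp, fp, P, N):
    """Weighted relative accuracy: coverage x (rule precision - base rate).
    tp/fp: positives/negatives covered by the rule; P/N: dataset totals."""
    cov = tp + fp
    if cov == 0:
        return 0.0
    return (cov / (P + N)) * (tp / cov - P / (P + N))

def avg_redundancy(rule_cover, other_covers):
    """Mean Jaccard overlap between one rule's covered-example ids and
    those of the other candidate rules (one way to score redundancy)."""
    if not other_covers:
        return 0.0
    return sum(len(rule_cover & c) / len(rule_cover | c)
               for c in other_covers) / len(other_covers)

def is_stable(train_wracc, valid_wracc, tol=0.02):
    """Stability filter: rule performance on training and validation
    sets must agree within a tolerance (tol is illustrative)."""
    return abs(train_wracc - valid_wracc) <= tol
```

A candidate rule would pass the sorting stage only if its WRAcc exceeds a floor, its average redundancy stays below a ceiling, and it is stable across splits.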
For rule combination, the Bayesian model defines priors over the number of rules, rule length, and rule size, favoring shorter rule sets for better interpretability. A constraint‑adaptive likelihood combines weighted likelihood with a penalty term that nullifies rule sets violating confidence constraints.
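The shape of this objective can be sketched as an unnormalized log posterior. The priors, weights, and the exact likelihood form below are assumptions made for illustration: set size and total rule length are penalized linearly in log space, positives get a heavier likelihood weight, and the confidence-constraint penalty sends violating rule sets to log probability negative infinity ("nullifies" them).

```python
import math

def log_posterior(rule_set, preds, labels, min_precision=0.9,
                  lam_rules=1.0, lam_len=0.1, w_pos=5.0):
    """Unnormalized log posterior of a candidate rule set (illustrative).
    rule_set: list of rules, each a list of conditions.
    preds: per-example predictions of the combined rule set."""
    # prior: favor fewer and shorter rules for interpretability
    log_prior = (-lam_rules * len(rule_set)
                 - lam_len * sum(len(r) for r in rule_set))
    # weighted likelihood: errors on positives cost more
    eps = 1e-9
    log_lik = 0.0
    for p, y in zip(preds, labels):
        prob = 1.0 - eps if p == y else eps
        log_lik += (w_pos if y == 1 else 1.0) * math.log(prob)
    # constraint penalty: nullify sets violating the confidence constraint
    tp = sum(1 for p, y in zip(preds, labels) if p and y)
    fp = sum(1 for p, y in zip(preds, labels) if p and not y)
    prec = tp / (tp + fp) if (tp + fp) else 0.0
    if prec < min_precision:
        return float("-inf")
    return log_prior + log_lik
```

Because a violating set scores negative infinity, any downstream search over rule subsets can never prefer it to a feasible one.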
Parameter estimation is performed via MAP inference using a Markov‑Chain Monte Carlo (MCMC) search that iteratively adds, deletes, or swaps rules, guided by each rule’s contribution to the posterior.
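The add/delete/swap search can be sketched as a simple Metropolis-style loop. This simplified version uses uniform random proposals rather than proposals guided by each rule's posterior contribution, and ignores proposal asymmetries; it tracks the best-scoring (MAP) subset seen.

```python
import math
import random

def mcmc_search(candidates, score, n_iters=2000, seed=0):
    """MAP search over rule subsets via add/delete/swap moves.
    candidates: pool of mined rules (assumed distinct);
    score: maps a tuple of rules to an unnormalized log posterior."""
    rng = random.Random(seed)
    current = ()
    cur_score = score(current)
    best, best_score = current, cur_score
    for _ in range(n_iters):
        move = rng.choice(["add", "delete", "swap"])
        proposal = list(current)
        pool = [r for r in candidates if r not in proposal]
        if move == "add" and pool:
            proposal.append(rng.choice(pool))
        elif move == "delete" and proposal:
            proposal.pop(rng.randrange(len(proposal)))
        elif move == "swap" and proposal and pool:
            proposal[rng.randrange(len(proposal))] = rng.choice(pool)
        proposal = tuple(proposal)
        new_score = score(proposal)
        # Metropolis acceptance on the log-posterior ratio; exp() is only
        # evaluated when new_score < cur_score, so it never overflows
        if new_score >= cur_score or rng.random() < math.exp(new_score - cur_score):
            current, cur_score = proposal, new_score
            if cur_score > best_score:
                best, best_score = current, cur_score
    return best, best_score
```

With a toy score that rewards matching a target subset, the loop reliably recovers the optimum in a few hundred iterations, since better moves are always accepted and worse ones occasionally, letting the chain escape local optima.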
The framework has been deployed in Alipay’s fraud‑decision center, where it processes millions of transactions, supports over 5,000 rule‑learning tasks, and improves coverage by 3%–15% across more than 60 risk‑analysis scenarios.