
An Adaptive Framework for Confidence-Constraint Rule Set Learning in Large Datasets

The paper introduces a Constraint-Adaptive Rule-Set Learning (CRSL) framework that combines a constraint-aware decision-tree rule miner (CARM), a rule-sorting filter, and a Bayesian rule-combination selector (CBRS). The framework achieves strong performance and interpretability on both benchmark and large-scale industrial fraud-detection data, and has been deployed in Alipay's risk-analysis platform.

AntTech

In response to the growing need for interpretable and fair machine‑learning models in industry, the authors propose a Constraint‑Adaptive Rule‑Set Learning (CRSL) framework that jointly optimizes confidence, coverage, interpretability, and performance under business constraints.

The overall architecture consists of three modules: a rule miner (CARM) that builds constraint‑aware decision trees, a rule‑sorting component that filters candidates using weighted relative accuracy, redundancy, and stability metrics, and a rule‑subset selection module that employs a constraint‑adaptive Bayesian predictive model (CBRS) with a custom likelihood and penalty to satisfy confidence constraints.
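To make the data flow concrete, the three stages can be sketched as a small pipeline. All function bodies below are illustrative stand-ins (the paper's actual modules are far richer); only the composition of miner, sorter, and selector reflects the described architecture.

```python
from typing import Callable, List

# A rule maps a transaction record to hit/miss.
Rule = Callable[[dict], bool]

def mine_rules(records: List[dict]) -> List[Rule]:
    """Stage 1 (CARM stand-in): derive threshold rules from the data."""
    thresholds = sorted({r["amount"] for r in records})
    return [lambda rec, t=t: rec["amount"] >= t for t in thresholds]

def sort_rules(rules: List[Rule], records: List[dict],
               labels: List[int]) -> List[Rule]:
    """Stage 2 stand-in: keep rules whose precision beats the base rate."""
    base = sum(labels) / len(labels)
    def precision(rule: Rule) -> float:
        hits = [y for rec, y in zip(records, labels) if rule(rec)]
        return sum(hits) / len(hits) if hits else 0.0
    return [r for r in rules if precision(r) > base]

def select_subset(rules: List[Rule], k: int = 2) -> List[Rule]:
    """Stage 3 (CBRS stand-in): pick a small subset (here: first k)."""
    return rules[:k]

def crsl_pipeline(records: List[dict], labels: List[int]) -> List[Rule]:
    """Compose the three stages: mine -> sort/filter -> select."""
    return select_subset(sort_rules(mine_rules(records), records, labels))
```

The point of the sketch is the interface between stages: each stage consumes and emits plain rule objects, so the miner, filters, and selector can be improved independently.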

CARM extends traditional decision‑tree splitting by incorporating both purity and a constraint‑adaptation fitness term, allowing the tree to respect business‑specified thresholds such as minimum accuracy or loss rates.
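A minimal sketch of such a split criterion is shown below, assuming Gini impurity for the purity term and a simple precision-based fitness term; the exact fitness function and the `alpha` trade-off weight are illustrative assumptions, not the paper's formulas.

```python
import numpy as np

def gini(y: np.ndarray) -> float:
    """Gini impurity of a binary label array."""
    if len(y) == 0:
        return 0.0
    p = float(np.mean(y))
    return 2.0 * p * (1.0 - p)

def split_score(y_left: np.ndarray, y_right: np.ndarray,
                min_precision: float = 0.9, alpha: float = 0.5) -> float:
    """Score a candidate split as purity gain + alpha * constraint fitness.

    The fitness term rewards splits whose best child approaches the
    business-specified precision threshold `min_precision` (assumed form).
    """
    n = len(y_left) + len(y_right)
    parent = np.concatenate([y_left, y_right])
    purity_gain = gini(parent) \
        - (len(y_left) / n) * gini(y_left) \
        - (len(y_right) / n) * gini(y_right)
    # Constraint fitness: how close the purer child's precision is
    # to the required threshold, capped at 1.
    prec = max(float(np.mean(y_left)) if len(y_left) else 0.0,
               float(np.mean(y_right)) if len(y_right) else 0.0)
    fitness = min(prec / min_precision, 1.0)
    return purity_gain + alpha * fitness
```

Under this scoring, a split that separates classes cleanly and produces a child meeting the precision constraint outranks one that only improves purity.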

The rule‑sorting stage uses three filters: a performance filter based on WRAcc, a redundancy filter based on average redundancy scores, and a stability filter that compares rule performance on training and validation sets.
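The three filters can be sketched as scoring functions over boolean coverage masks. WRAcc is the standard definition; the redundancy score here uses Jaccard overlap and the stability score uses the train/validation WRAcc gap, both plausible but assumed instantiations of the paper's metrics.

```python
import numpy as np

def wracc(covered: np.ndarray, y: np.ndarray) -> float:
    """Weighted relative accuracy: coverage * (rule precision - base rate)."""
    cov = float(covered.mean())
    if cov == 0.0:
        return 0.0
    return cov * (float(y[covered].mean()) - float(y.mean()))

def redundancy(covered: np.ndarray, others: list) -> float:
    """Average Jaccard overlap with the other candidates' coverage masks."""
    if not others:
        return 0.0
    overlaps = [np.logical_and(covered, o).sum()
                / max(np.logical_or(covered, o).sum(), 1)
                for o in others]
    return float(np.mean(overlaps))

def stability(covered_tr: np.ndarray, y_tr: np.ndarray,
              covered_va: np.ndarray, y_va: np.ndarray) -> float:
    """Gap between training and validation WRAcc (smaller = more stable)."""
    return abs(wracc(covered_tr, y_tr) - wracc(covered_va, y_va))
```

A candidate survives the stage only if its WRAcc is high, its redundancy against already-kept rules is low, and its train/validation gap is small.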

For rule combination, the Bayesian model defines priors over the number of rules, rule length, and rule size, favoring shorter rule sets for better interpretability. A constraint‑adaptive likelihood combines weighted likelihood with a penalty term that nullifies rule sets violating confidence constraints.
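The shape of such a posterior can be sketched as follows. The exponential-style prior penalties, the weighted Bernoulli likelihood, and all hyperparameter values are assumptions for illustration; only the structure — prior favoring small rule sets, weighted likelihood, and a penalty that nullifies constraint-violating sets — follows the description above.

```python
import math

def log_posterior(num_rules: int, avg_rule_len: float, tp: int, fp: int,
                  min_precision: float = 0.9, lam_rules: float = 0.5,
                  lam_len: float = 0.3, w_pos: float = 2.0) -> float:
    """Un-normalized log posterior of a candidate rule set (assumed form).

    Prior: linear penalties on rule count and average rule length,
    favoring short, interpretable rule sets.
    Likelihood: Bernoulli over covered instances, positives up-weighted.
    Penalty: precision below the business threshold -> -inf (zero mass).
    """
    precision = tp / max(tp + fp, 1)
    if precision < min_precision:
        return float("-inf")  # constraint violated: rule set nullified
    eps = 1e-12
    log_prior = -lam_rules * num_rules - lam_len * avg_rule_len
    log_lik = (w_pos * tp * math.log(precision + eps)
               + fp * math.log(1.0 - precision + eps))
    return log_prior + log_lik
```

With identical confusion counts, the posterior strictly prefers the smaller rule set, which is exactly the interpretability pressure the prior is meant to exert.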

Parameter estimation is performed via MAP inference using a Markov‑Chain Monte Carlo (MCMC) search that iteratively adds, deletes, or swaps rules, guided by each rule’s contribution to the posterior.
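A minimal sketch of that search loop, assuming Metropolis-style acceptance over subsets of candidate rule ids; `score_fn` stands in for the rule set's log posterior, and the move set (add, delete, swap) mirrors the description above.

```python
import math
import random

def mcmc_search(candidates: list, score_fn, n_iter: int = 2000,
                seed: int = 0):
    """MAP search over rule subsets via add/delete/swap MCMC moves.

    `score_fn(frozenset)` returns an un-normalized log posterior; a
    proposal is accepted with probability min(1, exp(new - current)).
    """
    rng = random.Random(seed)
    current = frozenset()
    cur_score = score_fn(current)
    best, best_score = current, cur_score
    for _ in range(n_iter):
        s = set(current)
        outside = [r for r in candidates if r not in s]
        move = rng.choice(["add", "delete", "swap"])
        if move == "add" and outside:
            s.add(rng.choice(outside))
        elif move == "delete" and s:
            s.remove(rng.choice(sorted(s)))
        elif move == "swap" and s and outside:
            s.remove(rng.choice(sorted(s)))
            s.add(rng.choice(outside))
        proposal = frozenset(s)
        new_score = score_fn(proposal)
        # Metropolis acceptance: always take improvements, sometimes worse.
        if math.log(rng.random() + 1e-300) < new_score - cur_score:
            current, cur_score = proposal, new_score
            if cur_score > best_score:
                best, best_score = current, cur_score
    return best, best_score
```

Tracking the best subset seen (rather than only the final state) is what turns the sampler into a MAP estimator.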

The framework has been deployed in Alipay’s fraud‑decision center, where it processes millions of transactions, supports over 5,000 rule‑learning tasks, and improves coverage by 3%–15% across more than 60 risk‑analysis scenarios.

decision trees · machine learning · fraud detection · Bayesian methods · constraint optimization · rule learning
Written by AntTech

Technology is the core driver of Ant's future creation.