Dual‑Phase RL‑LLM Framework DARA for Few‑Shot Online Advertising Budget Allocation
The DARA framework splits online advertising budget allocation into two stages: a few‑shot LLM reasoning stage and a fine‑grained optimizer stage. It is enhanced by GRPO‑Adaptive, a dynamically updated RL fine‑tuning algorithm, and achieves significantly lower ROI variance than traditional baselines in both real and simulated environments.
