
Overview of Haro Intelligent Customer Service: Algorithms, Challenges, and AI Solutions

Haro’s intelligent customer service combines a smart FAQ recommender with a conversational chatbot. The stack leverages matching-based intent recognition, large-scale domain pre-training, metric learning for new-intent discovery, and fine-tuned generative LLMs, achieving 82.21% top-1 accuracy while reducing human workload; the article closes with future API-orchestrated, multimodal AI enhancements.

HelloTech

Overall Introduction

Haro’s intelligent customer service platform provides users with smart FAQ recommendations and a conversational chatbot. Standard questions are curated by business experts, while unresolved issues are handled through an IM channel that matches user intents against a knowledge base.

User and Algorithm Pain Points

The current challenges include: (1) time‑consuming knowledge‑base updates; (2) difficulty transferring models across business domains; (3) complex multimodal data fusion; (4) long‑range dependencies in multi‑turn task‑oriented dialogs.

User Journey in Haro’s Intelligent Service

When a user accesses the service, the system predicts the likely order‑related query and offers “you may ask” suggestions and self‑service options. If these do not resolve the issue, the user is routed to a chatbot that performs query completion, precise matching, classification‑based and matching‑based intent recognition, and heuristic QA. When the bot fails, NLP‑assisted human agents take over, supported by intelligent routing, real‑time solution recommendation, and dialogue‑script suggestions. A crowdsourced cloud‑service system handles about 70% of the workload.
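The tiered flow above (FAQ suggestions, then chatbot, then human escalation) can be sketched as a fall‑through pipeline. This is a minimal sketch: the FAQ entries, canned answers, and function names are invented placeholders, not Haro's actual system.

```python
# Hypothetical sketch of the tiered service flow: each stage tries to
# resolve the query; unresolved queries fall through to the next tier.

def faq_recommender(query: str):
    # Tier 1: "you may ask" suggestions, keyed on simple keyword hits here.
    faq = {"refund": "You can request a refund from the order page."}
    for keyword, answer in faq.items():
        if keyword in query.lower():
            return answer
    return None

def chatbot(query: str):
    # Tier 2: placeholder for query completion + matching + heuristic QA.
    if "order" in query.lower():
        return "Your order is on the way."
    return None

def route(query: str) -> str:
    for stage in (faq_recommender, chatbot):
        answer = stage(query)
        if answer is not None:
            return answer
    return "ESCALATE_TO_HUMAN"  # Tier 3: NLP-assisted human agent
```

Each tier only sees queries the previous tier could not answer, which is what lets the crowdsourced back end absorb most of the residual workload.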

Case Study: Intent Recognition – Matching vs. Classification Models

Classification models suffer from poor adaptability to knowledge‑base changes, high maintenance cost, and lack of training data for new standard questions. Matching models respond quickly to knowledge‑base updates, can recognize intents without additional training data, and are easier to migrate across businesses. After extensive experiments, the matching model achieved a top‑1 accuracy of 82.21% on a specific business line, surpassing the previous classification baseline.
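The matching approach can be sketched in a few lines. A bag‑of‑words encoder stands in for the real pre‑trained sentence encoder (an assumption, so the example stays self‑contained); the key property carries over: adding a new standard question only means adding a dictionary entry, with no retraining.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in encoder: token counts instead of a pre-trained sentence model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top1_intent(query: str, knowledge_base: dict) -> str:
    # Match the query against every standard question; the highest-scoring
    # intent is returned as the top-1 prediction.
    q = embed(query)
    return max(knowledge_base,
               key=lambda intent: cosine(q, embed(knowledge_base[intent])))

# Invented example knowledge base (standard questions curated by experts).
kb = {
    "refund": "how do I get a refund for my order",
    "delivery": "when will my order be delivered",
}
```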

Optimization Measures that Boost Accuracy

Contrastive loss provided modest gains.

Large‑scale domain pre‑training, high‑quality data augmentation, and input masking were most effective.

Increasing sentence length and temperature helped slightly; sampling strategies had limited impact.
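The contrastive objective and its temperature knob can be sketched as a minimal InfoNCE‑style loss in plain Python. The vectors here are toy stand‑ins; the production loss operates on encoder embeddings with in‑batch negatives, which is an assumption not spelled out in the article.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def info_nce(anchor, positive, negatives, temperature=0.05):
    # Lower temperature sharpens the softmax contrast between the positive
    # and the negatives, which is the knob referenced above.
    logits = [dot(anchor, positive) / temperature] + [
        dot(anchor, n) / temperature for n in negatives
    ]
    m = max(logits)  # log-sum-exp with max subtraction for stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_z)  # cross-entropy with positive at index 0
```

When the anchor is close to its positive and far from the negatives, the loss approaches zero; mismatched pairs are penalized roughly in proportion to the similarity gap divided by the temperature.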

Successes vs. Failed Attempts

Successes: surpassing the online fastText classifier (top‑1 82.21% > 80.62%); high QPS intent recognition; pre‑training and data quality as primary drivers; incorporation of CV tricks.

Failed attempts: various loss functions (triplet, BPR), hyper‑parameter tuning, alternative architectures (CNN, ALBERT, SentBERT, ESIM), and different masking or concatenation strategies.

Next Steps and Insights

Hard negative examples (close negatives) need to be pushed farther apart. Similar samples across different intents are scarce, making discrimination difficult. Inspired by SimCSE, dropout can be used to generate more hard negatives.
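The SimCSE‑inspired idea above can be sketched as follows. In real SimCSE the dropout noise lives inside the transformer encoder; here a post‑hoc dropout over a pre‑computed embedding stands in for that (an assumption), so two noisy views of the same sentence form a positive pair while other sentences in the batch supply negatives.

```python
import random

def encode_with_dropout(embedding, p=0.1, rng=None):
    # Randomly zero each dimension with probability p, rescaling survivors,
    # mimicking dropout noise at encoding time.
    rng = rng or random.Random()
    return [0.0 if rng.random() < p else x / (1 - p) for x in embedding]

def simcse_pairs(batch_embeddings, seed=0):
    # Two dropout-perturbed views of each sentence become a positive pair;
    # views of the other sentences act as (hard) negatives.
    rng = random.Random(seed)
    return [
        (encode_with_dropout(e, rng=rng), encode_with_dropout(e, rng=rng))
        for e in batch_embeddings
    ]
```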

Case Study: Metric Learning for New Intent Discovery

To detect emerging user intents, a decision‑boundary approach treats the known intents as k classes and everything else as a (k+1)th “unknown” class. Adaptive boundary radii are learned, and triplet loss with hardest positive/negative sampling improves representation isotropy. Experiments on public datasets (Snips, Banking, OOS) with varying proportions of known and unknown intents showed the proposed method achieving the highest overall accuracy and F1 on unknown intents.
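A toy version of the decision‑boundary idea follows, with one loud caveat: the actual method learns adaptive radii via a boundary loss, whereas this sketch uses a max‑distance‑plus‑margin heuristic, and the class names and embeddings are invented.

```python
import math

def centroid(points):
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]

def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

class OpenIntentDetector:
    def __init__(self, margin=0.1):
        self.margin = margin
        self.centers = {}
        self.radii = {}

    def fit(self, labeled_points):
        # labeled_points: {intent: [embedding, ...]} for the k known intents.
        # Heuristic radius: farthest training point from the centroid + margin
        # (the paper's adaptive radii are learned instead).
        for intent, pts in labeled_points.items():
            c = centroid(pts)
            self.centers[intent] = c
            self.radii[intent] = max(distance(p, c) for p in pts) + self.margin

    def predict(self, point):
        # Inside some class ball -> that intent; outside all balls -> the
        # (k+1)th "unknown" class, i.e. a newly emerging intent.
        best, best_d = "unknown", float("inf")
        for intent, c in self.centers.items():
            d = distance(point, c)
            if d <= self.radii[intent] and d < best_d:
                best, best_d = intent, d
        return best
```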

Case Study: Generative Models for NLP Tasks

Generative LLMs (e.g., ChatGLM‑6B) are fine‑tuned on Haro’s domain knowledge to assist human agents. An evaluation of open‑source base models highlighted ChatGLM‑6B as a good trade‑off between size, latency (≈2 s on an A100), and Chinese‑language support.

Business‑Level Optimizations

Prompt engineering improved entity recognition but reduced instruction compliance.

Integrating GPT‑4 style Chinese prompts and P‑tuning raised compliance at the cost of latency.

Domain‑specific fine‑tuning boosted entity and matching accuracy, though occasional hallucinations persisted.

Increasing high‑quality training data dramatically improved matching accuracy and response controllability.
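As a rough illustration of the prompt‑engineering step above, a hypothetical prompt builder might assemble the instruction, a few in‑domain examples, and the agent's current message into one model input. The template, field names, and example entities are all invented, not Haro's actual prompts.

```python
def build_entity_prompt(query, examples):
    # Assemble instruction + few-shot examples + current query into one prompt.
    lines = [
        "Extract the order-related entities from the user message.",
        "Answer with entity=value pairs only.",
    ]
    for msg, entities in examples:
        lines.append(f"Message: {msg}")
        lines.append("Entities: " + "; ".join(f"{k}={v}" for k, v in entities.items()))
    lines.append(f"Message: {query}")
    lines.append("Entities:")  # the model completes from here
    return "\n".join(lines)
```

Constraining the output format in the instruction is one lever for the compliance-versus-coverage trade‑off noted above; fine‑tuning and P‑tuning are the heavier levers.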

Future Outlook

Knowledge‑base‑driven intent recognition is mature; further domain‑specific fine‑tuning will reduce human workload. Generative large models cannot yet fully replace human solutions because of multimodal complexity (text, images, order status, user profiles, geo‑trajectories, click behavior, coupons, etc.). TaskMatrix‑style API orchestration is a promising direction, but reliable LLM‑driven API selection remains challenging.
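The orchestration idea reduces to an LLM choosing which registered API to call for a request. In this sketch a keyword‑overlap function stands in for the LLM ranker, and the API names and return strings are invented; the hard part flagged above is exactly making the real selector reliable.

```python
# Invented API registry; in a real system these would wrap business services.
API_REGISTRY = {
    "get_order_status": lambda order_id: f"order {order_id}: shipped",
    "issue_coupon": lambda user_id: f"coupon sent to {user_id}",
}

def mock_llm_rank(request, api_names):
    # Stand-in for LLM-driven API selection: naive keyword overlap between
    # the request and each API name's tokens.
    def score(name):
        return sum(tok in request.lower() for tok in name.split("_"))
    return max(api_names, key=score)

def orchestrate(request, argument):
    # Select the best-matching API for the request, then invoke it.
    name = mock_llm_rank(request, list(API_REGISTRY))
    return API_REGISTRY[name](argument)
```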

Tags: AI, customer service, large language model, NLP, intent recognition, metric learning
Written by HelloTech

Official Hello technology account, sharing tech insights and developments.