Tagged articles
689 articles
DataFunSummit
Jul 25, 2021 · Artificial Intelligence

Advances in Query Understanding and Semantic Retrieval at Zhihu Search

This article details Zhihu Search's engineering solutions for long‑tail query challenges, covering historical development, term weighting, synonym expansion, query rewriting with reinforcement learning, and semantic recall using BERT‑based models, while also outlining future research directions such as GAN‑based rewriting and lightweight pre‑training.

BERT · Embedding Retrieval · Query Rewriting
0 likes · 14 min read
DataFunTalk
Jun 15, 2021 · Artificial Intelligence

Personalized Approximate Pareto-Efficient Recommendation (PAPERec): A Multi‑Objective Reinforcement Learning Framework for User‑Level Objective Personalization

The paper introduces PAPERec, a personalized multi‑objective recommendation framework that leverages Pareto‑oriented reinforcement learning to generate user‑specific objective weights, enabling the model to approximate Pareto‑optimal solutions and achieve superior click‑through rate and dwell‑time performance in both offline and online experiments.

CTR · Pareto efficiency · Recommendation Systems
0 likes · 12 min read
Alimama Tech
Jun 10, 2021 · Artificial Intelligence

Overview of Recent Alimama Research Papers Presented at KDD 2021 on Advertising and AI

At KDD 2021, Alimama presented six papers that introduced a unified constrained‑bidding solution, a deep‑learnable auction mechanism, real‑negative training for delayed‑feedback CVR, a contextual‑bandit advertising strategy recommender, a multi‑agent cooperative bidding game, and an uncertainty‑aware exploration model, all of which have been deployed to boost platform revenue and advertiser performance.

Alibaba · Auction Mechanisms · KDD
0 likes · 16 min read
Laiye Technology Team
Jun 8, 2021 · Artificial Intelligence

Modelling Hierarchical Structure between Dialogue Policy and Natural Language Generator with Option Framework for Task-oriented Dialogue Systems

This paper presents a hierarchical reinforcement learning approach that jointly trains dialogue policy and natural language generation modules for task-oriented dialogue systems, achieving state‑of‑the‑art performance on MultiWOZ 2.0 and 2.1 while preserving response fluency.

MultiWOZ · dialogue policy · hierarchical RL
0 likes · 10 min read
DataFunTalk
Apr 24, 2021 · Artificial Intelligence

Intelligent Advertising Delivery System and Techniques: From Budget‑Constrained Bidding to Multi‑Channel Optimization

This article systematically introduces Alibaba's advertising intelligence platform, covering the evolution from basic CPM/CPC models to advanced OCPC/OCPM, budget‑constrained bidding, multi‑constraint bidding, sequence‑based long‑term value bidding, multi‑channel allocation, and the AI‑driven Smart Bidding product, highlighting algorithmic foundations, practical implementations, and performance gains.

Advertising · Machine Learning · Multi‑Channel
0 likes · 32 min read
DataFunSummit
Mar 25, 2021 · Artificial Intelligence

An Overview of Reinforcement Learning: Concepts, Applications, Challenges, and Future Prospects

Reinforcement learning, a branch of artificial intelligence, is explained through its core concepts, successful case studies such as AlphaGo and AlphaStar, practical application workflows, current challenges, resources, and future outlook, offering a comprehensive guide for researchers and practitioners.

Applications · Artificial Intelligence · Policy Optimization
0 likes · 56 min read
DataFunTalk
Mar 9, 2021 · Artificial Intelligence

Introduction to Common Machine Learning Algorithms with Python Implementations

This article introduces the three main categories of machine learning—supervised, unsupervised, and reinforcement learning—detailing common algorithms such as Linear Regression, Logistic Regression, Naive Bayes, K‑Nearest Neighbors, Decision Trees, Random Forests, SVM, K‑Means, and PCA, and provides concise Python code examples using scikit‑learn for each.

Machine Learning · Python · Unsupervised Learning
0 likes · 18 min read
DataFunTalk
Feb 24, 2021 · Artificial Intelligence

Multi‑Objective Ranking in Kuaishou Short‑Video Recommendation: System Design and Online Results

This article details Kuaishou's multi‑objective ranking pipeline for short‑video recommendation, covering manual score fusion, GBDT ensemble, Learn‑to‑Rank, online auto‑tuning, ensemble sorting, reinforcement‑learning rerank, and on‑device rerank, and reports their impact on DAU, watch time and user interaction.

Kuaishou · Machine Learning · multi-objective ranking
0 likes · 21 min read
Architects' Tech Alliance
Jan 29, 2021 · Artificial Intelligence

Comprehensive Overview of Machine Learning: Types, Industry Chain, and Key Technologies

This article provides a detailed introduction to machine learning, covering its definition, learning modes such as supervised, unsupervised and reinforcement learning, shallow versus deep learning, the full industry chain from AI chips to cloud and big‑data services, and the major open‑source frameworks and platforms driving the field.

AI chips · Big Data · Machine Learning
0 likes · 11 min read
Programmer DD
Jan 3, 2021 · Artificial Intelligence

How Self‑Play and GAIL Powered the WeKick AI to Win the First Google Football Kaggle Championship

After a nostalgic gaming session, the author recounts how Tencent’s upgraded AI, WeKick, leveraged self‑play reinforcement learning, GAIL‑based adversarial simulation, and a multi‑style League framework to dominate the inaugural Google Football Kaggle competition, illustrating the escalating complexity of multi‑agent AI in real‑time strategy games.

GAIL · Kaggle competition · Multi-Agent Systems
0 likes · 8 min read
DataFunTalk
Dec 23, 2020 · Artificial Intelligence

Advances in Knowledge Graph Completion: Methods, Challenges, and Future Directions

This article reviews the rapid progress of knowledge graph completion, covering its background, formal problem definition, major technical approaches—including representation learning, path‑based search, reinforcement learning, logical reasoning, and meta‑learning—while discussing their challenges, recent improvements, and promising future research directions.

Completion · Logical Reasoning · Meta Learning
0 likes · 14 min read
JD Cloud Developers
Dec 21, 2020 · Artificial Intelligence

Weekly Tech Highlights: AI Chip, Cloud Forecasts, Docker M1 Preview & More

This week’s developer newsletter spotlights the Chinese Academy of Sciences’ pioneering GNN accelerator chip, IDC’s ten cloud computing predictions for China, the booming IoT market and 5G dominance, Docker’s M1‑compatible desktop preview, a carbon‑nanotube transistor breakthrough, IBM’s FHE initiative, and recent AI research on lifelong learning and reinforcement learning exploration.

Artificial Intelligence · Docker · IoT
0 likes · 7 min read
DataFunTalk
Nov 12, 2020 · Artificial Intelligence

Reinforcement Learning for Recommendation System Mixing: Concepts, Practice, and Evaluation

This article explains how reinforcement learning, with its focus on maximizing long‑term reward, can improve recommendation system mixing by covering basic RL concepts, differences from supervised learning, multi‑armed bandit approaches, practical OpenAI Gym experiments, new AUC metrics, online gains, and advanced model optimizations.

Artificial Intelligence · OpenAI Gym · Q-Learning
0 likes · 10 min read
Didi Tech
Oct 10, 2020 · Artificial Intelligence

Deep Reinforcement Learning for Route Planning in DiDi Ride‑Hailing

DiDi’s route engine, handling over 40 billion daily requests, replaces static graph algorithms with a deep‑reinforcement‑learning system that first learns intersection decisions via behavior‑cloning LSTM models and then refines them through self‑play Q‑learning, using beam‑search decoding to produce globally optimal, low‑deviation routes for ride‑hailing.

AI · Beam Search · Route Planning
0 likes · 12 min read
DataFunTalk
Oct 4, 2020 · Artificial Intelligence

Reinforcement Learning for Product Ranking: Model Design, Experiments, and Online Deployment

This article presents a comprehensive study of using reinforcement learning to improve e‑commerce product ranking, covering the limitations of traditional scoring, the design of context‑aware models, a pointer‑network based sequence generator, various RL algorithms, extensive offline evaluations, and successful online deployment with future research directions.

PPO · deep learning · e-commerce
0 likes · 28 min read
Sohu Tech Products
Sep 16, 2020 · Artificial Intelligence

Open-Domain Dialogue Systems: Current State, Challenges, and Future Directions

This article reviews the latest advances in open-domain dialogue systems, covering classification, end‑to‑end generation challenges, knowledge‑controlled generation, automated evaluation, large‑scale latent‑space models such as PLATO, and outlines future research directions for building more coherent and controllable conversational AI.

Dialogue Systems · evaluation · knowledge grounding
0 likes · 14 min read
MaGe Linux Operations
Sep 9, 2020 · Artificial Intelligence

Master Machine Learning Basics: Concepts, Types, Algorithms & K‑NN Walkthrough

This comprehensive tutorial introduces machine learning fundamentals, its history, differences from traditional programming, key characteristics, and why Python is the preferred language, then explores supervised, unsupervised, and reinforcement learning, popular algorithms, detailed K‑Nearest Neighbors examples for classification and regression, and the essential steps to build and evaluate models.

Machine Learning · Python · Unsupervised Learning
0 likes · 21 min read
DataFunTalk
Aug 15, 2020 · Artificial Intelligence

Dynamic Knapsack Optimization for Multi‑Channel Sequential Advertising Using Long‑Term Value

The article presents a novel multi‑channel sequential advertising framework that models budget‑constrained GMV optimization as a dynamic knapsack problem, introduces a long‑term value‑based RL solution (MSBCB), and validates its superiority through extensive offline and online experiments showing up to 10% ROI improvement.

Advertising · budget optimization · dynamic knapsack
0 likes · 16 min read
Aotu Lab
Jul 22, 2020 · Frontend Development

How Q‑Learning Can Power Smart UI Testing and Scalable Pop‑ups with Puppeteer

This article explains how reinforcement‑learning (Q‑learning) can generate mock interface data for regression testing, how Puppeteer automates UI interactions, and how a DSL‑plus‑runtime approach enables scalable pop‑up components, reducing testing costs in complex e‑commerce interactions.

Frontend Testing · Puppeteer · Q-Learning
0 likes · 8 min read
DataFunTalk
Jul 21, 2020 · Artificial Intelligence

WeChat "Look" Recommendation System: Architecture, Modeling, and Engineering Challenges

This article details the end‑to‑end technical architecture of WeChat's "Look" personalized recommendation service, covering data collection, recall, multi‑stage ranking, various CTR and multi‑objective models, reinforcement‑learning based mixing, diversity optimization, and the engineering hurdles overcome to deploy these solutions at massive scale.

CTR prediction · WeChat AI · deep learning
0 likes · 17 min read
58 Tech
Jul 8, 2020 · Artificial Intelligence

Budget Pacing Techniques and Their Application in 58.com Advertising Platform

This article introduces mainstream budget‑pacing methods for cost‑per‑click online ads, describes the 58.com business scenarios, details the pacing algorithm—including bid modification, probabilistic throttling, and reinforcement‑learning approaches—explains system design with PID control, and presents online experimental results and future directions.

Ad Tech · PID control · budget allocation
0 likes · 14 min read
Taobao Frontend Technology
Jun 30, 2020 · Frontend Development

How Reinforcement Learning Powers Front‑End Testing for Alibaba’s 618 Interactive Game

This article explains how the Taobao front‑end team tackled the complexity of the 618 interactive game by using reinforcement‑learning‑driven intelligent testing, Puppeteer‑based automated regression, and a decoupled UI‑logic architecture for scalable popup production, dramatically improving development efficiency and stability.

Puppeteer · UI logic decoupling · automated testing
0 likes · 10 min read
HomeTech
Jun 10, 2020 · Artificial Intelligence

Exploitation & Exploration Algorithms in Recommender Systems: ε‑Greedy, UCB, and Thompson Sampling Applications

This article introduces recommender systems and the exploitation‑exploration dilemma, explains common E&E algorithms such as ε‑greedy, Upper‑Confidence‑Bound, and Thompson Sampling, and details their practical deployment for interest‑point eviction, selection, and adaptive recall count optimization in an automotive recommendation platform.

Bandit Algorithms · Epsilon-Greedy · Exploitation
0 likes · 10 min read
DataFunTalk
May 15, 2020 · Artificial Intelligence

Optimizing Sparse Feature Embedding for Large‑Scale Recommendation and CTR Prediction

The article reviews recent research on representing massive sparse features in click‑through‑rate (CTR) models, introducing Alibaba's Res‑embedding method and Google's Neural Input Search (NIS) approach, and discusses how these techniques improve embedding efficiency and model generalization in large‑scale recommendation systems.

CTR prediction · Recommendation Systems · deep learning
0 likes · 10 min read
JD Retail Technology
May 13, 2020 · Artificial Intelligence

JD's Two Papers Accepted at IJCAI 2020 and SIGIR 2020: Hierarchical Reinforcement Learning for Multi‑Goal Recommendation and Attention‑Based pCVR Prediction

JD announced that two of its research papers were accepted as full papers at the IJCAI 2020 and SIGIR 2020 conferences: one on a hierarchical reinforcement‑learning framework for multi‑objective recommendation (MaHRL), the other on an attention‑based model for delayed‑feedback conversion‑rate prediction (pCVR).

Artificial Intelligence · Recommendation Systems · conversion rate prediction
0 likes · 6 min read
Alibaba Cloud Developer
May 11, 2020 · Artificial Intelligence

How Reinforcement Learning Revolutionizes E‑commerce Product Ranking

This article details the evolution of AliExpress product ranking from simple DNN scoring to advanced reinforcement‑learning re‑ranking, comparing multiple models, exploring context effects, introducing pointer‑network generators, evaluating various RL algorithms, and reporting significant online gains in conversion and GMV.

e-commerce · online experiments · product ranking
0 likes · 28 min read
Alibaba Cloud Developer
Apr 24, 2020 · Artificial Intelligence

How Reinforcement Learning Can Supercharge New Media Marketing Strategies

This article examines the limitations of traditional new media marketing, explains reinforcement learning fundamentals, and presents a six‑step technical solution—including problem modeling, algorithm selection, action, state, reward design, and model training—that uses RL to optimize budget allocation and achieve over 35% improvement in campaign effectiveness while reducing costs.

AI · budget optimization · digital advertising
0 likes · 20 min read
360 Quality & Efficiency
Apr 17, 2020 · Artificial Intelligence

Extending APEX for Real Distributed Reinforcement Learning with tf2rl

The article examines the limitations of the single‑machine APEX framework in the tf2rl reinforcement‑learning library, proposes a cross‑machine distributed architecture using middleware such as Redis, compares alternative frameworks like EasyRL, and outlines expected performance gains and future development plans.

APEX · Off-Policy · TensorFlow
0 likes · 5 min read
DataFunTalk
Apr 12, 2020 · Artificial Intelligence

Wang Zhe’s Machine Learning Notes – Answers to Frequently Asked Questions on Recommendation Systems

In this article, Wang Zhe addresses fifteen common questions about recommendation systems, covering topics such as building cross‑domain knowledge, the role of deep reinforcement learning, handling sparse or low‑sample data, offline‑online evaluation, knowledge graphs, graph neural networks, model interpretability, large‑scale ID embedding, and career advice for engineers.

Graph Neural Network · Recommendation Systems · deep learning
0 likes · 14 min read
DataFunTalk
Mar 27, 2020 · Artificial Intelligence

Understanding Data Product Layers: Business Value, Data, Algorithms, and Applications

The article explains how data products create business value through application, data, and algorithm layers, using examples like 5G infrared temperature screening and ImageNet, and discusses the roles of experimental design, causal inference, and reinforcement learning in building effective AI‑driven strategies.

Artificial Intelligence · Data Product · business value
0 likes · 8 min read
Alibaba Cloud Developer
Feb 14, 2020 · Artificial Intelligence

How Alibaba’s AI Voice Bots Revolutionized Customer Service During the Pandemic

This article explains how Alibaba leveraged AI‑powered voice robots to handle massive outbound call volumes during COVID‑19, detailing the technology stack, real‑world application scenarios across finance and retail, and the future potential of intelligent voice assistants in customer service.

AI · Customer Service · natural language processing
0 likes · 11 min read
Alibaba Cloud Developer
Feb 7, 2020 · Artificial Intelligence

Tackling Scalability, Data Scarcity, and Training Efficiency in Dialogue Management Models

This article reviews the evolution of dialogue management models from rule‑based systems to deep‑learning approaches, identifies three major challenges—poor scalability, limited annotated data, and low training efficiency—and surveys recent research solutions including semantic matching, knowledge distillation, hierarchical reinforcement learning, model‑based RL, and human‑in‑the‑loop methods.

Conversational AI · data annotation · dialogue management
0 likes · 44 min read
Qunar Tech Salon
Feb 5, 2020 · Operations

Understanding Didi's Ride‑Hailing Dispatch Algorithms: Challenges, Models, and Future Directions

The article explains why Didi needs advanced dispatch algorithms, describes the complexities of order‑driver matching from simple one‑to‑one cases to large‑scale bipartite matching, and introduces batch matching, supply‑demand prediction, chain dispatch, and AI‑driven optimizations that together improve global efficiency and user experience.

AI · Dispatch · Operations Research
0 likes · 16 min read
Top Architect
Jan 16, 2020 · Artificial Intelligence

A Survey of Neural Architecture Search: Search Spaces, Optimization Strategies, and Recent Results

This article surveys neural architecture search, classifying existing methods, describing common search spaces—including global and cell‑based designs—detailing optimization strategies such as reinforcement learning, evolutionary algorithms, surrogate models, one‑shot and differentiable approaches, and highlighting recent results and trends in the field.

Evolutionary Algorithms · Machine Learning · NAS
0 likes · 13 min read
DataFunTalk
Jan 2, 2020 · Artificial Intelligence

Improving Zhihu Search: Query Understanding, Term Weighting, Synonym Expansion, Query Rewriting, and Semantic Retrieval

This article details Zhihu's search engineering advances over the past year, covering long‑tail query challenges, term‑weight calculation, synonym expansion, query rewriting with translation models and reinforcement learning, and semantic retrieval using BERT‑based embeddings, while outlining future research directions.

NLP · Query Rewriting · Search
0 likes · 14 min read
DataFunTalk
Dec 16, 2019 · Artificial Intelligence

A Comprehensive Overview of Sequential Recommendation Models and Techniques

This article provides an in-depth overview of sequential recommendation, defining the problem, discussing data preparation, and reviewing various neural architectures—including MLP, CNN, RNN, Temporal CNN, self‑attention, and reinforcement‑learning approaches—while offering practical guidance on model selection and implementation.

CNN · RNN · Sequential Modeling
0 likes · 36 min read
DataFunTalk
Dec 10, 2019 · Artificial Intelligence

Applying Deep Reinforcement Learning (DQN) to the 2048 Game: Experiments and Insights

This article details a series of reinforcement‑learning experiments on the 2048 game, from random baselines through DQN implementations, classical value‑iteration methods, network redesigns, and Monte‑Carlo tree search, highlighting challenges such as reward design, over‑estimation, and exploration while achieving scores up to 34 000 and tiles of 2048.

2048 · AI · DQN
0 likes · 8 min read
DataFunTalk
Nov 27, 2019 · Artificial Intelligence

Applying Reinforcement Learning and Graph Embedding for Intelligent User Operations in Didi Ride‑Sharing

This article describes how Didi Ride‑Sharing leverages reinforcement learning and graph‑embedding techniques to model and optimize user‑operation marketing, detailing system architecture, algorithm design, experimental ROI improvements, and personalized message delivery for enhanced conversion and cost efficiency.

Didi · ROI · graph embedding
0 likes · 11 min read
AntTech
Oct 30, 2019 · Artificial Intelligence

Financial Graph Machine Learning, AutoML, and Multi‑Agent Reinforcement Learning at Ant Financial

Professor Song Le presented at the Cloudwise Conference how Ant Financial leverages large‑scale graph neural networks, automated machine‑learning platforms, and multi‑agent reinforcement learning to model complex financial networks, improve risk control, and drive diverse fintech applications.

Ant Financial · Large-Scale Graph · graph neural networks
0 likes · 12 min read
DataFunTalk
Oct 25, 2019 · Artificial Intelligence

Advances and Challenges in Human‑Machine Dialogue: Open‑Domain and Task‑Oriented Systems

This article reviews recent progress and open research problems in human‑machine dialogue, covering both open‑domain chat and task‑oriented systems, with focus on reply quality, decoding, retrieval‑augmented generation, controllable and personalized responses, multi‑turn modeling, reinforcement‑learning strategies, low‑resource NLU, and data augmentation techniques.

Dialogue Systems · Response Generation · natural language processing
0 likes · 16 min read
Tencent Cloud Developer
Oct 11, 2019 · Cloud Computing

Large-Scale Distributed Reinforcement Learning Solution Based on TKE

The project replaces cumbersome manual management of thousands of heterogeneous CPU and GPU nodes for large‑scale reinforcement learning with a TKE‑based, containerized actor‑learner architecture that automates batch start/stop, provides elastic autoscaling, fault‑tolerant processes, shared model storage, and CI‑driven image deployment, cutting costs by up to two‑thirds while dramatically speeding experiment cycles.

CI/CD · Cloud Native · Kubernetes
0 likes · 14 min read
DataFunTalk
Sep 30, 2019 · Artificial Intelligence

Reinforcement Learning for Recommender Systems: Challenges, Solutions, and Key Papers

This article reviews recent advances in applying reinforcement learning to recommendation systems, explains the fundamental RL concepts, discusses the specific challenges such as large action spaces, bias, and long‑term reward modeling, and summarizes two influential YouTube papers along with practical insights and future directions.

Off-Policy · Top‑K · long-term reward
0 likes · 13 min read
DataFunTalk
Sep 19, 2019 · Artificial Intelligence

Alibaba Cloud Xiaomai Dialogue System: Architecture, NLU, Dialogue Management, and User Simulator

This article presents Alibaba's Xiaomai intelligent dialogue platform, detailing its general system architecture, three-tier NLU approaches for zero‑, few‑, and many‑shot scenarios, platform‑centric dialogue management with TaskFlow, robustness and continuous learning mechanisms, and a user simulator for large‑scale data generation and dialogue diagnosis.

dialogue system · meta-learning · natural language understanding
0 likes · 13 min read
DataFunTalk
Sep 18, 2019 · Operations

Understanding Didi's Ride‑Hailing Dispatch Algorithm: Challenges, Models, and Strategies

This article explains why modern ride‑hailing platforms need advanced dispatch algorithms, describes the underlying order‑allocation problem, explores simple and complex matching scenarios, and introduces batch matching, supply‑demand prediction, chain dispatch, and AI‑driven techniques used by Didi to improve efficiency and fairness.

Dispatch · Ride Hailing · dynamic VRP
0 likes · 15 min read
Didi Tech
Sep 13, 2019 · Artificial Intelligence

Understanding Didi's Ride‑Hailing Dispatch Algorithms: Challenges and Strategies

Didi’s ride‑hailing dispatch system has progressed from a simple greedy, first‑come‑first‑served matcher to sophisticated batch, chain, and predictive algorithms that use deep‑learning demand forecasts and reinforcement‑learning optimization to assign drivers under complex business rules, boosting response rates and serving over 30 million daily requests.

AI · Optimization · Ride Hailing
0 likes · 17 min read
Alibaba Cloud Developer
Aug 28, 2019 · Artificial Intelligence

Exact‑K Recommendation: Graph Attention Networks and RL from Demonstrations Explained

This article introduces the Exact‑K recommendation problem, highlights its differences from traditional Top‑K approaches, and presents a novel solution combining Graph Attention Networks (GAttN) with Reinforcement Learning from Demonstrations (RLfD), backed by extensive experiments showing superior performance on real-world datasets.

Machine Learning · exact-k · graph attention networks
0 likes · 14 min read
Tencent Cloud Developer
Aug 14, 2019 · Artificial Intelligence

From Atari to AI: The Evolution of Video Games and Artificial Intelligence

From Steve Jobs’s early work at Atari to modern DeepMind breakthroughs, the article traces how video games have grown into a multibillion‑dollar industry that serves as a testbed for AI research, while highlighting current AI techniques for smarter agents, procedural content generation, and the collaborative challenges shaping the future of game development.

Artificial Intelligence · Game Development · Monte Carlo Tree Search
0 likes · 25 min read
DataFunTalk
Jul 31, 2019 · Artificial Intelligence

Key Characteristics and Practical Improvements of Recommendation Technologies

This article discusses the fundamental traits of recommendation technologies, compares UserCF and ItemCF models, explains matrix factorization and FM, explores negative sampling, CTR/CVR modeling, ensemble methods, and practical considerations such as reinforcement learning and exploration strategies for improving recommendation performance in real-world systems.

matrix factorization · reinforcement learning
0 likes · 11 min read
AntTech
Jul 21, 2019 · Artificial Intelligence

Alipay’s SIGIR 2019 Papers: Reinforcement Learning for User Intent Prediction and Unsupervised QUEST for Complex Question Answering

At SIGIR 2019 in Paris, Alipay presented two AI research papers—one applying reinforcement learning to predict user intent in customer‑service bots and another introducing the unsupervised QUEST method that builds noisy quasi‑knowledge graphs for answering complex multi‑document questions.

AI · Unsupervised Learning · information retrieval
0 likes · 5 min read
iQIYI Technical Product Team
Jul 12, 2019 · Artificial Intelligence

Real-Time Evaluation System for Adaptive Bitrate (ABR) Algorithms and Controlled Bitrate Distribution

RESA is a real‑time evaluation platform that continuously tests multiple Adaptive Bitrate (ABR) algorithms on live user traffic, introduces a multi‑user QoE metric derived from viewing behavior, reveals trade‑offs between clarity and bandwidth, and proposes the RL‑based ABSbc algorithm to steer bitrate distribution and balance user experience with network cost.

ABR · Bandwidth Control · QoE
0 likes · 23 min read
Alibaba Cloud Developer
Jun 27, 2019 · Artificial Intelligence

Generating Personalized E‑commerce Review Replies with Product Information

This paper presents a sequence‑to‑sequence model that fuses product‑detail tables with customer comments, using gated multimodal attention, copy mechanisms and reinforcement learning to automatically produce high‑quality, context‑aware replies for e‑commerce platforms, and validates the approach with extensive experiments on a large Taobao dataset.

Sequence-to-Sequence · copy mechanism · e‑commerce
0 likes · 21 min read
Ctrip Technology
Jun 19, 2019 · Artificial Intelligence

Applying Reinforcement Learning to Hotel Ranking at Ctrip: Challenges, Solutions, and Preliminary Results

This article examines the limitations of traditional learning‑to‑rank for Ctrip hotel sorting, introduces reinforcement learning as a remedy, outlines three progressive implementation plans (A, B, C) with algorithm choices and engineering trade‑offs, and presents early experimental findings that demonstrate RL's potential to improve conversion rates.

Ctrip · RL · hotel
0 likes · 15 min read
AntTech
Jun 10, 2019 · Artificial Intelligence

Generative Adversarial User Model for Reinforcement Learning‑Based Recommendation Systems

This article presents a model‑based reinforcement learning framework for recommendation systems that uses a generative adversarial user model to simultaneously learn user behavior dynamics and reward functions, enabling efficient Cascading‑DQN policy learning and achieving superior long‑term user rewards and click‑through rates in experiments.

Artificial Intelligence · Cascading DQN · Generative Adversarial Networks
0 likes · 9 min read
Alibaba Cloud Developer
Apr 1, 2019 · Fundamentals

Must-Read Technical Books Recommended by Alibaba Experts

Alibaba’s senior engineers share their curated list of essential technical books—from software testing and design patterns to AI, machine learning, reinforcement learning, Rust programming, and database architecture—explaining why each title is valuable for developers seeking deeper knowledge and practical insights.

AI · Design Patterns · Machine Learning
0 likes · 9 min read
DataFunTalk
Mar 8, 2019 · Artificial Intelligence

Alibaba's Intelligent Service Bot (Ali Xiaomì): Platform Overview, Intent Recognition, Machine Reading Comprehension, Multi‑turn Recommendation, and Transfer Learning

The article presents an in‑depth overview of Alibaba's intelligent service bot Ali Xiaomì, covering its platform evolution, core NLP techniques such as intent recognition and machine reading comprehension, multi‑turn recommendation strategies, transfer‑learning approaches across domains and languages, and future technical challenges.

AI · machine reading comprehension · natural language processing
0 likes · 11 min read
Tencent Cloud Developer
Jan 17, 2019 · Artificial Intelligence

Deep Learning for Big Data Recommendation Systems: Tencent's Industrial Practice

Tencent’s industrial practice shows how a large‑scale offline‑nearline‑online “Shield” recommendation architecture, powered by the DeepR framework built on RCaffe, uses deep semantic embeddings, massive neural networks and reinforcement‑learning decisions to handle billions of daily requests, demonstrating that data richness and engineering capability, not model depth alone, drive performance in big‑data recommendation systems.

Big Data · Neural Network · RCaffe
0 likes · 13 min read
Alibaba Cloud Developer
Jan 15, 2019 · Artificial Intelligence

How Alibaba Engineers Boost SEO with Reinforcement Learning and Attention Models

This article details Alibaba.com engineers' application of reinforcement learning, attention mechanisms, and weakly supervised techniques to extract product summaries, improve content quality, and significantly raise SEO rankings, supported by offline experiments, online A/B testing, and future research directions.

Alibaba · Machine Learning · SEO
0 likes · 16 min read
DataFunTalk
Jan 9, 2019 · Artificial Intelligence

Reinforcement Learning in Natural Language Processing: Concepts, Challenges, and Applications

This article introduces reinforcement learning fundamentals, contrasts it with supervised learning, and explores its challenges and advantages in natural language processing, including applications such as text classification, relation extraction from noisy data, and weakly supervised topic segmentation, while summarizing key insights and experimental results.

Weak Supervision · natural language processing · reinforcement learning
0 likes · 11 min read
Alibaba Cloud Developer
Nov 20, 2018 · Artificial Intelligence

How Reinforcement Learning Powers Interactive Search in E‑Commerce

This article explains how reinforcement learning can be modeled and deployed to enable intelligent, interactive product search on e‑commerce platforms, detailing problem definition, system architecture, training methodology, online results, and future research directions.

deep learning · dialogue system · e-commerce
0 likes · 17 min read
iQIYI Technical Product Team
Nov 16, 2018 · Artificial Intelligence

How Reinforcement Learning Transforms Adaptive Bitrate Streaming

This article explains the principles of adaptive bitrate streaming, compares traditional ABR algorithms with a reinforcement‑learning‑based approach, describes its system architecture and training process, and presents QoS evaluation results that show RL‑driven streaming can improve video quality and smoothness.

ABR algorithms · AI · QoS evaluation
0 likes · 8 min read
Alibaba Cloud Developer
Nov 16, 2018 · Artificial Intelligence

How Alibaba’s Search Engine Evolved Over a Decade of Double‑11: From Offline Models to Real‑Time AI

This article traces the ten‑year evolution of Alibaba’s e‑commerce search system, detailing four major stages—from the early Pora streaming engine to dual‑link real‑time architectures, the integration of deep and reinforcement learning, and the shift to large‑scale online deep learning—while highlighting the technical drivers and future AI‑enabled search vision.

Machine Learning · Online Learning · Search
0 likes · 16 min read
Meituan Technology Team
Nov 15, 2018 · Artificial Intelligence

Reinforcement Learning for Meituan's "Guess You Like" Recommendation Ranking

Meituan enhanced its homepage “Guess You Like” recommendation slot by modeling user‑item interactions as a Markov Decision Process and applying an improved DDPG reinforcement‑learning agent that adjusts the ranking trade‑off parameter, uses advantage‑based Q decomposition, shares actor‑critic weights, and runs in a real‑time TensorFlow pipeline, delivering consistent lifts in click‑through, dwell time, and depth.

DDPG · MDP Modeling · Online Learning
0 likes · 21 min read
Tencent Cloud Developer
Oct 18, 2018 · Artificial Intelligence

10 Machine Learning Algorithms You Should Know to Become a Data Scientist

This article outlines the essential role of a data scientist and introduces ten fundamental machine‑learning algorithms—including PCA/SVD, OLS and polynomial regression, regularized linear models, K‑Means, logistic regression, SVM, feed‑forward, convolutional and recurrent neural networks, CRFs, ensemble trees, and reinforcement‑learning methods—while linking to popular Python libraries and tutorials.

Algorithms · Decision Trees · PCA
0 likes · 10 min read
Sohu Tech Products
Oct 10, 2018 · Artificial Intelligence

Optimizing News Recall with DDPG Reinforcement Learning and Transformer Architecture

This article explains how reinforcement learning, specifically the DDPG algorithm combined with Transformer-based networks, is applied to improve large‑scale news recall systems, detailing the business scenario, algorithm selection, model architecture, speed optimizations, training challenges, and observed online performance gains.

AI · DDPG · Transformer
0 likes · 13 min read
DataFunTalk
Sep 27, 2018 · Artificial Intelligence

Applying Machine Learning in Shumei's Business: Supervised, Unsupervised, and Reinforcement Learning Cases

The article presents a comprehensive overview of how Shumei Technology leverages machine learning—including supervised, unsupervised, and reinforcement learning methods—across its credit scoring, fraud detection, advertising, and audio content moderation services, highlighting practical challenges, model fusion techniques, and future research directions.

Model Fusion · reinforcement learning
0 likes · 12 min read
JD Tech
Sep 12, 2018 · Artificial Intelligence

JD Autonomous Delivery Robots: Technologies, Patents, and Future Challenges

The article details JD's third‑generation autonomous delivery robots, covering their multi‑sensor fusion localization, deep‑learning perception, reinforcement‑learning motion control, extensive patent portfolio, and upcoming technical hurdles such as high‑precision mapping and lidar cost, while also inviting public voting for patent awards.

AI navigation · JD Logistics · autonomous robots
0 likes · 8 min read
Alibaba Cloud Developer
Jul 13, 2018 · Artificial Intelligence

How We Won OpenAI’s Retro Contest: Joint PPO Mastery on Sonic Games

This article analyzes OpenAI’s Retro Contest on Sonic the Hedgehog, explains why reinforcement learning generalization is crucial for AGI, and details the winning team’s joint PPO pipeline, engineering optimizations, training strategies, and final performance compared to human baselines.

OpenAI Retro Contest · RL generalization · Sonic game
0 likes · 21 min read
Alibaba Cloud Developer
Apr 23, 2018 · Fundamentals

Top Technical Books Recommended by Alibaba Experts for World Book Day

On World Book Day, nine Alibaba technology veterans share a curated list of essential technical books—covering software testing, design patterns, AI, machine learning, reinforcement learning, Rust, and database architecture—offering concise reasons why each title is valuable for developers and engineers.

Database Architecture · Design Patterns · Machine Learning
0 likes · 10 min read
Tencent Cloud Developer
Mar 15, 2018 · Artificial Intelligence

Learning Long-Horizon Surgical Robot Tasks via Transition State Clustering, SWIRL, and DDCO

The article surveys three recent approaches—Transition State Clustering, Sequential Windowed Inverse Reinforcement Learning, and Deep Discovery of Continuous Options—that automatically segment long‑horizon surgical‑robot demonstrations into sub‑tasks, learn hierarchical policies from limited data, and achieve markedly higher success rates on da Vinci cutting, tension, and needle‑picking tasks.

hierarchical learning · imitation learning · reinforcement learning
0 likes · 18 min read
Alibaba Cloud Developer
Feb 5, 2018 · Artificial Intelligence

How Alibaba’s AliMe Evolved in 2017: AI Architecture, Algorithms, and Real‑World Impact

In 2017 Alibaba's AliMe chatbot platform expanded from a single‑company solution to a multilingual, multi‑channel AI service, introducing platform‑level SaaS/PaaS capabilities, a seven‑layer front‑end architecture, modular back‑end design, advanced intent recognition, knowledge‑graph‑driven product management, reinforcement‑learning‑based recommendation, and machine‑reading comprehension for enterprise and consumer use cases.

AI Platform · Alibaba · Chatbot
0 likes · 23 min read
Hulu Beijing
Dec 6, 2017 · Artificial Intelligence

How Deep Reinforcement Learning Powers Video Game AI: From Q‑Learning to Atari Mastery

This article explains how deep reinforcement learning, built upon traditional Q‑learning and enhanced with techniques like experience replay, enables agents to play Atari video games directly from raw pixel inputs, illustrating the key differences, processing steps, and the significance of this breakthrough in AI.

Atari · Q-Learning · deep Q‑learning
0 likes · 5 min read
Hulu Beijing
Dec 5, 2017 · Artificial Intelligence

What Is Reinforcement Learning? Core Concepts Explained

This article introduces the fundamental concepts of reinforcement learning, describing its origins, key components such as agents, environments, states, actions, and rewards, explaining the Markov decision process framework, and highlighting common algorithms like Q‑learning, policy gradients, and actor‑critic methods.

AI · Algorithms · MDP
0 likes · 4 min read
Ctrip Technology
Oct 19, 2017 · Artificial Intelligence

Intelligent Human‑Computer Interaction: Technical Practices of Alibaba’s “Ali Xiaomi” Chatbot

This article presents a comprehensive overview of Alibaba’s intelligent chatbot “Ali Xiaomi”, covering industry context, e‑commerce deployment, NLU architecture, intent‑matching layers, deep‑learning‑based intent classification, reinforcement‑learning‑driven recommendation, knowledge‑graph‑enhanced services, and hybrid retrieval‑generation dialogue models, with future outlooks for AI‑driven interaction.

deep learning · e-commerce · knowledge graph
0 likes · 18 min read
ITPUB
Sep 14, 2017 · Artificial Intelligence

How Salesforce’s Seq2SQL Turns Natural Language into SQL with Reinforcement Learning

Salesforce’s recent research introduces Seq2SQL, a reinforcement‑learning‑driven sequence‑to‑sequence model that translates natural‑language questions into SQL queries, eliminating the need to learn SQL, and includes the large WikiSQL dataset built from crowdsourced NL‑SQL pairs for training and evaluation.

AI · SQL Generation · Seq2SQL
0 likes · 6 min read
AntTech
Aug 4, 2017 · Artificial Intelligence

Key Insights from Ant Financial VP Dr. Qi Yuan’s Talk on the Development and Application of Financial Intelligence at CCAI 2017

The article summarizes Dr. Qi Yuan’s presentation at CCAI 2017, detailing Ant Financial’s AI‑driven solutions for financial services—including risk control, intelligent assistants, large‑scale machine learning, reinforcement‑learning marketing, a model‑service platform, and a computer‑vision damage‑assessment system—while highlighting technical challenges, platform architecture, and the company’s open‑tech philosophy.

Artificial Intelligence · FinTech · reinforcement learning
0 likes · 16 min read
Alibaba Cloud Developer
Jul 13, 2017 · Artificial Intelligence

How STARK VRP Cuts Chinese Logistics Costs with AI‑Powered Routing

This article explains how Alibaba's Cainiao network built the STARK VRP engine—an AI‑driven, distributed vehicle‑routing solver that supports dozens of VRP variants, leverages metaheuristics, parallel island models, and deep reinforcement learning to dramatically reduce fleet size and travel distance in Chinese logistics.

AI · Logistics Optimization · Metaheuristics
0 likes · 8 min read
21CTO
Jun 29, 2017 · Artificial Intelligence

Why Machine Learning Mirrors Human Learning: From Features to Reinforcement

The article explores how machine learning models emulate human learning by converting diverse real‑world descriptions into numerical features, illustrating concepts such as one‑hot encoding, supervised, unsupervised, and reinforcement learning, and emphasizing the importance of mapping inputs to outputs for intelligent systems.

AI concepts · Machine Learning · features
0 likes · 14 min read
Qunar Tech Salon
Apr 27, 2017 · Artificial Intelligence

LSTM‑Jump: Learning to Skim Text for Faster Sequence Modeling

The paper introduces LSTM‑Jump, a reinforcement‑learning‑trained LSTM variant that can dynamically skip irrelevant tokens, achieving up to six‑fold speed‑ups over standard sequential LSTMs while maintaining or improving accuracy on various NLP tasks such as sentiment analysis, document classification, and question answering.

LSTM · NLP · Sequence Modeling
0 likes · 7 min read
21CTO
Apr 19, 2017 · Artificial Intelligence

How Alibaba Transformed E‑Commerce Search with Real‑Time AI and Reinforcement Learning

Alibaba’s e‑commerce search engine evolved over three years from offline batch models to a sophisticated AI-driven system that integrates real‑time feature ingestion, online learning, deep and reinforcement learning, enabling dynamic personalization and decision‑making that boosts conversion during high‑traffic events like Double 11.

AI · Online Learning · Real‑Time Computing
0 likes · 15 min read
Architect
Mar 10, 2016 · Artificial Intelligence

Monte Carlo Tree Search (MCTS): Principles, Algorithms, Advantages, and Applications

This article explains Monte Carlo Tree Search (MCTS), covering its rise to prominence with AlphaGo, fundamental algorithm steps, node‑selection strategies such as UCB, strengths and weaknesses, enhancements, historical background, and recent research developments in artificial intelligence.

Artificial Intelligence · MCTS · Monte Carlo Tree Search
0 likes · 12 min read
dbaplus Community
Mar 9, 2016 · Artificial Intelligence

How AlphaGo’s Deep Neural Networks Achieve Human‑Level Go Mastery

This article breaks down AlphaGo’s breakthrough architecture—four specialized neural‑network modules, Monte‑Carlo Tree Search, and deep reinforcement learning—to explain how the system moved from imitation learning to self‑improvement and ultimately defeated top human Go players.

AlphaGo · Go AI · Monte Carlo Tree Search
0 likes · 15 min read
Architects Research Society
Oct 4, 2015 · Artificial Intelligence

Bayesian Thinking on Your Feet: Embedding Generative Models in Reinforcement Learning for Sequentially Revealed Data

This NSF‑funded project aims to develop algorithms that incrementally process partially observed data, integrating generative models with reinforcement‑learning policies to decide when to act, applied to simultaneous machine translation and quiz‑bowl style question answering.

Generative Models · bayesian inference · machine translation
0 likes · 4 min read
Baidu Tech Salon
Sep 22, 2014 · Artificial Intelligence

How Baidu’s Bingo AI Cracked the Go Challenge with Novel Algorithms

After decades of being deemed a 'century‑long' AI challenge, Baidu’s Bingo system achieved amateur‑to‑professional level Go play by introducing optimized Monte‑Carlo tree search, a weakened Alpha‑Beta hybrid, and massive supervised learning, demonstrating how breakthroughs in game AI can ripple into broader Baidu products.

Artificial Intelligence · Baidu · Go AI
0 likes · 8 min read