Tagged articles

689 articles

Jul 25, 2021 · Artificial Intelligence

Advances in Query Understanding and Semantic Retrieval at Zhihu Search

This article details Zhihu Search's engineering solutions for long‑tail query challenges, covering historical development, term weighting, synonym expansion, query rewriting with reinforcement learning, and semantic recall using BERT‑based models, while also outlining future research directions such as GAN‑based rewriting and lightweight pre‑training.

BERT · Embedding Retrieval · Query Rewriting

0 likes · 14 min read

Java Architect Essentials

Jul 21, 2021 · Artificial Intelligence

DouZero: A Simple Monte‑Carlo Based AI that Achieves State‑of‑the‑Art Performance in Dou Dizhu

DouZero, a reinforcement‑learning AI for the Chinese card game Dou Dizhu, combines a Monte‑Carlo value network with compact action encoding and trains on a four‑GPU server. It outperforms existing AI baselines, ranks first on Botzone, and even surpasses human play in several metrics.

AI · DouZero · Monte Carlo

0 likes · 15 min read

DataFunTalk

Jun 15, 2021 · Artificial Intelligence

Personalized Approximate Pareto-Efficient Recommendation (PAPERec): A Multi‑Objective Reinforcement Learning Framework for User‑Level Objective Personalization

The paper introduces PAPERec, a personalized multi‑objective recommendation framework that leverages Pareto‑oriented reinforcement learning to generate user‑specific objective weights, enabling the model to approximate Pareto‑optimal solutions and achieve superior click‑through rate and dwell‑time performance in both offline and online experiments.

CTR · Pareto efficiency · Recommendation Systems

0 likes · 12 min read

Alimama Tech

Jun 10, 2021 · Artificial Intelligence

Overview of Recent Alimama Research Papers Presented at KDD 2021 on Advertising and AI

At KDD 2021, Alimama presented six papers that introduced a unified constrained‑bidding solution, a deep‑learnable auction mechanism, real‑negative training for delayed‑feedback CVR, a contextual‑bandit advertising strategy recommender, a multi‑agent cooperative bidding game, and an uncertainty‑aware exploration model, all of which have been deployed to boost platform revenue and advertiser performance.

Alibaba · Auction Mechanisms · KDD

0 likes · 16 min read

Laiye Technology Team

Jun 8, 2021 · Artificial Intelligence

Modelling Hierarchical Structure between Dialogue Policy and Natural Language Generator with Option Framework for Task-oriented Dialogue Systems

This paper presents a hierarchical reinforcement learning approach that jointly trains dialogue policy and natural language generation modules for task-oriented dialogue systems, achieving state‑of‑the‑art performance on MultiWOZ 2.0 and 2.1 while preserving response fluency.

MultiWOZ · dialogue policy · hierarchical RL

0 likes · 10 min read

DataFunTalk

Apr 24, 2021 · Artificial Intelligence

Intelligent Advertising Delivery System and Techniques: From Budget‑Constrained Bidding to Multi‑Channel Optimization

This article systematically introduces Alibaba's advertising intelligence platform, covering the evolution from basic CPM/CPC models to advanced OCPC/OCPM, budget‑constrained bidding, multi‑constraint bidding, sequence‑based long‑term value bidding, multi‑channel allocation, and the AI‑driven Smart Bidding product, highlighting algorithmic foundations, practical implementations, and performance gains.

Advertising · Machine Learning · Multi‑Channel

0 likes · 32 min read

DataFunSummit

Mar 25, 2021 · Artificial Intelligence

An Overview of Reinforcement Learning: Concepts, Applications, Challenges, and Future Prospects

Reinforcement learning, a branch of artificial intelligence, is explained through its core concepts, successful case studies such as AlphaGo and AlphaStar, practical application workflows, current challenges, resources, and future outlook, offering a comprehensive guide for researchers and practitioners.

Applications · Artificial Intelligence · Policy Optimization

0 likes · 56 min read

DataFunTalk

Mar 9, 2021 · Artificial Intelligence

Introduction to Common Machine Learning Algorithms with Python Implementations

This article introduces the three main categories of machine learning—supervised, unsupervised, and reinforcement learning—detailing common algorithms such as Linear Regression, Logistic Regression, Naive Bayes, K‑Nearest Neighbors, Decision Trees, Random Forests, SVM, K‑Means, and PCA, and provides concise Python code examples using scikit‑learn for each.

Machine Learning · Python · Unsupervised Learning

0 likes · 18 min read

DataFunTalk

Feb 24, 2021 · Artificial Intelligence

Multi‑Objective Ranking in Kuaishou Short‑Video Recommendation: System Design and Online Results

This article details Kuaishou's multi‑objective ranking pipeline for short‑video recommendation, covering manual score fusion, GBDT ensemble, Learn‑to‑Rank, online auto‑tuning, ensemble sorting, reinforcement‑learning rerank, and on‑device rerank, and reports their impact on DAU, watch time and user interaction.

Kuaishou · Machine Learning · multi-objective ranking

0 likes · 21 min read

Architects' Tech Alliance

Jan 29, 2021 · Artificial Intelligence

Comprehensive Overview of Machine Learning: Types, Industry Chain, and Key Technologies

This article provides a detailed introduction to machine learning, covering its definition, learning modes such as supervised, unsupervised and reinforcement learning, shallow versus deep learning, the full industry chain from AI chips to cloud and big‑data services, and the major open‑source frameworks and platforms driving the field.

AI chips · Big Data · Machine Learning

0 likes · 11 min read

Programmer DD

Jan 3, 2021 · Artificial Intelligence

How Self‑Play and GAIL Powered the WeKick AI to Win the First Google Football Kaggle Championship

After a nostalgic gaming session, the author recounts how Tencent’s upgraded AI, WeKick, leveraged self‑play reinforcement learning, GAIL‑based adversarial simulation, and a multi‑style League framework to dominate the inaugural Google Football Kaggle competition, illustrating the escalating complexity of multi‑agent AI in real‑time strategy games.

GAIL · Kaggle competition · Multi-Agent Systems

0 likes · 8 min read

DataFunTalk

Dec 23, 2020 · Artificial Intelligence

Advances in Knowledge Graph Completion: Methods, Challenges, and Future Directions

This article reviews the rapid progress of knowledge graph completion, covering its background, formal problem definition, major technical approaches—including representation learning, path‑based search, reinforcement learning, logical reasoning, and meta‑learning—while discussing their challenges, recent improvements, and promising future research directions.

Completion · Logical Reasoning · Meta Learning

0 likes · 14 min read

JD Cloud Developers

Dec 21, 2020 · Artificial Intelligence

Weekly Tech Highlights: AI Chip, Cloud Forecasts, Docker M1 Preview & More

This week’s developer newsletter spotlights the Chinese Academy of Sciences’ pioneering GNN accelerator chip, IDC’s ten cloud computing predictions for China, the booming IoT market and 5G dominance, Docker’s M1‑compatible desktop preview, a carbon‑nanotube transistor breakthrough, IBM’s FHE initiative, and recent AI research on lifelong learning and reinforcement learning exploration.

Artificial Intelligence · Docker · IoT

0 likes · 7 min read

DataFunTalk

Nov 12, 2020 · Artificial Intelligence

Reinforcement Learning for Recommendation System Mixing: Concepts, Practice, and Evaluation

This article explains how reinforcement learning, with its focus on maximizing long‑term reward, can improve recommendation system mixing by covering basic RL concepts, differences from supervised learning, multi‑armed bandit approaches, practical OpenAI Gym experiments, new AUC metrics, online gains, and advanced model optimizations.

Artificial Intelligence · OpenAI Gym · Q-Learning

0 likes · 10 min read

Didi Tech

Oct 10, 2020 · Artificial Intelligence

Deep Reinforcement Learning for Route Planning in DiDi Ride‑Hailing

DiDi’s route engine, handling over 40 billion daily requests, replaces static graph algorithms with a deep‑reinforcement‑learning system that first learns intersection decisions via behavior‑cloning LSTM models and then refines them through self‑play Q‑learning, using beam‑search decoding to produce globally optimal, low‑deviation routes for ride‑hailing.

AI · Beam Search · Route Planning

0 likes · 12 min read

DataFunTalk

Oct 4, 2020 · Artificial Intelligence

Reinforcement Learning for Product Ranking: Model Design, Experiments, and Online Deployment

This article presents a comprehensive study of using reinforcement learning to improve e‑commerce product ranking, covering the limitations of traditional scoring, the design of context‑aware models, a pointer‑network based sequence generator, various RL algorithms, extensive offline evaluations, and successful online deployment with future research directions.

PPO · deep learning · e-commerce

0 likes · 28 min read

Sohu Tech Products

Sep 16, 2020 · Artificial Intelligence

Open-Domain Dialogue Systems: Current State, Challenges, and Future Directions

This article reviews the latest advances in open-domain dialogue systems, covering classification, end‑to‑end generation challenges, knowledge‑controlled generation, automated evaluation, large‑scale latent‑space models such as PLATO, and outlines future research directions for building more coherent and controllable conversational AI.

Dialogue Systems · evaluation · knowledge grounding

0 likes · 14 min read

MaGe Linux Operations

Sep 9, 2020 · Artificial Intelligence

Master Machine Learning Basics: Concepts, Types, Algorithms & K‑NN Walkthrough

This comprehensive tutorial introduces machine learning fundamentals, its history, differences from traditional programming, key characteristics, and why Python is the preferred language, then explores supervised, unsupervised, and reinforcement learning, popular algorithms, detailed K‑Nearest Neighbors examples for classification and regression, and the essential steps to build and evaluate models.

Machine Learning · Python · Unsupervised Learning

0 likes · 21 min read

Programmer DD

Aug 31, 2020 · Artificial Intelligence

AI Fighter Falco Beats Human Pilot in Simulated Dogfight – Implications for Military AI

DARPA’s ACE program showcased the AI‑driven fighter Falco, built with the open‑source AdeptRL reinforcement‑learning framework, which defeated an experienced US Air Force instructor in a 1‑v‑1 simulated F‑16 dogfight, highlighting both the promise and current limitations of autonomous combat systems.

AI · DARPA · Simulation

0 likes · 7 min read

DataFunTalk

Aug 15, 2020 · Artificial Intelligence

Dynamic Knapsack Optimization for Multi‑Channel Sequential Advertising Using Long‑Term Value

The article presents a novel multi‑channel sequential advertising framework that models budget‑constrained GMV optimization as a dynamic knapsack problem, introduces a long‑term value‑based RL solution (MSBCB), and validates its superiority through extensive offline and online experiments showing up to 10% ROI improvement.

Advertising · budget optimization · dynamic knapsack

0 likes · 16 min read

Aotu Lab

Jul 22, 2020 · Frontend Development

How Q‑Learning Can Power Smart UI Testing and Scalable Pop‑ups with Puppeteer

This article explains how reinforcement‑learning (Q‑learning) can generate mock interface data for regression testing, how Puppeteer automates UI interactions, and how a DSL‑plus‑runtime approach enables scalable pop‑up components, reducing testing costs in complex e‑commerce interactions.

Frontend Testing · Puppeteer · Q-Learning

0 likes · 8 min read

DataFunTalk

Jul 21, 2020 · Artificial Intelligence

WeChat "Look" Recommendation System: Architecture, Modeling, and Engineering Challenges

This article details the end‑to‑end technical architecture of WeChat's "Look" personalized recommendation service, covering data collection, recall, multi‑stage ranking, various CTR and multi‑objective models, reinforcement‑learning based mixing, diversity optimization, and the engineering hurdles overcome to deploy these solutions at massive scale.

CTR prediction · WeChat AI · deep learning

0 likes · 17 min read

58 Tech

Jul 8, 2020 · Artificial Intelligence

Budget Pacing Techniques and Their Application in 58.com Advertising Platform

This article introduces mainstream budget‑pacing methods for cost‑per‑click online ads, describes the 58.com business scenarios, details the pacing algorithm—including bid modification, probabilistic throttling, and reinforcement‑learning approaches—explains system design with PID control, and presents online experimental results and future directions.

Ad Tech · PID control · budget allocation

0 likes · 14 min read

Taobao Frontend Technology

Jun 30, 2020 · Frontend Development

How Reinforcement Learning Powers Front‑End Testing for Alibaba’s 618 Interactive Game

This article explains how the Taobao front‑end team tackled the complexity of the 618 interactive game by using reinforcement‑learning‑driven intelligent testing, Puppeteer‑based automated regression, and a decoupled UI‑logic architecture for scalable popup production, dramatically improving development efficiency and stability.

Puppeteer · UI logic decoupling · automated testing

0 likes · 10 min read

HomeTech

Jun 10, 2020 · Artificial Intelligence

Exploitation & Exploration Algorithms in Recommender Systems: ε‑Greedy, UCB, and Thompson Sampling Applications

This article introduces recommender systems and the exploitation‑exploration dilemma, explains common E&E algorithms such as ε‑greedy, Upper‑Confidence‑Bound, and Thompson Sampling, and details their practical deployment for interest‑point eviction, selection, and adaptive recall count optimization in an automotive recommendation platform.

Bandit Algorithms · Epsilon-Greedy · Exploitation

0 likes · 10 min read

DataFunTalk

May 15, 2020 · Artificial Intelligence

Optimizing Sparse Feature Embedding for Large‑Scale Recommendation and CTR Prediction

The article reviews recent research on representing massive sparse features in click‑through‑rate (CTR) models, introducing Alibaba's Res‑embedding method and Google's Neural Input Search (NIS) approach, and discusses how these techniques improve embedding efficiency and model generalization in large‑scale recommendation systems.

CTR prediction · Recommendation Systems · deep learning

0 likes · 10 min read

JD Retail Technology

May 13, 2020 · Artificial Intelligence

JD's Two Papers Accepted at IJCAI 2020 and SIGIR 2020: Hierarchical Reinforcement Learning for Multi‑Goal Recommendation and Attention‑Based pCVR Prediction

JD announced that two of its research papers were accepted as full papers at IJCAI 2020 and SIGIR 2020: one proposes a hierarchical reinforcement‑learning framework for multi‑objective recommendation (MaHRL), and the other an attention‑based model for delayed‑feedback conversion‑rate prediction (pCVR).

Artificial Intelligence · Recommendation Systems · conversion rate prediction

0 likes · 6 min read

Alibaba Cloud Developer

May 11, 2020 · Artificial Intelligence

How Reinforcement Learning Revolutionizes E‑commerce Product Ranking

This article details the evolution of AliExpress product ranking from simple DNN scoring to advanced reinforcement‑learning re‑ranking, comparing multiple models, exploring context effects, introducing pointer‑network generators, evaluating various RL algorithms, and reporting significant online gains in conversion and GMV.

e-commerce · online experiments · product ranking

0 likes · 28 min read

Alibaba Cloud Developer

Apr 24, 2020 · Artificial Intelligence

How Reinforcement Learning Can Supercharge New Media Marketing Strategies

This article examines the limitations of traditional new media marketing, explains reinforcement learning fundamentals, and presents a six‑step technical solution—including problem modeling, algorithm selection, action, state, reward design, and model training—that uses RL to optimize budget allocation and achieve over 35% improvement in campaign effectiveness while reducing costs.

AI · budget optimization · digital advertising

0 likes · 20 min read

360 Quality & Efficiency

Apr 17, 2020 · Artificial Intelligence

Extending APEX for Real Distributed Reinforcement Learning with tf2rl

The article examines the limitations of the single‑machine APEX framework in the tf2rl reinforcement‑learning library, proposes a cross‑machine distributed architecture using middleware such as Redis, compares alternative frameworks like EasyRL, and outlines expected performance gains and future development plans.

APEX · Off-Policy · TensorFlow

0 likes · 5 min read

DataFunTalk

Apr 12, 2020 · Artificial Intelligence

Wang Zhe’s Machine Learning Notes – Answers to Frequently Asked Questions on Recommendation Systems

In this article, Wang Zhe addresses fifteen common questions about recommendation systems, covering topics such as building cross‑domain knowledge, the role of deep reinforcement learning, handling sparse or low‑sample data, offline‑online evaluation, knowledge graphs, graph neural networks, model interpretability, large‑scale ID embedding, and career advice for engineers.

Graph Neural Network · Recommendation Systems · deep learning

0 likes · 14 min read

DataFunTalk

Mar 27, 2020 · Artificial Intelligence

Understanding Data Product Layers: Business Value, Data, Algorithms, and Applications

The article explains how data products create business value through application, data, and algorithm layers, using examples like 5G infrared temperature screening and ImageNet, and discusses the roles of experimental design, causal inference, and reinforcement learning in building effective AI‑driven strategies.

Artificial Intelligence · Data Product · business value

0 likes · 8 min read

Alibaba Cloud Developer

Feb 14, 2020 · Artificial Intelligence

How Alibaba’s AI Voice Bots Revolutionized Customer Service During the Pandemic

This article explains how Alibaba leveraged AI‑powered voice robots to handle massive outbound call volumes during COVID‑19, detailing the technology stack, real‑world application scenarios across finance and retail, and the future potential of intelligent voice assistants in customer service.

AI · Customer Service · natural language processing

0 likes · 11 min read

Alibaba Cloud Developer

Feb 7, 2020 · Artificial Intelligence

Tackling Scalability, Data Scarcity, and Training Efficiency in Dialogue Management Models

This article reviews the evolution of dialogue management models from rule‑based systems to deep‑learning approaches, identifies three major challenges—poor scalability, limited annotated data, and low training efficiency—and surveys recent research solutions including semantic matching, knowledge distillation, hierarchical reinforcement learning, model‑based RL, and human‑in‑the‑loop methods.

Conversational AI · data annotation · dialogue management

0 likes · 44 min read

Qunar Tech Salon

Feb 5, 2020 · Operations

Understanding Didi's Ride‑Hailing Dispatch Algorithms: Challenges, Models, and Future Directions

The article explains why Didi needs advanced dispatch algorithms, describes the complexities of order‑driver matching from simple one‑to‑one cases to large‑scale bipartite matching, and introduces batch matching, supply‑demand prediction, chain dispatch, and AI‑driven optimizations that together improve global efficiency and user experience.

AI · Dispatch · Operations Research

0 likes · 16 min read

Top Architect

Jan 16, 2020 · Artificial Intelligence

A Survey of Neural Architecture Search: Search Spaces, Optimization Strategies, and Recent Results

This article surveys neural architecture search, classifying existing methods, describing common search spaces—including global and cell‑based designs—detailing optimization strategies such as reinforcement learning, evolutionary algorithms, surrogate models, one‑shot and differentiable approaches, and highlighting recent results and trends in the field.

Evolutionary Algorithms · Machine Learning · NAS

0 likes · 13 min read

DataFunTalk

Jan 2, 2020 · Artificial Intelligence

Improving Zhihu Search: Query Understanding, Term Weighting, Synonym Expansion, Query Rewriting, and Semantic Retrieval

This article details Zhihu's search engineering advances over the past year, covering long‑tail query challenges, term‑weight calculation, synonym expansion, query rewriting with translation models and reinforcement learning, and semantic retrieval using BERT‑based embeddings, while outlining future research directions.

NLP · Query Rewriting · Search

0 likes · 14 min read

DataFunTalk

Dec 16, 2019 · Artificial Intelligence

A Comprehensive Overview of Sequential Recommendation Models and Techniques

This article provides an in-depth overview of sequential recommendation, defining the problem, discussing data preparation, and reviewing various neural architectures—including MLP, CNN, RNN, Temporal CNN, self‑attention, and reinforcement‑learning approaches—while offering practical guidance on model selection and implementation.

CNN · RNN · Sequential Modeling

0 likes · 36 min read

DataFunTalk

Dec 10, 2019 · Artificial Intelligence

Applying Deep Reinforcement Learning (DQN) to the 2048 Game: Experiments and Insights

This article details a series of reinforcement‑learning experiments on the 2048 game, from random baselines through DQN implementations, classical value‑iteration methods, network redesigns, and Monte‑Carlo tree search, highlighting challenges such as reward design, over‑estimation, and exploration while achieving scores up to 34 000 and tiles of 2048.

2048 · AI · DQN

0 likes · 8 min read

DataFunTalk

Nov 27, 2019 · Artificial Intelligence

Applying Reinforcement Learning and Graph Embedding for Intelligent User Operations in Didi Ride‑Sharing

This article describes how Didi Ride‑Sharing leverages reinforcement learning and graph‑embedding techniques to model and optimize user‑operation marketing, detailing system architecture, algorithm design, experimental ROI improvements, and personalized message delivery for enhanced conversion and cost efficiency.

Didi · ROI · graph embedding

0 likes · 11 min read

AntTech

Oct 30, 2019 · Artificial Intelligence

Financial Graph Machine Learning, AutoML, and Multi‑Agent Reinforcement Learning at Ant Financial

Professor Song Le presented at the Cloudwise Conference how Ant Financial leverages large‑scale graph neural networks, automated machine‑learning platforms, and multi‑agent reinforcement learning to model complex financial networks, improve risk control, and drive diverse fintech applications.

Ant Financial · Large-Scale Graph · graph neural networks

0 likes · 12 min read

DataFunTalk

Oct 25, 2019 · Artificial Intelligence

Advances and Challenges in Human‑Machine Dialogue: Open‑Domain and Task‑Oriented Systems

This article reviews recent progress and open research problems in human‑machine dialogue, covering both open‑domain chat and task‑oriented systems, with focus on reply quality, decoding, retrieval‑augmented generation, controllable and personalized responses, multi‑turn modeling, reinforcement‑learning strategies, low‑resource NLU, and data augmentation techniques.

Dialogue Systems · Response Generation · natural language processing

0 likes · 16 min read

Tencent Cloud Developer

Oct 11, 2019 · Cloud Computing

Large-Scale Distributed Reinforcement Learning Solution Based on TKE

The project replaces cumbersome manual management of thousands of heterogeneous CPU and GPU nodes for large‑scale reinforcement learning with a TKE‑based, containerized actor‑learner architecture that automates batch start/stop, provides elastic autoscaling, fault‑tolerant processes, shared model storage, and CI‑driven image deployment, cutting costs by up to two‑thirds while dramatically speeding experiment cycles.

CI/CD · Cloud Native · Kubernetes

0 likes · 14 min read

DataFunTalk

Sep 30, 2019 · Artificial Intelligence

Reinforcement Learning for Recommender Systems: Challenges, Solutions, and Key Papers

This article reviews recent advances in applying reinforcement learning to recommendation systems, explains the fundamental RL concepts, discusses the specific challenges such as large action spaces, bias, and long‑term reward modeling, and summarizes two influential YouTube papers along with practical insights and future directions.

Off-Policy · Top‑K · long-term reward

0 likes · 13 min read

DataFunTalk

Sep 19, 2019 · Artificial Intelligence

Alibaba Cloud Xiaomai Dialogue System: Architecture, NLU, Dialogue Management, and User Simulator

This article presents Alibaba's Xiaomai intelligent dialogue platform, detailing its general system architecture, three-tier NLU approaches for zero‑, few‑, and many‑shot scenarios, platform‑centric dialogue management with TaskFlow, robustness and continuous learning mechanisms, and a user simulator for large‑scale data generation and dialogue diagnosis.

dialogue system · meta-learning · natural language understanding

0 likes · 13 min read

DataFunTalk

Sep 18, 2019 · Operations

Understanding Didi's Ride‑Hailing Dispatch Algorithm: Challenges, Models, and Strategies

This article explains why modern ride‑hailing platforms need advanced dispatch algorithms, describes the underlying order‑allocation problem, explores simple and complex matching scenarios, and introduces batch matching, supply‑demand prediction, chain dispatch, and AI‑driven techniques used by Didi to improve efficiency and fairness.

Dispatch · Ride Hailing · dynamic VRP

0 likes · 15 min read

Didi Tech

Sep 13, 2019 · Artificial Intelligence

Understanding Didi's Ride‑Hailing Dispatch Algorithms: Challenges and Strategies

Didi’s ride‑hailing dispatch system has progressed from a simple greedy, first‑come‑first‑served matcher to sophisticated batch, chain, and predictive algorithms that use deep‑learning demand forecasts and reinforcement‑learning optimization to assign drivers under complex business rules, boosting response rates and serving over 30 million daily requests.

AI · Optimization · Ride Hailing

0 likes · 17 min read

Alibaba Cloud Developer

Aug 28, 2019 · Artificial Intelligence

Exact‑K Recommendation: Graph Attention Networks and RL from Demonstrations Explained

This article introduces the Exact‑K recommendation problem, highlights its differences from traditional Top‑K approaches, and presents a novel solution combining Graph Attention Networks (GAttN) with Reinforcement Learning from Demonstrations (RLfD), backed by extensive experiments showing superior performance on real-world datasets.

Machine Learning · exact-k · graph attention networks

0 likes · 14 min read

Tencent Cloud Developer

Aug 14, 2019 · Artificial Intelligence

From Atari to AI: The Evolution of Video Games and Artificial Intelligence

From Steve Jobs’s early work at Atari to modern DeepMind breakthroughs, the article traces how video games have grown into a multibillion‑dollar industry that serves as a testbed for AI research, while highlighting current AI techniques for smarter agents, procedural content generation, and the collaborative challenges shaping the future of game development.

Artificial Intelligence · Game Development · Monte Carlo Tree Search

0 likes · 25 min read

DataFunTalk

Jul 31, 2019 · Artificial Intelligence

Key Characteristics and Practical Improvements of Recommendation Technologies

This article discusses the fundamental traits of recommendation technologies, compares UserCF and ItemCF models, explains matrix factorization and FM, explores negative sampling, CTR/CVR modeling, ensemble methods, and practical considerations such as reinforcement learning and exploration strategies for improving recommendation performance in real-world systems.

matrix factorization · reinforcement learning

0 likes · 11 min read

AntTech

Jul 21, 2019 · Artificial Intelligence

Alipay’s SIGIR 2019 Papers: Reinforcement Learning for User Intent Prediction and Unsupervised QUEST for Complex Question Answering

At SIGIR 2019 in Paris, Alipay presented two AI research papers—one applying reinforcement learning to predict user intent in customer‑service bots and another introducing the unsupervised QUEST method that builds noisy quasi‑knowledge graphs for answering complex multi‑document questions.

AI · Unsupervised Learning · information retrieval

0 likes · 5 min read

iQIYI Technical Product Team

Jul 12, 2019 · Artificial Intelligence

Real-Time Evaluation System for Adaptive Bitrate (ABR) Algorithms and Controlled Bitrate Distribution

RESA is a real‑time evaluation platform that continuously tests multiple Adaptive Bitrate (ABR) algorithms on live user traffic, introduces a multi‑user QoE metric derived from viewing behavior, reveals trade‑offs between clarity and bandwidth, and proposes the RL‑based ABSbc algorithm to steer bitrate distribution and balance user experience with network cost.

ABR · Bandwidth Control · QoE

0 likes · 23 min read

Alibaba Cloud Developer

Jun 27, 2019 · Artificial Intelligence

Generating Personalized E‑commerce Review Replies with Product Information

This paper presents a sequence‑to‑sequence model that fuses product‑detail tables with customer comments, using gated multimodal attention, copy mechanisms and reinforcement learning to automatically produce high‑quality, context‑aware replies for e‑commerce platforms, and validates the approach with extensive experiments on a large Taobao dataset.

Sequence-to-Sequence · copy mechanism · e‑commerce

0 likes · 21 min read

Ctrip Technology

Jun 19, 2019 · Artificial Intelligence

Applying Reinforcement Learning to Hotel Ranking at Ctrip: Challenges, Solutions, and Preliminary Results

This article examines the limitations of traditional learning‑to‑rank for Ctrip hotel sorting, introduces reinforcement learning as a remedy, outlines three progressive implementation plans (A, B, C) with algorithm choices and engineering trade‑offs, and presents early experimental findings that demonstrate RL's potential to improve conversion rates.

Ctrip · RL · hotel

0 likes · 15 min read

AntTech

Jun 10, 2019 · Artificial Intelligence

Generative Adversarial User Model for Reinforcement Learning‑Based Recommendation Systems

This article presents a model‑based reinforcement learning framework for recommendation systems that uses a generative adversarial user model to simultaneously learn user behavior dynamics and reward functions, enabling efficient Cascading‑DQN policy learning and achieving superior long‑term user rewards and click‑through rates in experiments.

Artificial Intelligence · Cascading DQN · Generative Adversarial Networks

0 likes · 9 min read

Alibaba Cloud Developer

Apr 1, 2019 · Fundamentals

Must-Read Technical Books Recommended by Alibaba Experts

Alibaba’s senior engineers share their curated list of essential technical books—from software testing and design patterns to AI, machine learning, reinforcement learning, Rust programming, and database architecture—explaining why each title is valuable for developers seeking deeper knowledge and practical insights.

AI · Design Patterns · Machine Learning

0 likes · 9 min read

DataFunTalk

Mar 8, 2019 · Artificial Intelligence

Alibaba's Intelligent Service Bot (Ali Xiaomì): Platform Overview, Intent Recognition, Machine Reading Comprehension, Multi‑turn Recommendation, and Transfer Learning

The article presents an in‑depth overview of Alibaba's intelligent service bot Ali Xiaomì, covering its platform evolution, core NLP techniques such as intent recognition and machine reading comprehension, multi‑turn recommendation strategies, transfer‑learning approaches across domains and languages, and future technical challenges.

AI · machine reading comprehension · natural language processing

0 likes · 11 min read

Tencent Cloud Developer

Jan 17, 2019 · Artificial Intelligence

Deep Learning for Big Data Recommendation Systems: Tencent's Industrial Practice

Tencent’s industrial practice shows how a large‑scale offline‑nearline‑online “Shield” recommendation architecture, powered by the DeepR framework built on RCaffe, uses deep semantic embeddings, massive neural networks and reinforcement‑learning decisions to handle billions of daily requests, demonstrating that data richness and engineering capability, not model depth alone, drive performance in big‑data recommendation systems.

Big Data · Neural Network · RCaffe

0 likes · 13 min read

Alibaba Cloud Developer

Jan 15, 2019 · Artificial Intelligence

How Alibaba Engineers Boost SEO with Reinforcement Learning and Attention Models

This article details Alibaba.com engineers' application of reinforcement learning, attention mechanisms, and weakly supervised techniques to extract product summaries, improve content quality, and significantly raise SEO rankings, supported by offline experiments, online A/B testing, and future research directions.

Alibaba · Machine Learning · SEO

0 likes · 16 min read

DataFunTalk

Jan 9, 2019 · Artificial Intelligence

Reinforcement Learning in Natural Language Processing: Concepts, Challenges, and Applications

This article introduces reinforcement learning fundamentals, contrasts it with supervised learning, and explores its challenges and advantages in natural language processing, including applications such as text classification, relation extraction from noisy data, and weakly supervised topic segmentation, while summarizing key insights and experimental results.

Weak Supervision · natural language processing · reinforcement learning

0 likes · 11 min read

Alibaba Cloud Developer

Nov 20, 2018 · Artificial Intelligence

How Reinforcement Learning Powers Interactive Search in E‑Commerce

This article explains how reinforcement learning can be modeled and deployed to enable intelligent, interactive product search on e‑commerce platforms, detailing problem definition, system architecture, training methodology, online results, and future research directions.

deep learning · dialogue system · e-commerce

0 likes · 17 min read

iQIYI Technical Product Team

Nov 16, 2018 · Artificial Intelligence

How Reinforcement Learning Transforms Adaptive Bitrate Streaming

This article explains the principles of adaptive bitrate streaming, compares traditional ABR algorithms with a reinforcement‑learning‑based approach, describes its system architecture and training process, and presents QoS evaluation results that show RL‑driven streaming can improve video quality and smoothness.

ABR algorithms · AI · QoS evaluation

0 likes · 8 min read

Alibaba Cloud Developer

Nov 16, 2018 · Artificial Intelligence

How Alibaba’s Search Engine Evolved Over a Decade of Double‑11: From Offline Models to Real‑Time AI

This article traces the ten‑year evolution of Alibaba’s e‑commerce search system, detailing four major stages—from the early Pora streaming engine to dual‑link real‑time architectures, the integration of deep and reinforcement learning, and the shift to large‑scale online deep learning—while highlighting the technical drivers and future AI‑enabled search vision.

Machine Learning · Online Learning · Search

0 likes · 16 min read

Meituan Technology Team

Nov 15, 2018 · Artificial Intelligence

Reinforcement Learning for Meituan's "Guess You Like" Recommendation Ranking

Meituan enhanced its homepage “Guess You Like” recommendation slot by modeling user‑item interactions as a Markov Decision Process and applying an improved DDPG reinforcement‑learning agent that adjusts the ranking trade‑off parameter, uses advantage‑based Q decomposition, shares actor‑critic weights, and runs in a real‑time TensorFlow pipeline, delivering consistent lifts in click‑through, dwell time, and depth.

DDPG · MDP Modeling · Online Learning

0 likes · 21 min read

Tencent Cloud Developer

Oct 18, 2018 · Artificial Intelligence

10 Machine Learning Algorithms You Should Know to Become a Data Scientist

This article outlines the essential role of a data scientist and introduces ten fundamental machine‑learning algorithms—including PCA/SVD, OLS and polynomial regression, regularized linear models, K‑Means, logistic regression, SVM, feed‑forward, convolutional and recurrent neural networks, CRFs, ensemble trees, and reinforcement‑learning methods—while linking to popular Python libraries and tutorials.

Algorithms · Decision Trees · PCA

0 likes · 10 min read

Sohu Tech Products

Oct 10, 2018 · Artificial Intelligence

Optimizing News Recall with DDPG Reinforcement Learning and Transformer Architecture

This article explains how reinforcement learning, specifically the DDPG algorithm combined with Transformer-based networks, is applied to improve large‑scale news recall systems, detailing the business scenario, algorithm selection, model architecture, speed optimizations, training challenges, and observed online performance gains.

AI · DDPG · Transformer

0 likes · 13 min read

DataFunTalk

Sep 27, 2018 · Artificial Intelligence

Applying Machine Learning in Shumei's Business: Supervised, Unsupervised, and Reinforcement Learning Cases

The article presents a comprehensive overview of how Shumei Technology leverages machine learning—including supervised, unsupervised, and reinforcement learning methods—across its credit scoring, fraud detection, advertising, and audio content moderation services, highlighting practical challenges, model fusion techniques, and future research directions.

Model Fusion · reinforcement learning

0 likes · 12 min read

JD Tech

Sep 12, 2018 · Artificial Intelligence

JD Autonomous Delivery Robots: Technologies, Patents, and Future Challenges

The article details JD's third‑generation autonomous delivery robots, covering their multi‑sensor fusion localization, deep‑learning perception, reinforcement‑learning motion control, extensive patent portfolio, and upcoming technical hurdles such as high‑precision mapping and lidar cost, while also inviting public voting for patent awards.

AI navigation · JD Logistics · autonomous robots

0 likes · 8 min read

Sohu Tech Products

Sep 5, 2018 · Artificial Intelligence

Reinforcement Learning Theory Overview and Its Application to News Recommendation

This article reviews reinforcement learning fundamentals, contrasts it with supervised learning, surveys major RL algorithms such as DDPG and DQN, and details how these methods can be modeled for sequential news recommendation, including system architecture, state‑action definitions, and practical challenges.

AI · DDPG · DQN

0 likes · 15 min read

Alibaba Cloud Developer

Jul 13, 2018 · Artificial Intelligence

How We Won OpenAI’s Retro Contest: Joint PPO Mastery on Sonic Games

This article analyzes OpenAI’s Retro Contest on Sonic the Hedgehog, explains why reinforcement learning generalization is crucial for AGI, and details the winning team’s joint PPO pipeline, engineering optimizations, training strategies, and final performance compared to human baselines.

OpenAI Retro Contest · RL generalization · Sonic game

0 likes · 21 min read

WeChat Backend Team

May 11, 2018 · Artificial Intelligence

How PhoenixGo Turned AlphaGo Zero into a Champion AI Using Cloud Resources

PhoenixGo, an open‑source Go AI built on AlphaGo Zero's reinforcement‑learning algorithm, leveraged Tencent's idle cloud servers to achieve professional‑level play, won the 2018 World AI Go Championship, and was released with a strong model for researchers and hobbyists alike.

AI · Cloud Computing · Go

0 likes · 4 min read

Alibaba Cloud Developer

Apr 23, 2018 · Fundamentals

Top Technical Books Recommended by Alibaba Experts for World Book Day

On World Book Day, nine Alibaba technology veterans share a curated list of essential technical books—covering software testing, design patterns, AI, machine learning, reinforcement learning, Rust, and database architecture—offering concise reasons why each title is valuable for developers and engineers.

Database Architecture · Design Patterns · Machine Learning

0 likes · 10 min read

Tencent Cloud Developer

Mar 15, 2018 · Artificial Intelligence

Learning Long-Horizon Surgical Robot Tasks via Transition State Clustering, SWIRL, and DDCO

The article surveys three recent approaches—Transition State Clustering, Sequential Windowed Inverse Reinforcement Learning, and Deep Discovery of Continuous Options—that automatically segment long‑horizon surgical‑robot demonstrations into sub‑tasks, learn hierarchical policies from limited data, and achieve markedly higher success rates on da Vinci cutting, tension, and needle‑picking tasks.

hierarchical learning · imitation learning · reinforcement learning

0 likes · 18 min read

Alibaba Cloud Developer

Feb 5, 2018 · Artificial Intelligence

How Alibaba’s AliMe Evolved in 2017: AI Architecture, Algorithms, and Real‑World Impact

In 2017 Alibaba's AliMe chatbot platform expanded from a single‑company solution to a multilingual, multi‑channel AI service, introducing platform‑level SaaS/PaaS capabilities, a seven‑layer front‑end architecture, modular back‑end design, advanced intent recognition, knowledge‑graph‑driven product management, reinforcement‑learning‑based recommendation, and machine‑reading comprehension for enterprise and consumer use cases.

AI Platform · Alibaba · Chatbot

0 likes · 23 min read

High Availability Architecture

Dec 11, 2017 · Artificial Intelligence

A Brief History of Computer Chess and Its Role in Artificial Intelligence

This article traces the evolution of computer chess from the 18th‑century automaton "The Turk" through early programs by Turing, Shannon, and McCarthy, to landmark systems like Deep Blue, AlphaGo, and AlphaZero, highlighting key algorithms, milestones, and their impact on AI research.

AI history · AlphaZero · Deep Blue

0 likes · 19 min read

Hulu Beijing

Dec 6, 2017 · Artificial Intelligence

How Deep Reinforcement Learning Powers Video Game AI: From Q‑Learning to Atari Mastery

This article explains how deep reinforcement learning, built upon traditional Q‑learning and enhanced with techniques like experience replay, enables agents to play Atari video games directly from raw pixel inputs, illustrating the key differences, processing steps, and the significance of this breakthrough in AI.

Atari · Q-Learning · deep Q‑learning

0 likes · 5 min read

Hulu Beijing

Dec 5, 2017 · Artificial Intelligence

What Is Reinforcement Learning? Core Concepts Explained

This article introduces the fundamental concepts of reinforcement learning, describing its origins, key components such as agents, environments, states, actions, and rewards, explaining the Markov decision process framework, and highlighting common algorithms like Q‑learning, policy gradients, and actor‑critic methods.

AI · Algorithms · MDP

0 likes · 4 min read

Ctrip Technology

Oct 19, 2017 · Artificial Intelligence

Intelligent Human‑Computer Interaction: Technical Practices of Alibaba’s “Ali Xiaomi” Chatbot

This article presents a comprehensive overview of Alibaba’s intelligent chatbot “Ali Xiaomi”, covering industry context, e‑commerce deployment, NLU architecture, intent‑matching layers, deep‑learning‑based intent classification, reinforcement‑learning‑driven recommendation, knowledge‑graph‑enhanced services, and hybrid retrieval‑generation dialogue models, with future outlooks for AI‑driven interaction.

deep learning · e-commerce · knowledge graph

0 likes · 18 min read

ITPUB

Sep 14, 2017 · Artificial Intelligence

How Salesforce’s Seq2SQL Turns Natural Language into SQL with Reinforcement Learning

Salesforce’s recent research introduces Seq2SQL, a reinforcement‑learning‑driven sequence‑to‑sequence model that translates natural‑language questions into SQL queries, eliminating the need to learn SQL, and includes the large WikiSQL dataset built from crowdsourced NL‑SQL pairs for training and evaluation.

AI · SQL Generation · Seq2SQL

0 likes · 6 min read

AntTech

Aug 4, 2017 · Artificial Intelligence

Key Insights from Ant Financial VP Dr. Qi Yuan’s Talk on the Development and Application of Financial Intelligence at CCAI 2017

The article summarizes Dr. Qi Yuan’s presentation at CCAI 2017, detailing Ant Financial’s AI‑driven solutions for financial services—including risk control, intelligent assistants, large‑scale machine learning, reinforcement‑learning marketing, a model‑service platform, and a computer‑vision damage‑assessment system—while highlighting technical challenges, platform architecture, and the company’s open‑tech philosophy.

Artificial Intelligence · FinTech · reinforcement learning

0 likes · 16 min read

Alibaba Cloud Developer

Jul 13, 2017 · Artificial Intelligence

How STARK VRP Cuts Chinese Logistics Costs with AI‑Powered Routing

This article explains how Alibaba's Cainiao network built the STARK VRP engine—an AI‑driven, distributed vehicle‑routing solver that supports dozens of VRP variants, leverages metaheuristics, parallel island models, and deep reinforcement learning to dramatically reduce fleet size and travel distance in Chinese logistics.

AI · Logistics Optimization · Metaheuristics

0 likes · 8 min read

21CTO

Jun 29, 2017 · Artificial Intelligence

Why Machine Learning Mirrors Human Learning: From Features to Reinforcement

The article explores how machine learning models emulate human learning by converting diverse real‑world descriptions into numerical features, illustrating concepts such as one‑hot encoding, supervised, unsupervised, and reinforcement learning, and emphasizing the importance of mapping inputs to outputs for intelligent systems.

AI concepts · Machine Learning · features

0 likes · 14 min read

Qunar Tech Salon

Apr 27, 2017 · Artificial Intelligence

LSTM‑Jump: Learning to Skim Text for Faster Sequence Modeling

The paper introduces LSTM‑Jump, a reinforcement‑learning‑trained LSTM variant that can dynamically skip irrelevant tokens, achieving up to six‑fold speed‑ups over standard sequential LSTMs while maintaining or improving accuracy on various NLP tasks such as sentiment analysis, document classification, and question answering.

LSTM · NLP · Sequence Modeling

0 likes · 7 min read

21CTO

Apr 19, 2017 · Artificial Intelligence

How Alibaba Transformed E‑Commerce Search with Real‑Time AI and Reinforcement Learning

Alibaba’s e‑commerce search engine evolved over three years from offline batch models to a sophisticated AI-driven system that integrates real‑time feature ingestion, online learning, deep and reinforcement learning, enabling dynamic personalization and decision‑making that boosts conversion during high‑traffic events like Double 11.

AI · Online Learning · Real‑Time Computing

0 likes · 15 min read

Alibaba Cloud Developer

Mar 22, 2017 · Artificial Intelligence

Unlocking StarCraft AI Research with Gym StarCraft: A Python-Friendly RL Platform

StarCraft, a classic real‑time strategy game, has become a premier testbed for deep reinforcement learning and AI research, and Alibaba’s open‑source Gym StarCraft platform now bridges Python, TensorFlow, Keras and OpenAI Gym to simplify multi‑agent, macro‑strategy development and fair benchmarking.

Alibaba · OpenAI Gym · Python

0 likes · 3 min read

Architect

Mar 10, 2016 · Artificial Intelligence

Monte Carlo Tree Search (MCTS): Principles, Algorithms, Advantages, and Applications

This article explains Monte Carlo Tree Search (MCTS), covering its role in AlphaGo, fundamental algorithm steps, node‑selection strategies such as UCB, strengths and weaknesses, enhancements, historical background, and recent research developments in artificial intelligence.

Artificial Intelligence · MCTS · Monte Carlo Tree Search

0 likes · 12 min read

dbaplus Community

Mar 9, 2016 · Artificial Intelligence

How AlphaGo’s Deep Neural Networks Achieve Human‑Level Go Mastery

This article breaks down AlphaGo’s breakthrough architecture—four specialized neural‑network modules, Monte‑Carlo Tree Search, and deep reinforcement learning—to explain how the system moved from imitation learning to self‑improvement and ultimately defeated top human Go players.

AlphaGo · Go AI · Monte Carlo Tree Search

0 likes · 15 min read

Architects Research Society

Oct 4, 2015 · Artificial Intelligence

Bayesian Thinking on Your Feet: Embedding Generative Models in Reinforcement Learning for Sequentially Revealed Data

This NSF‑funded project aims to develop algorithms that incrementally process partially observed data, integrating generative models with reinforcement‑learning policies to decide when to act, applied to simultaneous machine translation and quiz‑bowl style question answering.

Generative Models · bayesian inference · machine translation

0 likes · 4 min read

Baidu Tech Salon

Sep 22, 2014 · Artificial Intelligence

How Baidu’s Bingo AI Cracked the Go Challenge with Novel Algorithms

After decades of being deemed a 'century‑long' AI challenge, Baidu’s Bingo system achieved amateur‑to‑professional level Go play by introducing optimized Monte‑Carlo tree search, a weakened Alpha‑Beta hybrid, and massive supervised learning, demonstrating how breakthroughs in game AI can ripple into broader Baidu products.

Artificial Intelligence · Baidu · Go AI

0 likes · 8 min read
