Tag

RNN

Articles collected under this technical tag.

Rare Earth Juejin Tech Community
Nov 12, 2023 · Artificial Intelligence

A Comprehensive Introduction to RNN, LSTM, Attention Mechanisms, and Transformers for Large Language Models

This article provides a thorough overview of large language models, explaining the relationship between NLP and LLMs, the evolution from RNN to LSTM, the fundamentals of attention mechanisms, and the architecture and operation of Transformer models, all illustrated with clear examples and diagrams.

Artificial Intelligence · Attention · LSTM
25 min read
Rare Earth Juejin Tech Community
Oct 19, 2023 · Artificial Intelligence

NLP Basics: Word Embeddings, Word2Vec, and Hand‑crafted RNN Implementation in PyTorch

This article introduces word‑level representations—from one‑hot encoding to dense word embeddings via Word2Vec—explains cosine similarity, then walks through the structure, limitations, and PyTorch implementation of a vanilla RNN, including a custom forward function and verification against the library API.

NLP · PyTorch · RNN
19 min read
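As a taste of the hand-crafted approach this article describes, here is a minimal, dependency-free sketch of cosine similarity and a single step of a vanilla RNN recurrence. The toy values and parameter names are illustrative assumptions, not the article's actual code:

```python
import math

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rnn_step(x, h_prev, W_xh, W_hh, b_h):
    # One recurrence: h_t = tanh(W_xh @ x + W_hh @ h_prev + b_h),
    # written out element-wise to mirror a hand-crafted forward pass.
    hidden = len(h_prev)
    h_new = []
    for i in range(hidden):
        s = b_h[i]
        s += sum(W_xh[i][j] * x[j] for j in range(len(x)))
        s += sum(W_hh[i][j] * h_prev[j] for j in range(hidden))
        h_new.append(math.tanh(s))
    return h_new

# Toy 2-d "embeddings": identical direction gives similarity 1.0,
# orthogonal vectors give 0.0.
print(cosine_similarity([1.0, 0.0], [1.0, 1.0]))
```

A hand-rolled step like this can then be checked against a library RNN cell by loading the same weights into both, which is the kind of verification the article performs.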
Efficient Ops
Sep 12, 2023 · Artificial Intelligence

AI-Powered Text Clustering and RNNs Automate Test Environment Issue Diagnosis

This article describes how a Chinese bank’s software development team leveraged AI techniques—text clustering and recurrent neural networks—to automatically classify and diagnose test-environment problems, dramatically reducing manual effort, improving issue visibility, and enabling self-healing mechanisms for faster, more reliable software delivery.

AI · RNN · issue classification
5 min read
Model Perspective
Aug 1, 2023 · Artificial Intelligence

Mastering LSTM: How Long Short-Term Memory Networks Capture Long-Term Dependencies

This article explains the challenges of processing sequential data, introduces LSTM as a solution to long‑term dependency problems in RNNs, details its cell state and gate mechanisms, showcases its architecture, and provides Python code examples for time‑series forecasting using Keras.

Deep Learning · Keras · LSTM
9 min read
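The cell-state and gate mechanisms this article details can be sketched in a few lines of plain Python. This is a toy scalar LSTM cell under assumed parameter names (`w_*`, `u_*`, `b_*`), not the article's Keras code:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_cell_step(x, h_prev, c_prev, p):
    # One LSTM step for a 1-d toy cell; p holds per-gate weights and biases.
    f = sigmoid(p["w_f"] * x + p["u_f"] * h_prev + p["b_f"])    # forget gate
    i = sigmoid(p["w_i"] * x + p["u_i"] * h_prev + p["b_i"])    # input gate
    g = math.tanh(p["w_g"] * x + p["u_g"] * h_prev + p["b_g"])  # candidate state
    o = sigmoid(p["w_o"] * x + p["u_o"] * h_prev + p["b_o"])    # output gate
    c = f * c_prev + i * g   # new cell state: keep some memory, add some new
    h = o * math.tanh(c)     # new hidden state exposed to the next layer
    return h, c
```

The additive update `c = f * c_prev + i * g` is what lets gradients flow over long spans, which is the core of how LSTMs capture long-term dependencies.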
Rare Earth Juejin Tech Community
Jul 31, 2023 · Artificial Intelligence

Overview of Deep Neural Network Architectures

This article provides a comprehensive overview of deep neural network families, introducing twelve major architectures—including Feedforward, CNN, RNN, LSTM, DBN, GAN, Autoencoder, Residual, Capsule, Transformer, Attention, and Deep Reinforcement Learning—explaining their principles, structures, training methods, and offering Python/TensorFlow/PyTorch code examples.

CNN · Deep Learning · GAN
29 min read
DataFunTalk
Apr 3, 2023 · Artificial Intelligence

Implementing RNN, LSTM, and GRU with PyTorch

This article introduces the basic architectures of recurrent neural networks (RNN), LSTM, and GRU, explains PyTorch APIs such as nn.RNN, nn.LSTM, nn.GRU, details their parameters, demonstrates code examples for building and testing these models, and provides practical insights for deep learning practitioners.

Deep Learning · GRU · LSTM
9 min read
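Alongside the `nn.RNN`, `nn.LSTM`, and `nn.GRU` APIs the article covers, the GRU recurrence itself is compact enough to write out directly. A scalar toy version following the standard gate equations (parameter names are illustrative assumptions):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gru_step(x, h_prev, p):
    # Scalar GRU update following the standard equations:
    # r (reset gate), z (update gate), n (candidate state),
    # then interpolate between the old and candidate states.
    r = sigmoid(p["w_r"] * x + p["u_r"] * h_prev + p["b_r"])
    z = sigmoid(p["w_z"] * x + p["u_z"] * h_prev + p["b_z"])
    n = math.tanh(p["w_n"] * x + r * (p["u_n"] * h_prev) + p["b_n"])
    return (1.0 - z) * n + z * h_prev
```

With only two gates instead of the LSTM's three (and no separate cell state), the GRU has fewer parameters per hidden unit, which is the usual motivation for choosing it.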
Model Perspective
Mar 2, 2023 · Artificial Intelligence

Understanding RNNs and LSTM: Theory and Python Keras Implementation

This article explains the fundamentals of Recurrent Neural Networks and Long Short‑Term Memory units, their gating mechanisms, and demonstrates a practical Python Keras example that predicts future PM2.5 concentrations using an LSTM model.

Deep Learning · Keras · LSTM
7 min read
DataFunSummit
Feb 1, 2023 · Artificial Intelligence

Clustering-Based Global LSTM Models for Large-Scale Time Series Forecasting

The paper proposes clustering thousands of related time series and training separate global LSTM models for each cluster, showing that this reduces heterogeneity, leverages shared information, and improves forecasting accuracy compared to individual models, with extensive experiments on CIF2016 and NN5 datasets.

Big Data · LSTM · RNN
33 min read
Model Perspective
Jan 12, 2023 · Artificial Intelligence

Neural Networks Explained: Architecture, Training, and Reinforcement Basics

This article introduces neural networks, covering their layered structure, common types like CNNs and RNNs, key components such as activation functions, loss, learning rate, backpropagation, dropout, batch normalization, and extends to reinforcement learning concepts including MDPs, policies, value functions, and Q‑learning.

CNN · Deep Learning · RNN
6 min read
Model Perspective
Oct 6, 2022 · Artificial Intelligence

Demystifying RNNs and LSTMs: Architecture, Limits, and Python Forecasting

This article explains the structure and operation of recurrent neural networks (RNNs), their limitations, how long short‑term memory (LSTM) networks overcome these issues with gated mechanisms, and provides a complete Python implementation for time‑series airline passenger forecasting.

LSTM · Python · RNN
17 min read
Model Perspective
Aug 15, 2022 · Artificial Intelligence

Understanding Recurrent Neural Networks: From Vanilla RNN to LSTM with Keras

This article introduces recurrent neural networks (RNNs) and their ability to handle sequential data, explains the limitations of vanilla RNNs, presents the LSTM architecture with its gates, and provides complete Keras code for data loading, model building, and training both vanilla RNN and LSTM models.

Deep Learning · Keras · LSTM
5 min read
DataFunSummit
Nov 21, 2021 · Artificial Intelligence

Sequential Recommendation Algorithms: Overview and Techniques

This article surveys sequential recommendation methods, covering standard models such as pooling, RNN, CNN, attention, and Transformer, as well as long‑short term, multi‑interest, multi‑behavior approaches, and recent advances like contrastive learning, highlighting their impact on recommendation performance.

Attention · RNN · machine learning
8 min read
DataFunSummit
Dec 27, 2020 · Artificial Intelligence

Sequence Labeling in Natural Language Processing: Definitions, Tag Schemes, Model Choices, and Practical Implementation

This article provides a comprehensive overview of sequence labeling tasks in NLP, covering their definition, common tag schemes (BIO, BIEO, BIESO), comparisons with other NLP tasks, major modeling approaches such as HMM, CRF, RNN and BERT, real‑world applications like POS tagging, NER, event extraction and gene analysis, and a step‑by‑step PyTorch implementation with dataset preparation, training pipeline, and evaluation metrics.

BERT · CRF · HMM
27 min read
Sohu Tech Products
Nov 18, 2020 · Artificial Intelligence

Understanding Sequence‑to‑Sequence (seq2seq) Models and Attention Mechanisms

This article explains the fundamentals of seq2seq neural machine translation models, covering encoder‑decoder architecture, word embeddings, context vectors, RNN processing, and the attention mechanism introduced by Bahdanau and Luong, with visual illustrations and reference links for deeper study.

Attention · Deep Learning · RNN
11 min read
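The attention mechanism this article covers reduces, in its simplest (Luong dot-product) form, to scoring encoder states against the decoder state, normalizing with a softmax, and taking a weighted sum as the context vector. A minimal dependency-free sketch with toy vectors (names are illustrative assumptions):

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot_attention(decoder_state, encoder_states):
    # Score each encoder state by dot product with the decoder state,
    # softmax the scores into weights, and return the weighted context vector.
    scores = [sum(d * e for d, e in zip(decoder_state, enc))
              for enc in encoder_states]
    weights = softmax(scores)
    dim = len(decoder_state)
    context = [sum(w * enc[k] for w, enc in zip(weights, encoder_states))
               for k in range(dim)]
    return context, weights
```

Bahdanau-style (additive) attention differs only in the scoring function, using a small feed-forward network instead of a raw dot product.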
Architects' Tech Alliance
Sep 3, 2020 · Artificial Intelligence

Deep Learning Specialization Infographic Overview

This article presents a comprehensive English summary of the deep learning specialization infographics originally shared by Andrew Ng, covering fundamentals, logistic regression, shallow and deep neural networks, regularization, optimization, hyperparameters, convolutional and recurrent networks, and practical advice for model building and evaluation.

CNN · Deep Learning · Optimization
21 min read
DataFunTalk
Jun 22, 2020 · Artificial Intelligence

Ctrip's Automated Iterative Anti‑Fraud Modeling Framework for Payment Risk

The article describes Ctrip's payment fraud risk characteristics, a comprehensive automated iterative anti‑fraud model framework—including variable system, GAN‑augmented sample generation, RNN behavior encoding, and tree‑based classifiers—and demonstrates how this approach restores recall performance compared with traditional static models.

GAN · RNN · anti-fraud
12 min read
DataFunTalk
Jun 13, 2020 · Artificial Intelligence

Deep Learning for Expired POI Detection at Amap: Feature Engineering, RNN, Wide&Deep, and Attention‑TCN

This article details how Amap leverages deep‑learning techniques—including temporal and auxiliary feature engineering, multi‑stage RNN models, Wide&Deep architectures, and an Attention‑TCN approach—to accurately identify and handle expired points of interest, improving map freshness and user experience.

Deep Learning · POI expiration · RNN
13 min read
Amap Tech
May 8, 2020 · Artificial Intelligence

Expired POI Detection in Amap Using Deep Learning: Feature Engineering, RNN, Wide&Deep, and TCN Models

The project develops a deep-learning pipeline for Amap's expired POI detection that integrates two-year temporal trend features, industry and verification attributes, a variable-length LSTM, a Wide&Deep architecture, and a Wide-Attention Temporal Convolutional Network, achieving higher accuracy and efficiency while outlining future macro- and micro-level enhancements.

Deep Learning · POI expiration · RNN
15 min read