Tagged articles

48 articles

May 18, 2026 · Artificial Intelligence

ByteDance Teams with He Kaiming to Open‑Source the Continuous Diffusion Language Model Cola DLM

The article analyzes ByteDance's Cola DLM, a fully open‑source continuous diffusion language model that abandons token‑centric generation in favor of latent semantic representations, detailing its architecture, training strategy, scaling stability, and how it compares with the earlier ELF model.

ByteDance · Cola DLM · continuous diffusion

0 likes · 14 min read

Data Party THU

Apr 29, 2026 · Artificial Intelligence

Claude Opus 4.7 System Prompt Leak: Decoding Its 10 Core Design Decisions

The article dissects the leaked Claude Opus 4.7 system prompt, revealing ten intertwined design decisions—from treating psychological reconstruction as a danger signal to dynamic safety‑policy upgrades—that together shape the model’s self‑restraint, tool‑use, memory handling, and risk‑aware behavior.

AI safety · Claude · Prompt Engineering

0 likes · 8 min read

Smart Workplace Lab

Apr 23, 2026 · Artificial Intelligence

Think Standard Scripts Solve It? Uncover the Real Issue with High‑EQ AI Prompt Tuning

The article explains why using formal, standard language makes AI‑generated workplace messages sound robotic and presents a three‑step protocol—high‑quality phrase extraction, persona‑mapping prompts, and forbidden‑word rules—to feed the model with emotionally intelligent corpora for more natural communication.

AI prompt engineering · Workplace AI · communication tone

0 likes · 5 min read

Smart Workplace Lab

Apr 16, 2026 · Industry Insights

Boost AI Communication Trust: Empathy Prompt Templates & Risk Checklist

This guide explains why AI‑generated messages often feel robotic, presents a set of prompt templates that inject emotion, relationship, and cultural context into LLM outputs, and offers a risk‑assessment checklist to ensure safe, high‑impact workplace communication.

AI · Prompt Engineering · Risk Assessment

0 likes · 6 min read

Machine Learning Algorithms & Natural Language Processing

Mar 7, 2026 · Artificial Intelligence

Transformer Hidden States Can Reconstruct Input with 100% Accuracy – New Invertibility Study

A recent paper from Sapienza University's GLADIA Lab shows that mainstream Transformer language models are injective, enabling a novel SIPIT algorithm to recover original text from hidden states with perfect accuracy, while extensive experiments confirm the models retain all input information.

Injective · Invertibility · SIPIT

0 likes · 11 min read

Java Tech Enthusiast

Nov 30, 2025 · Artificial Intelligence

How a 5‑Million‑Parameter ChatGPT Clone Runs Inside Minecraft’s Redstone

A Minecraft developer built CraftGPT, a 5‑million‑parameter language model that runs entirely on Redstone circuits, demonstrating how the game’s Turing‑complete logic system can implement a transformer‑style AI with billions of in‑game blocks.

AI · Game Computing · Minecraft

0 likes · 9 min read

Efficient Ops

Aug 27, 2025 · Artificial Intelligence

Why DeepSeek V3.1 Randomly Inserts the Chinese Character “极” – Token Bug Explained

DeepSeek’s V3.1 model unexpectedly inserts the Chinese character “极” into generated text, breaking code compilation, JSON parsing, and academic writing; users trace the issue to adjacent token IDs and offer two main hypotheses: dataset contamination or a learned model shortcut.

AI safety · DeepSeek · language model

0 likes · 4 min read

Data Party THU

Aug 18, 2025 · Artificial Intelligence

Why Google’s Gemma 3 270M Model Is a Game‑Changer for Edge AI

Google’s newly released Gemma 3 270M is a compact 270‑million‑parameter language model that combines a large token vocabulary, energy‑efficient INT4 quantization, strong instruction‑following, and production‑ready checkpoints, making it ideal for fine‑tuning, on‑device deployment, and a wide range of low‑latency AI tasks.

Edge AI · Gemma 3 · Google AI

0 likes · 7 min read

Wu Shixiong's Large Model Academy

Jul 3, 2025 · Artificial Intelligence

Causal LM vs Prefix LM: Core Differences, Attention Masks, and Choosing the Right Model

This article explains the fundamental distinctions between Causal Language Models and Prefix Language Models, detailing their definitions, attention‑mask designs, underlying design philosophies, and practical scenarios where each architecture excels.

AI · Attention Mask · Causal LM

0 likes · 7 min read

JD Tech Talk

Mar 5, 2025 · Artificial Intelligence

GLM: General Language Model Pretraining with Autoregressive Blank Infilling

GLM introduces a unified pretraining framework that combines autoregressive blank‑filling with 2D positional encoding and span‑shuffle, achieving superior performance over BERT, T5 and GPT on a range of NLU and generation tasks such as SuperGLUE, text‑filling, and language modeling.

2D positional encoding · GLM · NLU

0 likes · 27 min read

JD Cloud Developers

Mar 5, 2025 · Artificial Intelligence

How GLM’s Autoregressive Blank‑Filling Beats BERT, T5, and GPT

GLM introduces a universal language model that combines autoregressive blank‑filling with 2D positional encoding and span‑shuffle training, achieving superior performance over BERT, T5, and GPT across NLU, conditional and unconditional generation tasks, as demonstrated on SuperGLUE and other benchmarks.

NLU · Transformer · blank filling

0 likes · 29 min read

AI Algorithm Path

Feb 20, 2025 · Artificial Intelligence

What Is Perplexity in Large Language Models?

The article explains perplexity as a metric for evaluating large language models, walks through a step‑by‑step probability calculation for a sample sentence, shows how to normalize by sentence length using the geometric mean, and demonstrates that lower perplexity indicates a more accurate and less uncertain model.

AI · Perplexity · evaluation

0 likes · 6 min read

Code Mala Tang

Jan 31, 2025 · Artificial Intelligence

Master DeepSeek: 7 Prompt Engineering Tricks to Boost AI Responses

This guide presents seven practical prompt‑engineering techniques—clear goals, structured queries, domain terminology, concrete examples, scoped questions, step‑by‑step breakdowns, and multi‑turn interactions—to help users get more accurate and useful answers from DeepSeek.

AI prompts · DeepSeek · Prompt Engineering

0 likes · 6 min read

360 Tech Engineering

Dec 17, 2024 · Artificial Intelligence

Innovative Multimodal Architectures: IAA for Extending Language Models and BDM for Chinese-Native AI Painting

The article introduces two 360 AI Research Institute projects—IAA, an architecture that equips frozen language models with multimodal capabilities via plug‑in layers, and BDM, a Chinese‑native diffusion model compatible with the Stable Diffusion ecosystem—detailing their motivations, designs, benchmark results, and open‑source resources.

Chinese AI painting · Stable Diffusion · language model

0 likes · 6 min read

Infra Learning Club

Oct 30, 2024 · Artificial Intelligence

How GPT-3 Evolved: From Transformer Roots to Massive Language Models

The article traces the development of the GPT series, from the 2017 Transformer breakthrough, through GPT‑1, GPT‑2, and GPT‑3’s 175 billion parameters, to later models like Codex and ChatGPT, highlighting key papers, architectural choices, and the surprising role of OpenAI’s decoder‑only approach.

GPT-3 · Google · OpenAI

0 likes · 4 min read

Baobao Algorithm Notes

May 21, 2024 · Artificial Intelligence

How to Pre‑train a 20M‑Parameter LLaMA‑3 Mini Model with Hugging Face Trainer

This step‑by‑step guide shows how to use Hugging Face's Trainer API to pre‑train an ultra‑small LLaMA‑3 model (under 20 M parameters) on the TinyStories dataset, covering model configuration, tokenizer setup, data preprocessing, collators, training arguments, and inference results.

Hugging Face · LLaMA · Python

0 likes · 27 min read

Architect's Alchemy Furnace

Apr 27, 2024 · Artificial Intelligence

28 Powerful ChatGPT Prompt Techniques to Supercharge Your Work

This guide presents 28 practical ChatGPT prompt strategies—from role‑playing experts and setting response length to crafting resumes, weekly reports, and product PRDs—helping readers boost productivity, creativity, and learning across personal and professional tasks.

AI productivity · ChatGPT · Tips

0 likes · 33 min read

21CTO

Feb 2, 2024 · Artificial Intelligence

WeChat’s App Bloats 1400×, China’s Quantum Computer Reaches 1M Users, AI2 Releases OLMo

Recent tech headlines reveal WeChat’s iOS app ballooning to 712 MB, China’s third‑generation superconducting quantum computer “Wukong” surpassing one million remote accesses, AI2 unveiling the open‑source OLMo language model, and Google planning to retire the Bard brand in favor of Gemini, highlighting rapid shifts across mobile, quantum, and AI domains.

Google · Open-source AI · Quantum Computing

0 likes · 7 min read

Huolala Tech

Nov 23, 2023 · Artificial Intelligence

How HuoLaLa Built a Custom ASR System to Boost Accuracy and Cut Costs

This article details HuoLaLa's development of an in‑house Automatic Speech Recognition system, covering its architecture, VAD optimization, language‑model and hot‑word enhancements, punctuation restoration, task and resource scheduling, and the resulting improvements in accuracy and cost efficiency.

ASR · Speech Recognition · VAD

0 likes · 18 min read

DataFunTalk

Nov 2, 2023 · Artificial Intelligence

Enhancing Language and Vision Models with External Knowledge and Tools: OREO‑LM, REVEAL, and AVIS

This article reviews recent research on augmenting language and multimodal models with external knowledge sources and tool‑calling mechanisms, covering three systems—OREO‑LM for knowledge‑graph reasoning, REVEAL for multi‑source visual‑language pretraining, and AVIS for dynamic tool selection—and their experimental results and implications.

Knowledge Graph · Multimodal · Reasoning

0 likes · 28 min read

DataFunSummit

Oct 17, 2023 · Artificial Intelligence

Enhancing Vision and Language Models with External Knowledge Graphs and Tool Integration

This article reviews recent research on augmenting language and vision models by incorporating external knowledge sources such as knowledge graphs, multi‑source retrieval, and dynamic tool‑calling frameworks, presenting three systems—OREO‑LM, REVEAL, and AVIS—and their experimental results.

AI research · Multimodal · Reasoning

0 likes · 27 min read

DataFunSummit

Sep 22, 2023 · Artificial Intelligence

Exploring Game AI Agents: Review, LLM‑Driven Exploration, and Future Directions

This article reviews the evolution of game AI agents, examines how large language models (LLMs) can drive new AI behaviors in games, and discusses practical case studies across genres such as Werewolf‑style, war‑SLG, and MOBA games, concluding with challenges and future research directions.

AI agents · Game Development · LLM

0 likes · 31 min read

Open Source Linux

Sep 8, 2023 · Artificial Intelligence

How ChatGPT Works: Inside the Neural Network That Generates Human‑Like Text

This article explains the inner workings of ChatGPT, covering how large language models predict the next token using probability distributions, the role of embeddings, the transformer architecture with attention heads, training methods, loss functions, and why such a massive neural network can produce coherent, human‑like language.

ChatGPT · Embeddings · Machine Learning

0 likes · 79 min read

58 Tech

Jun 21, 2023 · Artificial Intelligence

GPU Hotword Enhancement for WeNet End-to-End Speech Recognition

This article explains the design, implementation, and experimental evaluation of hot‑word augmentation in WeNet's GPU runtime, detailing how character‑based and word‑based language model scoring is extended to boost recognition of rare proper nouns in both streaming and non‑streaming ASR services.

ASR · CTC decoder · GPU

0 likes · 12 min read

Full-Stack Trendsetter

May 15, 2023 · Artificial Intelligence

Do You Really Understand ChatGPT, the Era‑Defining AI?

This article explains what ChatGPT is and how it builds on natural language processing and the Transformer-based GPT series, details its model-size growth, architectural enhancements, and multilingual support, and walks through the tokenization-to-generation pipeline that enables coherent AI-driven conversations.

ChatGPT · GPT-3 · NLP

0 likes · 8 min read

21CTO

Apr 16, 2023 · Artificial Intelligence

Why ChatGPT Isn't a New Revolution: History, Tech, and Real Impact

In this talk, Wu Jun explains the decades‑long evolution of language models, why ChatGPT sparked hype yet isn’t a breakthrough, how massive compute and data power it, and what practical effects it has on creators, energy use, and the tech industry.

AI history · ChatGPT · computational cost

0 likes · 20 min read

dbaplus Community

Apr 15, 2023 · Artificial Intelligence

Why ChatGPT Isn't a New Revolution: Insights from AI Pioneer Wu Jun

In a live talk, AI veteran Wu Jun explains why the hype around ChatGPT is overblown, traces the history of language models from the 1970s, details the massive compute and data requirements, and discusses the real impact of large‑scale AI on society and work.

AI hype · ChatGPT · computational resources

0 likes · 20 min read

Programmer DD

Apr 10, 2023 · Artificial Intelligence

Why ChatGPT Sparks Panic and What Its Real Technical Foundations Are

In this talk, AI expert Wu Jun explains why ChatGPT has caused widespread fear, traces the historical development of language models from the 1970s to today, clarifies the massive computational and data requirements, and discusses the real impact and opportunities of large‑scale AI systems.

AI hype · ChatGPT · Technology History

0 likes · 20 min read

Top Architect

Mar 1, 2023 · Artificial Intelligence

Understanding the Internals of ChatGPT: Neural Networks, Embeddings, and Training Techniques

This article provides a comprehensive overview of how ChatGPT works, covering its probabilistic text generation, transformer architecture, embedding representations, neural network training processes, and the underlying principles that enable large language models to produce coherent and meaningful human-like language.

AI · ChatGPT · Embeddings

0 likes · 80 min read

IT Architects Alliance

Feb 23, 2023 · Artificial Intelligence

Training a Positive Review Generator with RLHF and PPO

This article demonstrates how to use Reinforcement Learning from Human Feedback (RLHF) with a PPO algorithm and a sentiment‑analysis model to train a language model that generates positive product reviews, covering task definition, data sampling, reward evaluation, model optimization, and experimental results.

GPT · PPO · RLHF

0 likes · 11 min read

Architect

Feb 19, 2023 · Artificial Intelligence

Training a Positive Review Generator with RLHF and PPO

This article demonstrates how to apply Reinforcement Learning from Human Feedback (RLHF) using a sentiment‑analysis model as a reward function and Proximal Policy Optimization (PPO) to fine‑tune a language model that generates positive product reviews, complete with code snippets and experimental results.

PPO · RLHF · Sentiment Analysis

0 likes · 10 min read

DataFunSummit

Feb 8, 2023 · Artificial Intelligence

Technical Architecture and Training Process of ChatGPT

ChatGPT, a dialogue-focused language model, builds on the GPT family and employs techniques such as Reinforcement Learning from Human Feedback (RLHF), the TAMER framework, and a three-stage training pipeline (supervised fine‑tuning, reward modeling, and PPO reinforcement learning) to achieve advanced conversational capabilities.

ChatGPT · GPT · RLHF

0 likes · 7 min read

MoonWebTeam

Dec 30, 2022 · Artificial Intelligence

What Makes ChatGPT So Powerful? A Deep Dive into Its Technology and Applications

ChatGPT, OpenAI’s conversational AI launched on November 30, 2022, builds on GPT‑3 and advanced training methods like supervised fine‑tuning and reinforcement learning from human feedback, offering versatile applications from search assistance to code generation, while also revealing notable limitations and future commercial prospects.

AI · Applications · ChatGPT

0 likes · 17 min read

Xiaohongshu Tech REDtech

Nov 11, 2022 · Artificial Intelligence

Language Model as a Service and Black‑Box Optimization: Insights from Prof. Qiu Xipeng’s Talk

Prof. Qiu Xipeng’s talk highlighted how large language models can be offered as a service and efficiently adapted via in‑context learning, lightweight label‑tuning, and gradient‑free black‑box optimization, showcasing a unified asymmetric Transformer (CPT) that handles understanding, generation, ABSA and NER tasks while reducing resource demands.

Black-Box Optimization · LLM · NLP

0 likes · 15 min read

DataFunTalk

Jul 30, 2021 · Artificial Intelligence

Fundamentals of Natural Language Processing: Language Models, Smoothing, and Basic Tasks

This article provides a comprehensive overview of natural language processing fundamentals, covering the challenges of language modeling, N‑gram and Markov assumptions, smoothing techniques such as discounting and add‑one, evaluation via perplexity, basic tasks like Chinese word segmentation, subword tokenization, POS tagging, syntactic and semantic parsing, and a range of downstream applications including information extraction, sentiment analysis, question answering, machine translation, and dialogue systems.

AI · NLP · Subword Tokenization

0 likes · 29 min read

58 Tech

Dec 11, 2020 · Artificial Intelligence

Weighted Finite State Transducers (WFST) in Traditional Speech Recognition: Principles and Optimization

This article explains the role of Weighted Finite State Transducers in conventional HMM‑based speech recognition, covering language models, pronunciation dictionaries, WFST definitions, semiring theory, composition and determinization operations, decoding graph construction (HCLG), lattice rescoring, and practical optimization techniques for real‑world scenarios.

ASR · Optimization · Speech Recognition

0 likes · 23 min read

Sohu Tech Products

Nov 25, 2020 · Artificial Intelligence

Illustrated Guide to GPT-2: Detailed Explanation of the Decoder‑Only Transformer Model

This article provides a comprehensive, illustrated walkthrough of OpenAI's GPT‑2 language model, covering its decoder‑only Transformer architecture, self‑attention mechanisms, token processing, training data, differences from BERT, and applications beyond language modeling, enriched with visual diagrams and code snippets for deeper understanding.

AI · GPT-2 · Self-Attention

0 likes · 24 min read

Didi Tech

Nov 5, 2020 · Artificial Intelligence

Self-Learning Platform for Speech Recognition Model Optimization at DiDi

DiDi’s self‑learning ASR platform lets non‑technical users upload business data and automatically train, test, and deploy models with semi‑supervised learning, hot‑word updates, and LSTM rescoring, creating a closed‑loop pipeline that raised vehicle voice‑interaction accuracy from around 80% to over 95% within months.

AI · Semi-supervised Learning · acoustic model

0 likes · 14 min read

58 Tech

Mar 2, 2020 · Artificial Intelligence

Low-Quality Text Detection Using Unsupervised Language Model Perplexity

This article proposes a method to identify low-quality text in business data by training a large-scale unsupervised language model to compute sentence perplexity, converting the detection problem into a threshold decision, and details model design, challenges, optimizations, and online performance results.

BERT · NLP · Perplexity

0 likes · 13 min read

DataFunTalk

Sep 3, 2019 · Artificial Intelligence

Feed‑Forward Neural Networks and Their Applications in Language Modeling, Ranking, and Recommendation

This article excerpt explains the structure and training of feed‑forward neural networks, illustrates their use in neural language models, describes deep structured semantic models for ranking tasks, and details two‑stage recommendation systems such as YouTube's, covering both theoretical formulas and practical deployment considerations.

Artificial Intelligence · forward neural network · language model

0 likes · 13 min read

WeChat Backend Team

Sep 3, 2019 · Artificial Intelligence

How Tencent Scaled Massive n‑gram Language Models for Real‑Time Speech Recognition

This article presents a distributed system that efficiently supports large‑scale n‑gram language models for automatic speech recognition by introducing caching, a two‑level distributed index, batch processing, and a cascading fault‑tolerance mechanism, demonstrating robust scalability and low communication overhead in Tencent's WeChat ASR service.

Caching · N-gram · Scaling

0 likes · 35 min read

Tencent Cloud Developer

Aug 25, 2019 · Artificial Intelligence

Understanding Intelligent Speech Recognition Technology

Intelligent speech recognition converts spoken audio to text using a pipeline of feature extraction, acoustic and language modeling, where deep neural networks—especially CNN, LSTM, and hybrid CLDNN architectures—drive high accuracy, enabling mobile voice input, call‑center transcription, legal record keeping, and Tencent Cloud ASR’s 97% Mandarin accuracy with speaker separation and on‑premises deployment.

AI · Speech Recognition · Tencent Cloud

0 likes · 7 min read

Tencent Cloud Developer

Jul 17, 2019 · Artificial Intelligence

Design and Implementation of a Multi‑Turn Conversational Chatbot

The article outlines the design and implementation of a multi‑turn conversational chatbot, detailing how natural‑language understanding converts user utterances into structured representations, a CNN‑LSTM language model classifies topics, intents, and sentiments, and an XML‑based answer engine orchestrates tasks and services for real‑world deployment.

AI · Chatbot · conversation engine

0 likes · 9 min read

58 Tech

Jun 27, 2019 · Artificial Intelligence

Spelling Correction System for 58.com Search Engine: Rule‑Based and Statistical Methods

This article describes the design and implementation of a spelling‑correction module for 58.com’s search engine, covering common query errors, rule‑based and statistical language‑model approaches, offline dictionary generation, n‑gram and Viterbi decoding, online workflow, and practical examples.

Query Processing · Viterbi algorithm · language model

0 likes · 15 min read

58 Tech

Feb 20, 2019 · Artificial Intelligence

Building and Deploying Language Models for Text Quality Evaluation and Generation

This article explains the concepts, training pipeline, deployment formats, and practical applications of language models—particularly LSTM‑based models—for evaluating and generating text quality in a real‑world rental listing platform, highlighting data preparation, model training, and online serving techniques.

Deployment · LSTM · Perplexity

0 likes · 16 min read

58 Tech

Jan 22, 2019 · Artificial Intelligence

Chinese Word Segmentation: Challenges, Methods, and Practical Practices

The article explains why Chinese word segmentation is essential for NLP tasks, outlines its fundamental difficulties such as ambiguity and out‑of‑vocabulary words, reviews dictionary‑based, statistical, and CRF approaches, and shares practical experiences from 58 Search’s production system.

CRF · NLP · chinese segmentation

0 likes · 21 min read

iQIYI Technical Product Team

Sep 14, 2018 · Artificial Intelligence

Limitations of Language Models in Voice Interaction and HomeAI Solutions

iQIYI HomeAI tackles the bottleneck of static language models in voice assistants by separating phonetic and semantic processing, correcting ASR errors at the intent‑recognition layer with pinyin‑enhanced entity correction, thereby reducing error amplification in video‑on‑demand interactions and paving the way for adaptive, personalized voice experiences.

AI · Speech Recognition · intent recognition

0 likes · 7 min read

dbaplus Community

Nov 10, 2016 · Artificial Intelligence

Demystifying Recurrent Neural Networks: Theory, Training, and Implementation

This article explains the fundamentals of recurrent neural networks (RNNs), their role in language modeling, various RNN architectures such as bidirectional and deep RNNs, the back‑propagation through time (BPTT) training algorithm, gradient challenges, vectorization techniques, and provides a step‑by‑step code implementation.

BPTT · RNN · Recurrent Neural Network

0 likes · 21 min read
