Tagged articles

44 articles

Feb 13, 2026 · Artificial Intelligence

How Attention Mechanisms Revolutionized Computer Vision and Machine Translation

This article traces the evolution of attention mechanisms from their inaugural application in computer vision and machine translation to their central role in modern Transformer models, detailing the underlying RNN‑Attention designs, the breakthrough in sequence alignment, and the innovations that enabled high‑performance, parallelizable deep learning architectures.

Transformer · attention mechanism · computer vision

0 likes · 14 min read

HyperAI Super Neural

Jan 9, 2026 · Artificial Intelligence

How HY-MT1.5 Achieves 1 GB Mobile Translation with a 1.8B Model

The article explains how Tencent's open‑source HY‑MT1.5 tackles the high‑cost, large‑parameter barrier of neural machine translation by offering a 1.8 B‑parameter model that runs on roughly 1 GB of RAM, processes 50 tokens in 0.18 s, supports 33 languages, and uses on‑policy distillation to retain top‑tier accuracy, while providing a step‑by‑step online demo and free compute credits for new users.

HY-MT1.5 · Tencent · large language models

0 likes · 5 min read

Bilibili Tech

Oct 31, 2025 · Artificial Intelligence

RIVAL: Adversarial RL Framework Elevates Conversational Subtitle Translation

RIVAL (Reinforcement Learning with Iterative and Adversarial Optimization) introduces an adversarial game between a reward model and a translation LLM, combining qualitative preference rewards with quantitative metrics like BLEU, to overcome distribution shift in RLHF and achieve superior performance on conversational subtitle and WMT translation tasks.

BLEU · LLM · Reward Modeling

0 likes · 13 min read

Tencent Technical Engineering

Apr 16, 2025 · Artificial Intelligence

Understanding Transformer Architecture for Chinese‑English Translation: A Practical Guide

This practical guide walks through the full Transformer architecture for Chinese‑to‑English translation, detailing encoder‑decoder structure, tokenization and embeddings, batch handling with padding and masks, positional encodings, parallel teacher‑forcing, self‑ and multi‑head attention, and the complete forward and back‑propagation training steps.

Positional Encoding · PyTorch · Self-Attention

0 likes · 26 min read

vivo Internet Technology

Feb 12, 2025 · Artificial Intelligence

Bidirectional Optimization of NLLB-200 and ChatGPT for Low-Resource Language Translation

The paper proposes a bidirectional optimization framework: it fine‑tunes the NLLB‑200 low‑resource translation model with LoRA on data generated by ChatGPT, and it translates low‑resource‑language prompts with NLLB before passing them to LLMs. The approach improves multilingual translation quality but requires careful validation of the noisy synthetic data.

LLM · LoRA · NLLB

0 likes · 28 min read

Test Development Learning Exchange

Jan 13, 2025 · Artificial Intelligence

Python Tool for Converting English Videos to Chinese Dubbed Videos with Subtitles

This article provides a comprehensive guide on developing a Python tool to convert English videos into versions with Chinese dubbing and subtitles, covering all steps from audio extraction to final synthesis.

AI tools · Python · Speech Recognition

0 likes · 5 min read

Baobao Algorithm Notes

Nov 24, 2024 · Artificial Intelligence

How Marco‑o1 Merges Chain‑of‑Thought Fine‑Tuning with Monte‑Carlo Tree Search for Superior Reasoning

The article introduces Marco‑o1, an open‑source LLM that enhances complex reasoning by fine‑tuning on Chain‑of‑Thought data, integrating Monte‑Carlo Tree Search, introducing mini‑step actions and a reflection mechanism, and evaluates its performance on multilingual math and translation benchmarks.

Artificial Intelligence · LLM · Monte Carlo Tree Search

0 likes · 15 min read

System Architect Go

Oct 24, 2024 · Artificial Intelligence

How to Fine‑Tune Translation Models on Kubernetes Docs with LoRA

This article walks through the complete process of fine‑tuning both domain‑specific and large‑language translation models on Kubernetes documentation, covering data preparation, model selection, training configurations, the differences between Seq2Seq and CausalLM, and how LoRA can dramatically reduce resource usage while improving performance.

AI · LLM · LoRA

0 likes · 7 min read

Baidu Tech Salon

Jun 24, 2024 · Artificial Intelligence

Paperpolisher: AI-Powered Academic Paper Translation and Polishing Assistant

Paperpolisher is an AI-powered tool that uses Baidu's ERNIE large model and Comate to translate and polish Chinese academic papers into high-quality English. It draws on large paper datasets and retrieval augmentation, and aims to streamline code generation and improve acceptance chances for submissions to top conferences.

AI coding assistant · Baidu Comate · ERNIE large model

0 likes · 9 min read

Rare Earth Juejin Tech Community

Jun 12, 2024 · Artificial Intelligence

A Simple Introduction to the Transformer Model

This article provides a comprehensive, beginner-friendly explanation of the Transformer architecture, covering its encoder‑decoder structure, self‑attention, multi‑head attention, positional encoding, residual connections, decoding process, final linear and softmax layers, and training considerations, illustrated with numerous diagrams and code snippets.

Self-Attention · Transformer · deep learning

0 likes · 24 min read

DataFunSummit

Mar 3, 2024 · Artificial Intelligence

Instruction Fine-Tuning Practices for Huawei's Pangu Large Language Model

This presentation details the concepts, methodologies, and experimental results of instruction fine‑tuning for Huawei's Pangu large language model, covering model scale, architecture, training strategies, data quality, parallelism techniques, and case studies on Chinese‑English translation and Thai language adaptation.

Efficient Fine-Tuning · instruction fine-tuning · machine translation

0 likes · 19 min read

DataFunTalk

Sep 19, 2023 · Artificial Intelligence

Simultaneous Speech Translation: Technical Background, System Architecture, and Key Challenges

This article reviews the technical background of simultaneous speech translation, compares offline and real‑time scenarios, details ASR and MT technologies, describes the system architecture and design strategies, and discusses the major challenges and solutions for deploying robust, low‑latency translation services.

ASR · Huawei · Multimodal

0 likes · 16 min read

21CTO

Apr 27, 2023 · Artificial Intelligence

Demystifying Transformers: A Step‑by‑Step Guide to Self‑Attention and Architecture

This article explains the Transformer model—from its encoder‑decoder structure and self‑attention mechanism to multi‑head attention, positional encoding, residual connections, training loss, and inference strategies—providing a clear, visual walkthrough for readers new to modern NLP architectures.

Self-Attention · Transformer · deep learning

0 likes · 21 min read

NetEase LeiHuo Testing Center

Mar 31, 2023 · Artificial Intelligence

Comparative Evaluation of DeepL and ChatGPT Machine Translation for Game Localization

This article investigates the translation quality of DeepL and ChatGPT for the game 'Naraka: Bladepoint' by comparing their outputs against professional human translations across Chinese‑English, Chinese‑Spanish, and English‑Spanish pairs using BLEU scores and manual assessment, revealing strengths and limitations of each system.

AIGC · BLEU · ChatGPT

0 likes · 12 min read

Model Perspective

Nov 17, 2022 · Artificial Intelligence

How Mathematics Sparked the Rise of Modern Linguistics and NLP

This article traces the historical convergence of mathematics and linguistics, from 19th‑century pioneers to post‑war computer‑driven research, highlighting how statistical, probabilistic, and formal methods laid the foundation for machine translation, morphological analysis, and contemporary natural language processing.

history of linguistics · machine translation · mathematical linguistics

0 likes · 7 min read

DataFunTalk

Sep 27, 2022 · Artificial Intelligence

Contrastive Learning for Text Generation: Motivation, Methodology, Experiments, and Discussion (CoNT Framework)

This article reviews the integration of contrastive learning into text generation, explains why it helps mitigate exposure bias, introduces the CoNT framework with three key improvements, presents extensive experiments on translation, summarization, code comment and data‑to‑text tasks, and discusses practical deployment considerations.

AI · CoNT · Text Generation

0 likes · 21 min read

DataFunTalk

Jul 30, 2022 · Artificial Intelligence

Technical Analysis of Huawei’s Offline Speech‑to‑Text and Length‑Constrained Speech Translation Systems in IWSLT 2022

This article reviews the IWSLT 2022 competition tasks, explains Huawei’s cascade offline speech‑to‑text translation pipeline, details four major technical innovations (ensemble‑based ASR denoising, context‑aware re‑ranking, domain‑controlled training, and length‑control strategies), and presents experimental results that demonstrate Huawei’s leading performance across multiple language directions.

ASR · Huawei · IWSLT

0 likes · 18 min read

21CTO

Jul 9, 2022 · Artificial Intelligence

Meta Unveils NLLB-200: Open‑Source AI Model Translating 200 Languages

Meta has open‑sourced its new NLLB‑200 model, a single AI system that translates 200 languages with up to 44 % higher quality than its predecessor, supporting numerous low‑resource languages and powering billions of daily translations across Facebook and Instagram to improve user experience and content safety.

Meta · NLLB-200 · machine translation

0 likes · 3 min read

DataFunTalk

Jan 16, 2022 · Artificial Intelligence

DeltaLM: A Multilingual Pretrained Encoder‑Decoder Model for Neural Machine Translation and Zero‑Shot Transfer

DeltaLM is a new multilingual pretrained encoder‑decoder model that leverages a pretrained encoder and a novel decoder to improve multilingual neural machine translation, offering efficient training, strong cross‑language transfer, zero‑shot translation, and superior performance on various translation and summarization tasks.

DeltaLM · NMT · machine translation

0 likes · 13 min read

DataFunSummit

Nov 18, 2021 · Artificial Intelligence

Enterprise Applications and Research of Speech Translation

This article reviews recent advances in speech translation, discusses ByteDance's practical deployments, compares cascade and end‑to‑end modeling approaches, introduces improved encoder‑decoder architectures and training strategies, and reports state‑of‑the‑art results on the IWSLT 2021 benchmark.

AI · ByteDance · End-to-End

0 likes · 15 min read

DataFunTalk

Oct 5, 2021 · Artificial Intelligence

From Technology to Experience: vivo Machine Translation Deployment Practice

This article presents a comprehensive guide to deploying machine translation at vivo, covering business analysis, algorithm choices beyond standard NMT, language detection challenges, data collection and cleaning, scientific evaluation methods, and engineering optimizations to deliver a seamless user experience.

AI · Engineering · NMT

0 likes · 20 min read

Volcano Engine Developer Services

Sep 25, 2021 · Artificial Intelligence

Cutting‑Edge AI from ByteDance & OPPO: Audio, NLP, and Translation

The ByteDance Engine Developer Community Meetup featured senior engineers from ByteDance and OPPO who presented the latest advances in intelligent audio signal processing, natural language processing for recommendation, entity linking in knowledge graphs, and multimedia machine translation, highlighting practical applications and performance challenges.

Artificial Intelligence · Recommendation Systems · knowledge graph

0 likes · 4 min read

Tencent Tech

Jul 22, 2021 · Artificial Intelligence

How Tencent Dominated WMT2021: Winning Five News‑Track Translation Tasks

Tencent’s machine‑translation teams clinched five first‑place wins in the WMT2021 news track—covering Chinese‑English, Japanese‑English and English‑German limited‑resource tasks—outperforming 82 competing teams and showcasing the impact of its AI‑driven translation engine across its products.

AI competition · BLEU · Tencent

0 likes · 4 min read

iQIYI Technical Product Team

Jul 9, 2021 · Artificial Intelligence

iQIYI Multi‑Language Subtitle Machine Translation: Practice, Model Exploration, and Deployment

iQIYI’s multi‑language subtitle machine‑translation system combines a one‑to‑many transformer, context‑fusion encoding, four custom attention masks, masked language modeling, global decoding loss, reconstruction and error‑correction modules, plus pronoun, idiom and name‑handling tricks, achieving higher quality than third‑party services and even surpassing human translation for several languages.

Error Correction · One-to-Many Model · Subtitle Translation

0 likes · 17 min read

DataFunTalk

Feb 20, 2021 · Artificial Intelligence

Industrial-Scale Machine Translation at ByteDance: Applications, Demos, and Research Advances

This article presents ByteDance's industrial machine‑translation platform, describing its global deployment, diverse product demos, underlying sequence‑to‑sequence models, BERT‑enhanced training strategies, prune‑tune sparsity techniques, multilingual pre‑training, document translation, and a high‑performance inference engine.

BERT · machine translation · multilingual NLP

0 likes · 19 min read

DataFunTalk

Feb 9, 2021 · Artificial Intelligence

Multimodal AI Research: Video-Aware Dialog, Dual-Channel Reasoning, and Multimodal Machine Translation

This article surveys recent multimodal AI research, covering video scene‑aware dialog with a GPT‑2 based unified pre‑training framework, dual‑channel multi‑hop reasoning for visual dialog, capsule‑network‑enhanced multimodal machine translation, and graph‑neural‑network‑driven multimodal translation, highlighting experimental results and future directions.

Graph Neural Network · Multimodal Learning · machine translation

0 likes · 12 min read

New Oriental Technology

Feb 1, 2021 · Artificial Intelligence

Neural Machine Translation: Seq2Seq, Beam Search, BLEU, Attention Mechanisms, and GNMT Improvements

This article explains key concepts of neural machine translation, covering Seq2Seq encoder‑decoder models, beam search strategies, BLEU evaluation, various attention mechanisms, and the enhancements introduced in Google's Neural Machine Translation system to improve speed, OOV handling, and translation quality.

BLEU · Beam Search · GNMT

0 likes · 11 min read

DataFunTalk

Jan 10, 2021 · Artificial Intelligence

Didi's Machine Translation System: Architecture, Techniques, and WMT2020 Competition Experience

This article presents a comprehensive overview of Didi's machine translation platform, covering its evolution from statistical to neural models, the Transformer architecture with relative position and larger FFN, data preparation, training strategies such as back‑translation and knowledge distillation, deployment optimizations with TensorRT, and the team's successful participation in the WMT2020 news translation task.

BLEU · Knowledge Distillation · TensorRT

0 likes · 14 min read

Ctrip Technology

Nov 12, 2020 · Artificial Intelligence

Ctrip Machine Translation Platform: Architecture, Data Construction, Algorithm Design, and Performance Optimization

This article presents a comprehensive overview of Ctrip's multilingual machine translation platform, detailing demand analysis, system architecture, data pipeline, algorithmic innovations such as task‑space fusion and term‑translation interventions, as well as extensive performance optimizations for low‑resource languages.

AI · Ctrip · data pipeline

0 likes · 20 min read

Didi Tech

Oct 27, 2020 · Artificial Intelligence

Didi's Machine Translation System: Architecture, Techniques, and WMT2020 Competition Experience

Didi's machine translation system combines a Transformer‑big architecture with relative position representations, enlarged feed‑forward networks, iterative back‑translation, knowledge distillation, and domain fine‑tuning, and is optimized with TensorRT for speed. It achieved a BLEU score of 36.6 and third place in the WMT2020 Chinese‑to‑English news task.

BLEU · Knowledge Distillation · TensorRT

0 likes · 15 min read

DataFunTalk

May 6, 2020 · Artificial Intelligence

Application of Large-Scale Pretrained Models in Alibaba Machine Translation

This article reviews how large‑scale pretrained language models have reshaped NLP, outlines the challenges of applying them to machine translation, introduces the APT framework and the GRET architecture for better encoder‑decoder integration, and reports experimental gains and future research directions.

AI · APT framework · GRET

0 likes · 10 min read

21CTO

Apr 20, 2020 · Artificial Intelligence

Why DeepL’s Neural Translation Beats Google: Inside the AI Engine

This article examines DeepL’s translation system, comparing its neural‑network‑driven output to Google and other services, detailing its Icelandic HPC infrastructure, data collection, architectural choices, language support, strengths, limitations, and expert opinions on why it often delivers more natural translations.

AI · HPC · comparison

0 likes · 9 min read

DataFunTalk

Apr 10, 2020 · Artificial Intelligence

Improving Machine Translation: Addressing Exposure Bias, Efficient Decoding, and Non‑Autoregressive Models

This article reviews recent research on machine translation that tackles the training‑inference distribution gap, exposure bias, and slow autoregressive decoding by introducing scheduled sampling, differentiable sequence‑level losses, cube‑pruning, and sequence‑aware non‑autoregressive decoding, showing BLEU gains and significant speedups.

BLEU · NLP · cube pruning

0 likes · 16 min read

Qunar Tech Salon

Sep 12, 2019 · Artificial Intelligence

A Comprehensive Overview of Attention Mechanisms in Deep Learning

This article systematically reviews the history, core concepts, variants, and practical implementations of attention mechanisms—from early additive and multiplicative forms to self‑attention, multi‑head attention, and recent transformer‑based models—highlighting why attention has become fundamental in modern AI research.

NLP · Self-Attention · Transformer

0 likes · 16 min read

Liulishuo Tech Team

May 24, 2019 · Artificial Intelligence

Grammatical Error Correction (GEC): Definition, Challenges, Evaluation, and Solutions

This article introduces Grammatical Error Correction (GEC), explains its main error categories, outlines four key challenges, reviews evaluation metrics and the evolution of NLP approaches, and showcases practical solutions and product applications developed by Liulishuo.

Grammatical Error Correction · NLP · deep learning

0 likes · 11 min read

Ctrip Technology

May 21, 2019 · Artificial Intelligence

A Brief Overview of Machine Translation: History, Neural Models, and Practical Insights

This article surveys the evolution of machine translation from early rule‑based systems to modern neural architectures, explains how translation engines are trained, highlights recent advances such as attention and Transformers, and shares practical experience and current challenges in the field.

Artificial Intelligence · Transformer · attention mechanism

0 likes · 11 min read

ITPUB

Mar 6, 2019 · Artificial Intelligence

Why WeChat’s Translation Glitches Reveal Hidden AI Challenges

A recent WeChat translation bug that turned a name into bizarre Chinese phrases sparked a deep dive into neural machine translation, exposing algorithmic shortcomings, training‑data biases, and the broader uncertainties that affect modern AI‑driven translators.

AI · NMT · machine translation

0 likes · 10 min read

DataFunTalk

Mar 6, 2019 · Artificial Intelligence

Baidu Chinese Text Correction Technology Overview

This article presents a comprehensive overview of Baidu's Chinese text correction technology, covering its background, error types, system architecture, key detection, candidate recall and ranking methods, core language and knowledge techniques, and real-world applications in open-domain and scenario-specific contexts.

Baidu · machine translation · text correction

0 likes · 13 min read

DataFunTalk

Feb 27, 2019 · Artificial Intelligence

Human‑Interactive Machine Translation: Research, Techniques, and Productization

This article reviews the current state of machine translation, explores the challenges of ambiguity, quality, and domain specificity, and presents human‑in‑the‑loop translation techniques—including attention‑enhanced models, transformer architectures, and online learning—while discussing practical productization and deployment considerations.

AI productization · Human-in-the-Loop · Online Learning

0 likes · 16 min read

Hulu Beijing

Sep 27, 2018 · Artificial Intelligence

From Rules to Neural Networks: The Evolution of Machine Translation

This article traces the history of machine translation—from early rule‑based systems through statistical models that leveraged parallel corpora, to modern neural network approaches—while highlighting current applications, challenges, and future directions in the field.

AI applications · machine translation · natural language processing

0 likes · 9 min read

AntTech

Aug 1, 2018 · Artificial Intelligence

Highlights and Paper Summaries from ACL 2018 Conference

An extensive overview of ACL 2018, featuring acceptance statistics, award-winning papers, tutorial insights, and concise summaries of notable research across machine translation, semantic parsing, question answering, domain adaptation, text classification, summarization, dialogue systems, generation, and related tools.

ACL 2018 · Dialogue Systems · NLP

0 likes · 12 min read

Hulu Beijing

Dec 20, 2017 · Artificial Intelligence

How Attention Mechanisms Transform Seq2Seq Models for Better Translation

This article explains why attention mechanisms were introduced into Seq2Seq models, how they address the limitations of fixed‑length encoding, the role of bidirectional RNNs, and showcases their impact on machine translation and image captioning with illustrative diagrams.

RNN · Seq2Seq · attention mechanism

0 likes · 10 min read

Alibaba Cloud Developer

Sep 20, 2016 · Artificial Intelligence

What ACL 2016 Tutorials Reveal About the Future of NLP and Deep Learning

The article reviews ACL 2016’s tutorial program, summarizing key talks on computer‑aided translation, neural machine translation, semantic sense representation, short‑text understanding, and highlights selected papers on multimodal translation, coverage modeling, and language‑vision grounding, illustrating deep learning’s impact on NLP research.

ACL 2016 · NLP · deep learning

0 likes · 13 min read

Architects Research Society

Oct 4, 2015 · Artificial Intelligence

Bayesian Thinking on Your Feet: Embedding Generative Models in Reinforcement Learning for Sequentially Revealed Data

This NSF‑funded project aims to develop algorithms that incrementally process partially observed data, integrating generative models with reinforcement‑learning policies to decide when to act, applied to simultaneous machine translation and quiz‑bowl style question answering.

Generative Models · bayesian inference · machine translation

0 likes · 4 min read
