DataFunTalk
Feb 9, 2021 · Artificial Intelligence
Multimodal AI Research: Video-Aware Dialog, Dual-Channel Reasoning, and Multimodal Machine Translation
This article surveys recent multimodal AI research, covering video scene‑aware dialog with a GPT‑2 based unified pre‑training framework, dual‑channel multi‑hop reasoning for visual dialog, capsule‑network‑enhanced multimodal machine translation, and graph‑neural‑network‑driven multimodal translation, highlighting experimental results and future directions.
graph neural networkmachine translationmultimodal AI
0 likes · 12 min read