Tag

visual dialog

1 view collected around this technical thread.

DataFunTalk
Feb 9, 2021 · Artificial Intelligence

Multimodal AI Research: Video-Aware Dialog, Dual-Channel Reasoning, and Multimodal Machine Translation

This article surveys recent multimodal AI research, covering video scene‑aware dialog with a GPT‑2‑based unified pre‑training framework, dual‑channel multi‑hop reasoning for visual dialog, capsule‑network‑enhanced multimodal machine translation, and graph‑neural‑network‑driven multimodal translation, and highlights experimental results and future directions.

graph neural network · machine translation · multimodal AI
0 likes · 12 min read