Tag: cross‑modal

DataFunSummit
Jan 20, 2024 · Artificial Intelligence

Cross‑Modal Video Open‑Tag Mining: Techniques, Methods, and Applications

The article presents a comprehensive overview of cross‑modal video open‑tag mining, detailing its technical background, related multimodal research methods, a four‑stage open‑tag solution from 360 AI Research Institute, and future application prospects such as unsupervised tag coverage, semantic retrieval, and content moderation.

cross‑modal · label extraction · multimodal AI
15 min read
360 Tech Engineering
Jul 6, 2023 · Artificial Intelligence

CSIG Enterprise Visit to Qihoo 360: Multimodal and Cross‑Modal Learning in the Era of Large Models

The CSIG‑hosted "Enterprise Visit – Into Qihoo 360" event on June 29, 2023 gathered over a thousand participants to explore multimodal and cross‑modal learning in the large‑model era, featuring keynote speeches from leading university researchers and Qihoo 360 AI experts, a tour of the company's facilities, and discussions on future AI research directions.

CSIG · Conference · Multimodal
8 min read
DataFunTalk
Sep 24, 2022 · Artificial Intelligence

Cross‑Modal Image‑Text Representation: The Zero Dataset and R2D2 Pre‑training Framework

This article introduces the importance of image‑text cross‑modal representation, presents the Chinese Zero dataset with two pre‑training subsets and five downstream tasks, describes the R2D2 dual‑tower‑plus‑single‑tower pre‑training framework with multiple loss functions, and reports extensive experiments and real‑world deployment insights.

R2D2 framework · Zero dataset · cross‑modal
19 min read
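The dual‑tower half of the framework described above can be illustrated with a minimal sketch: two separate encoders map images and text into a shared embedding space, and an InfoNCE‑style contrastive loss pulls matched pairs together. The dimensions and random linear projections below are hypothetical stand‑ins for the actual R2D2 towers, not the real model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the two towers: each projects its modality
# into a shared 64-dimensional embedding space.
W_img = rng.normal(size=(2048, 64))  # image features -> shared space
W_txt = rng.normal(size=(768, 64))   # text features  -> shared space

def encode(x, W):
    z = x @ W
    return z / np.linalg.norm(z, axis=1, keepdims=True)  # L2-normalize

# A toy batch of 4 matched image-text pairs (random features for illustration).
img = encode(rng.normal(size=(4, 2048)), W_img)
txt = encode(rng.normal(size=(4, 768)), W_txt)

# Cosine-similarity matrix; diagonal entries are the matched pairs.
sim = img @ txt.T

# InfoNCE-style contrastive loss over image-to-text similarities.
tau = 0.07  # temperature (illustrative value)
logits = sim / tau
loss = -np.mean(np.diag(logits) - np.log(np.exp(logits).sum(axis=1)))
print(sim.shape)
```

Because the two towers never attend to each other's inputs, embeddings can be precomputed and indexed, which is what makes the dual‑tower side attractive for large‑scale retrieval; the single‑tower side then reranks with full cross‑attention.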
DataFunTalk
Jun 16, 2022 · Artificial Intelligence

BigBang Transformer (BBT): A 1‑Billion‑Parameter Financial Pre‑trained Language Model with Time‑Series‑Text Cross‑Modal Architecture

The BigBang Transformer (BBT) is a 1‑billion‑parameter financial pre‑trained language model that combines text and time‑series data in a cross‑modal Transformer architecture, achieving up to 10% higher downstream accuracy than T5‑scale models and demonstrating strong performance on financial NLP tasks, time‑series forecasting, and multi‑factor investment strategies.

artificial intelligence · big data · cross‑modal
19 min read
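A text‑plus‑time‑series input of the kind the BBT summary describes can be sketched as a single joint token sequence: text token embeddings and projected time‑series patches are tagged with modality‑type embeddings (analogous to BERT's segment embeddings) and concatenated, so a Transformer's self‑attention can mix information across modalities. All dimensions below are made up for illustration; this is not the actual BBT architecture.

```python
import numpy as np

rng = np.random.default_rng(1)
d_model = 32  # shared model dimension (illustrative)

# Hypothetical inputs: 10 text token embeddings, and 6 time-series patches
# (e.g. windows of 4 price observations) projected into the model dimension.
text_tokens = rng.normal(size=(10, d_model))
ts_patches = rng.normal(size=(6, 4)) @ rng.normal(size=(4, d_model))

# Modality-type embeddings tell the model which segment each position
# belongs to, analogous to segment embeddings in BERT.
type_text = rng.normal(size=(d_model,))
type_ts = rng.normal(size=(d_model,))

# One joint sequence: self-attention over it lets textual context
# (news, reports) condition the time-series representation and vice versa.
joint = np.concatenate([text_tokens + type_text, ts_patches + type_ts])
print(joint.shape)  # (16, 32)
```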