Tag: cross‑modal

DataFunSummit
Jan 20, 2024 · Artificial Intelligence

Cross‑Modal Video Open‑Tag Mining: Techniques, Methods, and Applications

The article presents a comprehensive overview of cross‑modal video open‑tag mining, detailing its technical background, related multimodal research methods, a four‑stage open‑tag solution from 360 AI Research Institute, and future application prospects such as unsupervised tag coverage, semantic retrieval, and content moderation.

cross‑modal · label extraction · multimodal AI
15 min read
360 Tech Engineering
Jul 6, 2023 · Artificial Intelligence

CSIG Enterprise Visit to Qihoo 360: Multimodal and Cross‑Modal Learning in the Era of Large Models

The CSIG‑hosted "Enterprise Visit – Into Qihoo 360" event on June 29, 2023 gathered over a thousand participants to explore multimodal and cross‑modal learning in the large‑model era, featuring keynote speeches from leading university researchers and Qihoo 360 AI experts, a tour of the company's facilities, and discussions on future AI research directions.

CSIG · Conference · Multimodal
8 min read
DataFunTalk
Sep 24, 2022 · Artificial Intelligence

Cross‑Modal Image‑Text Representation: The Zero Dataset and R2D2 Pre‑training Framework

This article introduces the importance of image‑text cross‑modal representation, presents the Chinese Zero dataset with two pre‑training subsets and five downstream tasks, describes the R2D2 dual‑tower‑plus‑single‑tower pre‑training framework with multiple loss functions, and reports extensive experiments and real‑world deployment insights.

R2D2 framework · Zero dataset · cross‑modal
19 min read
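The dual‑tower half of the framework described above can be illustrated with a minimal sketch: two separate encoders map images and text into a shared embedding space, and an InfoNCE‑style contrastive loss pulls matched pairs together. The dimensions and random linear projections below are hypothetical stand‑ins for the actual R2D2 towers, not the real model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the two towers: each projects its modality
# into a shared 64-dimensional embedding space.
W_img = rng.normal(size=(2048, 64))  # image features -> shared space
W_txt = rng.normal(size=(768, 64))   # text features  -> shared space

def encode(x, W):
    z = x @ W
    return z / np.linalg.norm(z, axis=1, keepdims=True)  # L2-normalize

# A toy batch of 4 matched image-text pairs (random features for illustration).
img = encode(rng.normal(size=(4, 2048)), W_img)
txt = encode(rng.normal(size=(4, 768)), W_txt)

# Cosine-similarity matrix; diagonal entries are the matched pairs.
sim = img @ txt.T

# InfoNCE-style contrastive loss over image-to-text similarities.
tau = 0.07  # temperature (illustrative value)
logits = sim / tau
loss = -np.mean(np.diag(logits) - np.log(np.exp(logits).sum(axis=1)))
print(sim.shape)
```

Because the two towers never attend to each other's inputs, embeddings can be precomputed and indexed, which is what makes the dual‑tower side attractive for large‑scale retrieval; the single‑tower side then reranks with full cross‑attention.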
DataFunTalk
Jun 16, 2022 · Artificial Intelligence

BigBang Transformer (BBT): A 1‑Billion‑Parameter Financial Pre‑trained Language Model with Time‑Series‑Text Cross‑Modal Architecture

The BigBang Transformer (BBT) is a 1‑billion‑parameter financial pre‑trained language model that combines text and time‑series data in a cross‑modal Transformer architecture, achieving up to 10% higher downstream accuracy than T5‑scale models and demonstrating strong performance on financial NLP tasks, time‑series forecasting, and multi‑factor investment strategies.

artificial intelligence · big data · cross‑modal
19 min read
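A text‑plus‑time‑series input of the kind the BBT summary describes can be sketched as a single joint token sequence: text token embeddings and projected time‑series patches are tagged with modality‑type embeddings (analogous to BERT's segment embeddings) and concatenated, so a Transformer's self‑attention can mix information across modalities. All dimensions below are made up for illustration; this is not the actual BBT architecture.

```python
import numpy as np

rng = np.random.default_rng(1)
d_model = 32  # shared model dimension (illustrative)

# Hypothetical inputs: 10 text token embeddings, and 6 time-series patches
# (e.g. windows of 4 price observations) projected into the model dimension.
text_tokens = rng.normal(size=(10, d_model))
ts_patches = rng.normal(size=(6, 4)) @ rng.normal(size=(4, d_model))

# Modality-type embeddings tell the model which segment each position
# belongs to, analogous to segment embeddings in BERT.
type_text = rng.normal(size=(d_model,))
type_ts = rng.normal(size=(d_model,))

# One joint sequence: self-attention over it lets textual context
# (news, reports) condition the time-series representation and vice versa.
joint = np.concatenate([text_tokens + type_text, ts_patches + type_ts])
print(joint.shape)  # (16, 32)
```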