
Innovative Multimodal Architectures: IAA for Extending Language Models and BDM for Chinese-Native AI Painting

The article introduces two 360 AI Research Institute projects—IAA, an architecture that equips frozen language models with multimodal capabilities via plug‑in layers, and BDM, a Chinese‑native diffusion model compatible with the Stable Diffusion ecosystem—detailing their motivations, designs, benchmark results, and open‑source resources.

360 Tech Engineering

Amid the wave of AI technology transformation, 360 Group’s AI Research Institute has released two papers accepted at AAAI: IAA (Inner‑Adaptor Architecture) for multimodal understanding and BDM (Bridge Diffusion Model) for multimodal generation.

IAA – Enabling Multimodal Ability in Language Models addresses two key challenges: (1) catastrophic forgetting, where joint training on multimodal data erodes the embedded language model's original text ability, and (2) the lack of a plug‑in ecosystem for language models. IAA keeps the base language model's parameters frozen and inserts multiple adaptor layers at different depths, so the model acquires multimodal knowledge without losing its textual competence. This plug‑in design also lets a single set of weights serve both text‑only and multimodal tasks, reducing deployment cost and supporting extensions such as code or math plugins. Benchmark comparisons show that IAA preserves text performance while improving multimodal metrics.
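The frozen-backbone-plus-adaptor idea can be sketched in a few lines of PyTorch. This is a minimal conceptual sketch, not the actual IAA implementation (see the project's GitHub repository for that); the class names, the residual adaptor form, and the depth-selection scheme here are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class AdaptorLayer(nn.Module):
    """Trainable block inserted at a chosen depth of a frozen LM (illustrative)."""
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, hidden):
        # Residual form: the frozen model's signal always passes through.
        return hidden + self.proj(hidden)

class FrozenLMWithAdaptors(nn.Module):
    """A stack of frozen transformer-like blocks with trainable adaptors
    interleaved at selected depths. Only the adaptors receive gradients."""
    def __init__(self, blocks: nn.ModuleList, adaptor_depths, dim: int):
        super().__init__()
        self.blocks = blocks
        for p in self.blocks.parameters():
            p.requires_grad = False  # freeze: original text competence is untouched
        self.adaptors = nn.ModuleDict(
            {str(d): AdaptorLayer(dim) for d in adaptor_depths}
        )

    def forward(self, hidden, multimodal: bool = True):
        for i, block in enumerate(self.blocks):
            hidden = block(hidden)
            # Adaptors fire only on the multimodal path; with multimodal=False
            # the forward pass is exactly the original frozen model, which is
            # how one set of weights can serve both text-only and multimodal use.
            if multimodal and str(i) in self.adaptors:
                hidden = self.adaptors[str(i)](hidden)
        return hidden
```

Because the text-only path bypasses the adaptors entirely, deploying both modes requires only loading the adaptor weights alongside the shared frozen base.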

BDM – Chinese‑Native AI Painting Compatible with the Stable Diffusion Ecosystem tackles (1) the English‑centric bias of existing diffusion models, which hampers accurate generation of Chinese concepts, and (2) the need for compatibility with the extensive Stable Diffusion community. BDM adopts a ControlNet‑like branching network (x‑language branches) trained on language‑specific data, enabling native Chinese generation and easy extension to other languages. Trained on a billion‑scale Chinese image‑text dataset, BDM remains compatible with SD‑1.5 models while preserving Chinese cultural fidelity; visual examples demonstrate strong alignment with various SD‑1.5 fine‑tuned styles.
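A ControlNet-like branch, reduced to its essentials, is a trainable side network whose outputs are injected as residuals into a frozen backbone through zero-initialized projections, so the pretrained model's behavior is unchanged at the start of training and the backbone remains swappable. The sketch below illustrates that mechanism only; the class names and toy block structure are assumptions, not BDM's actual architecture (see the project's GitHub repository).

```python
import torch
import torch.nn as nn

class ZeroLinear(nn.Linear):
    """Zero-initialized projection: the branch starts as a no-op, so the
    pretrained backbone's output is exactly preserved before training."""
    def __init__(self, dim: int):
        super().__init__(dim, dim)
        nn.init.zeros_(self.weight)
        nn.init.zeros_(self.bias)

class LanguageBranch(nn.Module):
    """Trainable branch driven by language-specific (e.g. Chinese) text
    features; emits one residual per backbone block (illustrative)."""
    def __init__(self, n_blocks: int, dim: int):
        super().__init__()
        self.blocks = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_blocks))
        self.zero_proj = nn.ModuleList(ZeroLinear(dim) for _ in range(n_blocks))

    def forward(self, text_feat):
        residuals, h = [], text_feat
        for block, proj in zip(self.blocks, self.zero_proj):
            h = torch.relu(block(h))
            residuals.append(proj(h))
        return residuals

def denoise(backbone_blocks, h, residuals):
    """Frozen backbone consumes the branch's residuals block by block.
    Swapping backbone_blocks for any compatible fine-tune keeps the
    trained branch usable — the source of SD-1.5 ecosystem compatibility."""
    for block, res in zip(backbone_blocks, residuals):
        h = block(h) + res
    return h
```

Because only the branch is trained, the same mechanism extends to other languages by training additional branches against the same frozen backbone.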

Both projects are fully open‑sourced (IAA code at https://github.com/360CVGroup/Inner-Adaptor-Architecture, BDM code at https://github.com/360CVGroup/Bridge_Diffusion_Model), and the institute invites collaboration via its GitHub organization: https://github.com/360CVGroup

multimodal AI, plugin architecture, Stable Diffusion, language model, Chinese AI painting, open source research
Written by

360 Tech Engineering

Official tech channel of 360, building the most professional technology aggregation platform for the brand.
