Innovative Multimodal Architectures: IAA for Extending Language Models and BDM for Chinese-Native AI Painting
The article introduces two 360 AI Research Institute projects—IAA, an architecture that equips frozen language models with multimodal capabilities via plug‑in layers, and BDM, a Chinese‑native diffusion model compatible with the Stable Diffusion ecosystem—detailing their motivations, designs, benchmark results, and open‑source resources.
360 Group's AI Research Institute has had two papers accepted at AAAI: IAA (Inner-Adaptor Architecture) for multimodal understanding and BDM (Bridge Diffusion Model) for multimodal generation.
IAA – Enabling Multimodal Ability in Language Models
IAA addresses two key challenges: (1) catastrophic forgetting, where the language model embedded in a multimodal model loses text ability once it is trained jointly on multimodal data, and (2) the lack of a plug-in ecosystem for language models. IAA keeps the base language model's parameters frozen and inserts multiple adaptor layers at various depths, so the model acquires multimodal knowledge without losing its original textual competence. Because the base weights never change, a single set of weights can serve both text-only and multimodal tasks, which reduces deployment cost and leaves room for further extensions such as code or math plug-ins. The benchmark comparisons reported for IAA show that it maintains text performance while improving multimodal metrics.
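The frozen-base, plug-in idea can be illustrated with a toy sketch. This is not the IAA implementation: the class names `FrozenLayer`, `AdaptorLayer`, and `ToyIAA`, the scalar "weights", the zero initialisation, and the choice to skip adaptors on the text-only path are all assumptions made for illustration. It only shows why updating adaptor parameters cannot change the text-only behaviour of a frozen base model.

```python
from dataclasses import dataclass, field
from typing import Dict, List

Vector = List[float]


@dataclass(frozen=True)
class FrozenLayer:
    """Hypothetical stand-in for one pretrained layer; immutable by construction."""
    scale: float

    def __call__(self, x: Vector) -> Vector:
        return [self.scale * v + 1.0 for v in x]


@dataclass
class AdaptorLayer:
    """Hypothetical trainable plug-in layer; weight 0.0 makes it an identity."""
    weight: float = 0.0

    def __call__(self, x: Vector) -> Vector:
        return [v + self.weight * v for v in x]


@dataclass
class ToyIAA:
    base: List[FrozenLayer]
    # Maps a depth in the base stack to the adaptor inserted before that layer.
    adaptors: Dict[int, AdaptorLayer] = field(default_factory=dict)

    def forward(self, x: Vector, multimodal: bool) -> Vector:
        for depth, layer in enumerate(self.base):
            # Assumption: adaptors are used only on the multimodal path.
            if multimodal and depth in self.adaptors:
                x = self.adaptors[depth](x)
            x = layer(x)
        return x


model = ToyIAA(
    base=[FrozenLayer(0.5), FrozenLayer(2.0), FrozenLayer(1.5)],
    adaptors={0: AdaptorLayer(), 2: AdaptorLayer()},
)

text_before = model.forward([1.0, 2.0], multimodal=False)

# Simulated "training": only adaptor parameters are updated.
for adaptor in model.adaptors.values():
    adaptor.weight += 0.1

text_after = model.forward([1.0, 2.0], multimodal=False)
multimodal_after = model.forward([1.0, 2.0], multimodal=True)
```

In this toy, `text_before` and `text_after` are identical because nothing on the text-only path is trainable, while `multimodal_after` differs because it passes through the updated adaptors. Both paths share the same base layers, which is the sense in which one set of weights serves both kinds of task.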
BDM – Chinese-Native AI Painting Compatible with the Stable Diffusion Ecosystem
BDM tackles (1) the English-centric bias of existing diffusion models, which hampers accurate image generation from Chinese prompts, and (2) the need to stay compatible with the extensive Stable Diffusion community. BDM adopts a ControlNet-like branch network (x-language branches) that learns from language-specific data, enabling native Chinese generation and straightforward extension to other languages. Trained on a billion-scale Chinese image-text dataset, BDM remains compatible with SD-1.5 models while preserving Chinese cultural fidelity. The published visual examples show BDM combined with various SD-1.5 fine-tuned style models.
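The backbone-plus-branch arrangement can be sketched in the same toy style. This is only an illustration under stated assumptions: `frozen_backbone`, `LanguageBranch`, `bridged_step`, the arithmetic inside them, and the zero-initialised gate (borrowed from how ControlNet connects its branch) are hypothetical, and the source does not say how BDM initialises or wires its branch.

```python
from typing import List, Sequence


def frozen_backbone(latent: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a frozen SD-1.5 block; never trained here."""
    return [0.9 * v for v in latent]


class LanguageBranch:
    """Hypothetical trainable branch conditioned on a language-specific prompt."""

    def __init__(self, dim: int) -> None:
        self.weight = [0.5] * dim  # trainable branch parameters
        self.gate = 0.0            # assumption: zero-initialised, ControlNet-style

    def __call__(self, latent: Sequence[float],
                 prompt_embedding: Sequence[float]) -> List[float]:
        return [
            self.gate * w * (v + p)
            for w, v, p in zip(self.weight, latent, prompt_embedding)
        ]


def bridged_step(latent: Sequence[float],
                 prompt_embedding: Sequence[float],
                 branch: LanguageBranch) -> List[float]:
    """Add the branch output as a residual on top of the frozen backbone."""
    base = frozen_backbone(latent)
    residual = branch(latent, prompt_embedding)
    return [b + r for b, r in zip(base, residual)]


latent = [1.0, -2.0]
chinese_prompt_embedding = [0.5, 0.5]  # placeholder values, not a real encoder output
branch = LanguageBranch(dim=2)

untrained = bridged_step(latent, chinese_prompt_embedding, branch)

branch.gate = 1.0  # simulated result of training the branch
trained = bridged_step(latent, chinese_prompt_embedding, branch)
```

With the gate at zero, `untrained` equals the backbone output exactly, so the branch starts out invisible to the frozen model. Only after the branch is trained does `trained` diverge. Because the backbone itself is never modified, it can in principle be swapped for another model with the same interface, which is the property the compatibility claim relies on.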
Both projects are fully open-sourced (IAA code at https://github.com/360CVGroup/Inner-Adaptor-Architecture, BDM code at https://github.com/360CVGroup/Bridge_Diffusion_Model), and the institute invites collaboration via its GitHub organization, https://github.com/360CVGroup.
360 Tech Engineering