
LMNet: Enabling Language Models to Self‑Organize into Networks

The paper introduces Language Model Networks (LMNet), a framework that lets pretrained large language models act as reusable compute nodes communicating via dense, trainable vectors, showing measurable performance gains on general and supervised adaptation tasks with minimal extra training cost.

Machine Heart

May 31, 2026

From Bigger Models to Collaborative Systems

In recent years, progress in large language models has come mainly from scaling: more parameters, more data, longer context windows, and heavier training. This has produced capability jumps and widespread deployment. However, as tasks become more complex and call for division of labor, a single monolithic model runs into limits, because it must handle planning, reasoning, retrieval, verification, tool use, and generation all at once.

LMNet proposes viewing pretrained language models not as isolated predictors but as reusable compute nodes whose connections, communication, and cooperation become a source of intelligence. In other words, AI ability stems not only from how strong a model is, but also from how the models are organized.

Why Natural‑Language Interaction Is Insufficient

Current multi‑model systems often have one model generate text that another reads and continues. This approach is simple and human‑readable, but natural language is a discrete, symbolic medium: each exchange converts internal representations to text and back, which can lose information and breaks gradient flow, hampering end‑to‑end optimization.

The key challenge is not merely prompt engineering but making the communication itself a learnable object.
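The gradient‑flow point can be illustrated with a toy example. The sketch below is not LMNet code: the receiver weights, the two channel functions, and the finite‑difference probe are all hypothetical stand‑ins. It compares a dense channel, which passes a continuous distribution, with a text‑like channel, which commits to one token before passing anything on.

```python
import math

def softmax(logits):
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical receiver: scores an incoming message with fixed weights.
RECEIVER_WEIGHTS = [0.5, -1.0, 2.0]

def receive(message):
    return sum(w * m for w, m in zip(RECEIVER_WEIGHTS, message))

def send_dense(logits):
    """Pass the sender's continuous distribution straight through."""
    return softmax(logits)

def send_text(logits):
    """Commit to one token (argmax) and pass it as a one-hot vector."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return [1.0 if i == best else 0.0 for i in range(len(logits))]

def sensitivity(channel, logits, index=0, eps=1e-4):
    """Central finite-difference estimate of d(receiver output)/d(logits[index])."""
    up = list(logits)
    up[index] += eps
    down = list(logits)
    down[index] -= eps
    return (receive(channel(up)) - receive(channel(down))) / (2 * eps)

sender_logits = [1.0, 0.2, -0.5]
dense_grad = sensitivity(send_dense, sender_logits)
text_grad = sensitivity(send_text, sender_logits)
print(f"dense channel: {dense_grad:.4f}, text-like channel: {text_grad:.4f}")
```

A small change in the sender's logits shifts the dense message, so the receiver's output responds. The same change leaves the chosen token untouched, so the text‑like channel carries no local training signal back to the sender.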

LMNet: Building a “Model‑Level Neural Network” on Top of LLMs

LMNet treats each pretrained language model as a reusable node and introduces trainable communication modules (e.g., attention blocks) as edges, forming a neural network of models. The outermost interface remains natural‑language input and output, but intermediate nodes exchange dense continuous vectors directly, bypassing repeated text generation and comprehension.

This design lets the system automatically learn what information to pass between nodes under supervision, without hand‑crafted prompts or fixed role assignments.
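As a rough picture of the structure described above, the following sketch wires stand‑in nodes together with attention‑style edges. Everything here is an assumption made for illustration: `Node` is a tiny tanh layer rather than a real pretrained LLM, `AttentionEdge` uses one learned query vector per receiving node, and the `[1, 3, 1]` layout and hidden size are arbitrary. The paper's actual modules may differ.

```python
import math
import random

DIM = 4  # hypothetical hidden size, kept tiny for readability

def matvec(matrix, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]

def random_matrix(rng, rows, cols, scale=0.5):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

class Node:
    """Stand-in for a pretrained LLM: maps one vector to a hidden vector."""
    def __init__(self, rng):
        self.weight = random_matrix(rng, DIM, DIM)

    def __call__(self, vec):
        return [math.tanh(v) for v in matvec(self.weight, vec)]

class AttentionEdge:
    """Communication module: a receiver attends over all sender states."""
    def __init__(self, rng):
        self.query = [rng.uniform(-0.5, 0.5) for _ in range(DIM)]
        self.w_key = random_matrix(rng, DIM, DIM)
        self.w_value = random_matrix(rng, DIM, DIM)

    def __call__(self, sender_states):
        keys = [matvec(self.w_key, s) for s in sender_states]
        values = [matvec(self.w_value, s) for s in sender_states]
        scores = [sum(q * k for q, k in zip(self.query, key)) / math.sqrt(DIM)
                  for key in keys]
        weights = softmax(scores)
        # Weighted sum of value vectors: the dense message for this receiver.
        return [sum(w * value[i] for w, value in zip(weights, values))
                for i in range(DIM)]

def forward(layer_widths, shared_node, edges, input_vec):
    # The first layer reads the (already embedded) input directly.
    states = [shared_node(input_vec) for _ in range(layer_widths[0])]
    for layer_edges in edges:
        messages = [edge(states) for edge in layer_edges]
        states = [shared_node(message) for message in messages]
    return states

rng = random.Random(0)
layer_widths = [1, 3, 1]
shared_node = Node(rng)  # one set of node weights reused at every position
edges = [[AttentionEdge(rng) for _ in range(width)] for width in layer_widths[1:]]
output_states = forward(layer_widths, shared_node, edges, [1.0, 0.0, -1.0, 0.5])
```

The input vector stands in for an embedded prompt and the final state for what would be decoded back to text. Only vectors travel between nodes in the middle of the network.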

Learning Communication End‑to‑End

Because communication is parameterized and differentiable, LMNet can adjust the flow of information between nodes via gradient descent driven by the final task’s supervision signal. The system learns “who should send what to whom” without explicit annotations.

Thus, LMNet shifts AI system design from prompting a single model to organizing a network of models that can self‑configure their communication.
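A toy version of learning "who should send what to whom" might look like the sketch below. It is an illustration under stated assumptions, not the paper's training procedure: the two senders, the softmax gate, the data, and the learning rate are all made up, and gradients are estimated by finite differences where a real system would use backpropagation. Only the final prediction is supervised, yet the gate learns which sender to listen to.

```python
import math

def softmax(logits):
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sender_outputs(x):
    """Two frozen senders: sender 0 carries the useful signal, sender 1 a distractor."""
    return [2.0 * x, -x]

def predict(gate_logits, x):
    weights = softmax(gate_logits)  # how much the receiver listens to each sender
    return sum(w * s for w, s in zip(weights, sender_outputs(x)))

# Supervision exists only on the final output; nothing labels the gate itself.
DATA = [(x, 2.0 * x) for x in (-1.0, -0.5, 0.5, 1.0)]

def loss(gate_logits):
    return sum((predict(gate_logits, x) - y) ** 2 for x, y in DATA) / len(DATA)

def numeric_grad(fn, params, eps=1e-5):
    grads = []
    for i in range(len(params)):
        up = list(params)
        up[i] += eps
        down = list(params)
        down[i] -= eps
        grads.append((fn(up) - fn(down)) / (2 * eps))
    return grads

gate_logits = [0.0, 0.0]  # start by listening to both senders equally
initial_loss = loss(gate_logits)
for _ in range(200):
    grads = numeric_grad(loss, gate_logits)
    gate_logits = [p - 0.5 * g for p, g in zip(gate_logits, grads)]

final_weights = softmax(gate_logits)
final_loss = loss(gate_logits)
```

After training, nearly all of the gate weight sits on sender 0, even though no annotation ever said that sender 0 was the informative one.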

Experimental Results: Small Extra Cost, Noticeable Gains

Using Qwen2.5‑0.5B as the base node, the authors built a 1‑4‑4‑4‑1 layered topology (five node layers joined by four communication layers, 14 shared‑parameter nodes in total) with about 1.14 B parameters (LMNet‑1B). With fewer than 0.1 T additional training tokens, described as only 0.2 % of the base model’s pre‑training cost, LMNet achieved clear improvements across several general tasks (see Figure 3).
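The topology figures can be checked with a few lines of arithmetic. The sketch below assumes the network has five node layers of widths 1, 4, 4, 4, 1, which is the reading consistent with the stated 14 nodes and four communication layers. The fully connected link count and the nominal 0.5 B node size (taken from the model name, not an exact parameter count) are further assumptions for illustration.

```python
# Assumed reading of the topology: five node layers of these widths.
layer_widths = [1, 4, 4, 4, 1]

node_count = sum(layer_widths)
communication_layers = len(layer_widths) - 1

# Assumes every node in one layer can send to every node in the next layer.
dense_links = sum(a * b for a, b in zip(layer_widths, layer_widths[1:]))

# Nominal node size from the model name, not an exact parameter count.
NOMINAL_NODE_PARAMS = 0.5e9

# Without weight sharing, every node would carry its own copy of the LLM.
unshared_node_params = node_count * NOMINAL_NODE_PARAMS
# With sharing, the node weights are counted once, whatever the node count.
shared_node_params = NOMINAL_NODE_PARAMS

print(node_count, communication_layers, dense_links)
print(f"{unshared_node_params / 1e9:.1f} B unshared vs "
      f"{shared_node_params / 1e9:.1f} B shared node parameters")
```

Parameter sharing is what keeps the reported total near 1.14 B: the node weights are counted once, and the remainder belongs to the communication modules.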

Compared against test‑time scaling methods at similar inference cost, LMNet still showed a performance edge (Figure 4).

In limited‑supervision adaptation, smaller LMNets were trained with the LLM node parameters frozen and only the communication edges updated, to avoid over‑fitting. LMNet consistently outperformed standard fine‑tuning and parameter‑efficient fine‑tuning (PEFT) methods on benchmarks such as MMLU and E2E (Figures 5‑6).

These results suggest that learnable inter‑model communication can be an effective route to boosting system capability.
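The frozen‑node setup used in the adaptation experiments can be sketched in miniature. This is a hypothetical toy, not the authors' code: a single scalar "node" weight stands in for the pretrained LLM, a scale and a shift stand in for the communication edge, the parameter names and data are invented, and gradients come from finite differences where a real system would use backpropagation.

```python
import math

# Parameter table with a per-parameter trainable flag (all names hypothetical).
params = {
    "node.weight": {"value": 1.5, "trainable": False},  # frozen pretrained node
    "edge.scale": {"value": 0.0, "trainable": True},    # communication edge
    "edge.shift": {"value": 0.0, "trainable": True},
}

def forward(values, x):
    hidden = math.tanh(values["node.weight"] * x)  # frozen node computation
    return values["edge.scale"] * hidden + values["edge.shift"]

def loss(values, data):
    return sum((forward(values, x) - y) ** 2 for x, y in data) / len(data)

def current_values():
    return {name: p["value"] for name, p in params.items()}

def train_step(data, lr=0.1, eps=1e-5):
    base = current_values()
    grads = {}
    for name, p in params.items():
        if not p["trainable"]:
            continue  # frozen parameters get no gradient and no update
        up = dict(base)
        up[name] += eps
        down = dict(base)
        down[name] -= eps
        grads[name] = (loss(up, data) - loss(down, data)) / (2 * eps)
    for name, g in grads.items():
        params[name]["value"] -= lr * g

# A small supervised set, standing in for limited task data.
data = [(x, 3.0 * math.tanh(1.5 * x) + 1.0) for x in (-1.0, -0.3, 0.4, 0.9)]
initial_loss = loss(current_values(), data)
for _ in range(500):
    train_step(data)
final_loss = loss(current_values(), data)
```

Only two numbers are fitted here, so a handful of examples suffices and the node weight ends exactly where it started. Training a small set of edge parameters on top of frozen nodes follows the same logic at a much larger scale.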

From Monolithic Intelligence to Networked Intelligence

The work suggests a future where AI systems consist of multiple models, tools, memory, and feedback modules forming a learnable network, rather than a single ever‑larger model. Intelligence would emerge from both individual module strength and the way modules connect, communicate, and co‑adapt.

Recent research from Google DeepMind, AWS Agentic AI, and others also highlights model‑to‑model communication media, topology, and learnable interfaces as key directions for next‑generation AI.

Paper title: Language Model Networks: Supervision‑Efficient Learning through Dense Communication

Paper link: https://arxiv.org/abs/2505.12741

Figure 1: Dense continuous vectors enable efficient model‑to‑model communication compared with discrete natural language

Figure 2: LMNet architecture diagram showing language models as nodes and attention blocks as edges

Figure 3: Performance comparison of LMNet‑1B with similarly sized LLMs

Figure 4: Test‑time scaling methods for Qwen2.5‑0.5B and their performance

Figure 5: MMLU fine‑tuning results with different LLM bases

Figure 6: PEFT methods on GPT2‑M evaluated on an E2E dataset


ICML 2026, dense communication, Language Model Networks, LLM collaboration, LMNet
