Intra‑Ensemble in Neural Networks
This paper proposes an intra‑ensemble strategy that trains multiple sub‑networks within a single neural network using random training operations, width‑depth variations, and parameter sharing. The resulting sub‑networks are diverse enough that their ensemble matches traditional ensembles in accuracy while adding only marginal parameter overhead.
Background: Improving model performance is a core challenge in machine learning; simply making networks deeper yields diminishing returns, and ensemble methods remain an effective route to further gains.
Proposed Intra‑Ensemble: An end‑to‑end strategy that trains several sub‑networks inside one neural network with minimal extra parameters because most weights are shared. Random training increases sub‑network diversity, boosting ensemble effect.
Related Knowledge: Review of ensemble techniques (bagging, boosting, stacking) and neural architecture search (NAS) methods such as DARTS, ProxylessNAS, FBNet, which inspire parameter‑efficient designs.
Parameter Sharing: Sub‑networks share the majority of weights; only batch‑norm statistics are kept separate for each width to maintain stability.
Training Sub‑Networks: Define a list of width ratios and depth choices; use a single network to host sub‑networks of varying width and depth. A switched batch‑norm (S‑BN) allows training across widths with negligible parameter increase.
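A minimal sketch of the switched batch‑norm idea: each width ratio keeps its own running statistics (the only per‑sub‑network parameters), while all other weights are shared. The class name, momentum value, and 1‑D formulation are illustrative assumptions, not the paper's implementation:

```python
class SwitchedBatchNorm:
    """Sketch of switched batch norm (S-BN): every width ratio gets its
    own running mean/variance while convolution weights stay shared."""

    def __init__(self, width_ratios, momentum=0.1):
        # one statistics record per width -- the negligible parameter increase
        self.stats = {w: {"mean": 0.0, "var": 1.0} for w in width_ratios}
        self.momentum = momentum  # illustrative value, not from the paper

    def forward(self, batch, width):
        s = self.stats[width]  # select the statistics of the active width
        m = sum(batch) / len(batch)                          # batch mean
        v = sum((x - m) ** 2 for x in batch) / len(batch)    # batch variance
        # update running statistics for this width only
        s["mean"] = (1 - self.momentum) * s["mean"] + self.momentum * m
        s["var"] = (1 - self.momentum) * s["var"] + self.momentum * v
        # normalize with the current batch statistics
        return [(x - m) / (v + 1e-5) ** 0.5 for x in batch]
```

Because only the `stats` dictionary is duplicated per width, hosting several widths in one network costs almost nothing in parameters.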
Random Training Operations: Four operations increase sub‑network diversity:
1. Random Cut (RC) – mask a contiguous block of channels.
2. Random Offset (RO) – shift channel indices.
3. Shuffle Channel (SC) – randomly reorder channels.
4. Depth operations – random skip (RS) of layers or shuffle layer (SL) order.
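The three channel‑level operations can be sketched on a 1‑D list of channel values; function names and the cyclic interpretation of the offset are assumptions for illustration:

```python
import random

def random_cut(channels, cut_len, rng):
    """Random Cut (RC): zero out a contiguous block of channels."""
    start = rng.randrange(0, len(channels) - cut_len + 1)
    out = list(channels)
    for i in range(start, start + cut_len):
        out[i] = 0.0
    return out

def random_offset(channels, rng):
    """Random Offset (RO): cyclically shift channel indices (assumed cyclic)."""
    k = rng.randrange(len(channels))
    return channels[k:] + channels[:k]

def shuffle_channel(channels, rng):
    """Shuffle Channel (SC): randomly reorder channels."""
    out = list(channels)
    rng.shuffle(out)
    return out
```

Applying a different randomly chosen operation to each sub‑network at each step decorrelates their errors, which is what makes the final ensemble effective.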
Similarity Metric: Defines similarity as the proportion of test images that produce identical outputs across sub‑networks, balancing accuracy and diversity.
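The metric as described reduces to a simple fraction; a sketch under that reading (lower similarity between sub‑networks means more diversity):

```python
def similarity(preds_a, preds_b):
    """Proportion of test samples on which two sub-networks produce
    identical outputs, per the paper's similarity definition."""
    assert len(preds_a) == len(preds_b)
    same = sum(1 for a, b in zip(preds_a, preds_b) if a == b)
    return same / len(preds_a)
```

For example, two sub‑networks agreeing on 3 of 4 test images have similarity 0.75.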
Combination Strategies: Voting, averaging, and stacking are evaluated; stacking with random cut yields the best results.
Experiments: Extensive tests show that intra‑ensemble improves accuracy with only a slight parameter increase, outperforming many NAS‑derived models and matching traditional ensembles while being more resource‑efficient.
Conclusion: Intra‑ensemble combines multiple sub‑networks within a single model, leveraging random training and parameter sharing to achieve high‑accuracy, diverse ensembles with minimal overhead, and is effective across architectures and datasets.
DataFunTalk