Understanding gcForest: Cascade Forest Structure and Multi‑grained Scanning for Representation Learning
The article explains how gcForest, an ensemble‑of‑decision‑tree model that mimics deep neural network hierarchies, uses cascade forests and multi‑grained sliding‑window scanning to achieve effective representation learning with fewer hyper‑parameters, especially on small datasets.
Zeng Fanshiang holds a Ph.D. from Beijing University of Posts and Telecommunications and works as a big‑data algorithm engineer at Qunar.com; his research focuses on machine learning and deep learning.
Both modern neural networks and classic models such as LR, SVM, and Random Forest can be viewed as forms of representation learning; deep networks excel because their layered weights progressively transform low‑level features into high‑level semantic representations that align well with image, speech, and text data.
However, deep learning often requires extensive hyper‑parameter tuning and large datasets, prompting the exploration of gcForest—a decision‑tree‑based ensemble that retains strong representation capabilities while drastically reducing the number of tunable hyper‑parameters.
The core components of gcForest are the cascade forest and multi‑grained scanning. The cascade forest mimics the hierarchical structure of neural networks: each level consists of several forests (both completely random and standard random forests) that take the previous level’s outputs as additional inputs, producing class‑probability vectors that are concatenated with original features for the next level.
Figure 1 illustrates that the class distribution of each random forest is averaged, and the class with the highest probability is selected as the prediction.
In the cascade forest, each level contains two completely random forests (500 trees each, splitting on a single randomly selected feature at every node and growing until each leaf contains instances of only one class) and two standard random forests (500 trees each, choosing the best of √d randomly selected features per split). Each forest outputs a K‑dimensional class‑probability vector; the four forests together produce a 4K‑dimensional vector that is concatenated with the original feature vector and fed to the next level. To reduce over‑fitting, each class vector is estimated by K‑fold cross‑validation: every instance serves as training data K‑1 times, and the resulting K‑1 class vectors are averaged. The cascade stops adding levels automatically once performance on a validation set no longer improves.
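One cascade level can be sketched with scikit‑learn building blocks. This is an assumption‑laden illustration, not the paper's code: `ExtraTreesClassifier` with `max_features=1` stands in for the completely random forests, tree counts are reduced from 500 for brevity, and the out‑of‑fold class vectors come from `cross_val_predict`.

```python
# Sketch of one cascade-forest level (illustrative stand-ins, not the
# official gcForest implementation).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=300, n_features=20, n_classes=3,
                           n_informative=5, random_state=0)

forests = [
    # Two "completely random" forests: one random feature per split.
    ExtraTreesClassifier(n_estimators=50, max_features=1, random_state=0),
    ExtraTreesClassifier(n_estimators=50, max_features=1, random_state=1),
    # Two standard random forests: best of sqrt(d) features per split.
    RandomForestClassifier(n_estimators=50, max_features="sqrt", random_state=0),
    RandomForestClassifier(n_estimators=50, max_features="sqrt", random_state=1),
]

# Each forest's K-dimensional class vector is estimated out-of-fold
# (3-fold CV here) to curb over-fitting.
class_vecs = [cross_val_predict(f, X, y, cv=3, method="predict_proba")
              for f in forests]

# The 4K probabilities are concatenated with the original features and
# passed to the next cascade level: 20 + 4*3 = 32 dimensions.
next_level_input = np.hstack([X] + class_vecs)
print(next_level_input.shape)  # (300, 32)
```

The same transformation is applied level by level until validation accuracy plateaus.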
Figure 2 shows how class‑probability vectors are generated by averaging the leaf‑node distributions of all trees in a forest.
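The averaging in Figure 2 can be reproduced directly: a forest's class‑probability vector is the mean of its trees' leaf‑node class distributions, which is exactly what scikit‑learn's `predict_proba` computes (a small sanity check, assuming the iris dataset as stand‑in data).

```python
# Reproduce a forest's class-probability vector by averaging the
# per-tree leaf distributions, as illustrated in Figure 2.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=25, random_state=0).fit(X, y)

# Each tree's predict_proba returns its leaf-node class distribution;
# averaging over all trees gives the forest's class vector.
manual = np.mean([tree.predict_proba(X[:1]) for tree in forest.estimators_],
                 axis=0)

# scikit-learn's forest-level predict_proba performs this same averaging.
assert np.allclose(manual, forest.predict_proba(X[:1]))
print(manual.shape)  # (1, 3): one 3-class probability vector
```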
Multi‑grained scanning slides windows of several sizes over the raw features to create enriched, locally structured representations. For a 400‑dimensional input and a window size of 100, 301 sliding instances are produced; each instance is classified by a forest, yielding a 3‑dimensional class‑probability vector (assuming three classes). Concatenating the vectors from all instances results in a high‑dimensional feature of 301 × 3 = 903 dimensions per forest (1806 dimensions with the two scanning forests of Figure 4) that captures multi‑scale local patterns.
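A minimal 1‑D scanning sketch makes the bookkeeping concrete. The data, forest size, and helper name below are assumptions for illustration; as in the paper, each local instance inherits the label of its parent sample.

```python
# Sketch of 1-D multi-grained scanning: a 100-wide window over a
# 400-dim vector yields 301 instances; one 3-class forest turns them
# into a 301 * 3 = 903-dimensional representation.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def sliding_instances(x, window):
    """All contiguous sub-vectors of length `window` (stride 1)."""
    return np.stack([x[i:i + window] for i in range(len(x) - window + 1)])

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 400))   # 30 toy samples, 400 raw features
y = np.arange(30) % 3            # 3 classes, all represented

# Train the scanning forest on local instances; each instance gets
# the label of the sample it was cut from.
train_inst = np.vstack([sliding_instances(x, 100) for x in X])
train_lab = np.repeat(y, 301)
forest = RandomForestClassifier(n_estimators=20, random_state=0)
forest.fit(train_inst, train_lab)

# Transform one sample: 301 instances -> 301 x 3 probabilities -> 903 dims.
feats = forest.predict_proba(sliding_instances(X[0], 100)).ravel()
print(feats.shape)  # (903,)
```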
Figure 3 depicts the multi‑grained scanning process for 1‑D vectors (top) and 2‑D image‑like features (bottom).
Figure 4 presents the overall gcForest pipeline: raw 400‑dimensional features are first transformed by multi‑grained scanning into three feature sets (1806, 1206, and 606 dimensions). These are fed into an N‑level cascade forest; each level produces a 12‑dimensional class‑probability vector (four forests × three classes) that is concatenated with the original 606‑dimensional scanned features for the final prediction.
The complete workflow integrates the modules as described, using four forests per cascade level and three sliding‑window scales. The cascade proceeds level by level, concatenating intermediate outputs with the original multi‑grained features, and finally averaging the class‑probability outputs of all forests.
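The dimension bookkeeping behind Figure 4 is easy to verify. The sketch below derives the 1806/1206/606 figures under the assumptions the figure implies (two scanning forests, three classes, windows of 100, 200, and 300 over 400 raw features); the function name is illustrative.

```python
# Derive the scanned feature dimensions quoted for Figure 4.
def scanned_dim(raw_dim, window, n_classes=3, n_forests=2):
    # Stride-1 sliding window count, times one probability vector
    # per class per scanning forest.
    n_instances = raw_dim - window + 1
    return n_instances * n_classes * n_forests

dims = [scanned_dim(400, w) for w in (100, 200, 300)]
print(dims)  # [1806, 1206, 606]
```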
Experimental results (using 4 completely random forests and 4 standard random forests, each with 500 trees, and 3‑fold cross‑validation) show that gcForest achieves performance comparable to or better than deep neural networks on the tested datasets, especially when employing sliding windows of sizes d/16, d/8, and d/4.
In summary, gcForest adopts the hierarchical representation learning idea of deep learning but replaces neural layers with ensembles of decision trees, enhanced by multi‑grained scanning to generate richer features, making it particularly suitable for small‑sample scenarios.
Potential improvements include weighting predictions from intermediate cascade layers (similar to using hierarchical features in deep networks) and experimenting with different tree‑based models (Random Forest, GBDT, XGBoost) depending on computational resources and performance requirements.
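Swapping the per‑level learners is straightforward to prototype, since any estimator exposing `predict_proba` can fill a cascade slot. A hedged sketch (candidate names and settings below are illustrative, not from the article):

```python
# Compare candidate tree ensembles for a cascade slot under the same
# cross-validated protocol (toy data; settings are illustrative).
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, n_features=20, random_state=0)

candidates = {
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "gbdt": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

# Mean 3-fold accuracy per candidate; the better (or cheaper) model
# can then be substituted into the cascade level.
scores = {name: cross_val_score(m, X, y, cv=3).mean()
          for name, m in candidates.items()}
print(scores)
```

XGBoost would slot in the same way via its scikit‑learn wrapper, at the cost of an extra dependency.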
References:
[1] Zhi‑Hua Zhou and Ji Feng, "Deep Forest: Towards an Alternative to Deep Neural Networks," IJCAI 2017 (v2). https://arxiv.org/abs/1702.08835v2
[2] Earlier preprint (v1): https://arxiv.org/pdf/1702.08835v1.pdf
[3] Official implementation: https://github.com/kingfengji/gcForest
Qunar Tech Salon
Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.