Big Data 5 min read

What’s New in Hadoop 3.0? Key Features and Improvements Explained

Hadoop 3.0, built on JDK 1.8, adds erasure‑coded HDFS, multi‑NameNode support, native MapReduce task optimizations, cgroup‑based YARN memory and disk isolation, and container resizing, with an alpha slated for summer and a GA release expected in November or December.

Hulu Beijing
Hulu Beijing
Hulu Beijing
What’s New in Hadoop 3.0? Key Features and Improvements Explained

1. Hadoop 3.0 Overview

Hadoop 2.0 was built on JDK 1.7, which stopped updating in April 2015, prompting the community to release a new version based on JDK 1.8—Hadoop 3.0.

The Hadoop 3.0 alpha is expected to be released in summer, with the GA version slated for November or December.

Hadoop 3.0 introduces important features and optimizations, including erasure‑coded HDFS, multiple NameNode support, MR native task optimizations, YARN memory and disk I/O isolation based on cgroup, and YARN container resizing.

2. New Features in Hadoop 3.0

2.1 Hadoop Common

Streamlined core by removing deprecated APIs and implementations, replacing default components with more efficient versions (e.g., FileOutputCommitter v2, webhdfs replacing hftp, removal of org.apache.hadoop.Records).

Classpath isolation to prevent jar conflicts between Hadoop, HBase, Spark, etc.

Shell script refactoring, fixing many bugs and adding dynamic command support.

2.2 Hadoop HDFS

Support for erasure coding, reducing storage by half without compromising reliability.

Multiple NameNode support, allowing one active and multiple standby NameNodes in a cluster (multi‑ResourceManager already supported in Hadoop 2.0).

2.3 Hadoop MapReduce

Task native optimization: C/C++ map output collector implementation (Spill, Sort, IFile) improves shuffle‑intensive job performance by about 30%.

Automatic inference of memory parameters, simplifying configuration of mapreduce.{map,reduce}.memory.mb and mapreduce.{map,reduce}.java.opts.

2.4 Hadoop YARN

cgroup‑based memory and disk I/O isolation.

Curator‑based ResourceManager leader election.

Container resizing support.

Next‑generation TimelineServer.

3. Hadoop 3.0 Summary

Hadoop 3.0’s alpha is expected this summer, with GA in November or December, bringing major enhancements such as erasure‑coded HDFS, multi‑NameNode, native MapReduce tasks, and improved YARN isolation and container management.

big dataMapReduceYARNHDFSHadoopVersion 3.0
Hulu Beijing
Written by

Hulu Beijing

Follow Hulu's official WeChat account for the latest company updates and recruitment information.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.