IT Services Circle
Mar 21, 2022 · Big Data
Understanding Spark Shuffle: Hash, Sort, and Tungsten Sort Mechanisms
This article explains the evolution and inner workings of Spark's shuffle phase. It compares the original Hash‑based shuffle, the default Sort‑based shuffle, and the optimized Tungsten‑Sort shuffle, and covers the related configuration options that affect performance and file handling in large‑scale data processing.
Hash Shuffle · Shuffle · Sort Shuffle