Tag

YARN

1 view collected around this technical thread.

iQIYI Technical Product Team
May 15, 2025 · Big Data

Introducing AMD and ARM Bare‑Metal Instances for iQIYI Big Data Computing: Cloud Selection, Performance Evaluation, and Heterogeneous Scheduling

To reduce costs and boost compute density, iQIYI's big data team migrated from aging private‑cloud Intel servers to public‑cloud AMD and ARM bare‑metal instances, establishing a systematic machine‑selection process, performance testing framework, and YARN‑based heterogeneous scheduling to fully leverage the new hardware.

AMD · ARM · Big Data
0 likes · 16 min read
Rare Earth Juejin Tech Community
Feb 21, 2025 · Frontend Development

Understanding pnpm: Solving Dependency Management Issues in Modern Frontend Development

This article explains the evolution of JavaScript package managers, the shortcomings of npm and Yarn such as duplicated installations, phantom dependencies and unpredictable dependency trees, and demonstrates how pnpm’s content‑addressable store and its hard‑link and symlink strategy provide faster installs, reduced disk usage, and more reliable dependency isolation for frontend projects.

Dependency Management · Frontend Development · YARN
0 likes · 22 min read
360 Zhihui Cloud Developer
Feb 17, 2025 · Cloud Native

Optimizing Offline Pod Scheduling with Koordinator and Yarn-Operator

To reduce resource contention and improve offline task reliability, this article examines the challenges of using Koordinator with Hadoop YARN pods on Kubernetes, proposes real‑time resource reporting and task‑level eviction strategies, details community and custom solutions, and outlines future enhancements with Volcano.

Big Data · Koordinator · Kubernetes
0 likes · 9 min read
Rare Earth Juejin Tech Community
Jan 14, 2025 · Backend Development

Understanding npm, Yarn, and pnpm: Dependency Management, Flat Dependencies, and pnpm's Store Mechanism

This article examines the evolution of JavaScript package managers—from npm's nested node_modules structure to Yarn's flat dependencies and finally pnpm's global store with hard‑ and soft‑link mechanisms—highlighting how each approach addresses path length, disk‑space waste, installation speed, and ghost‑dependency issues.

Dependency Management · Node.js · YARN
0 likes · 8 min read
360 Smart Cloud
Jul 9, 2024 · Big Data

Understanding Shuffle in Spark: From Native Shuffle to External and Remote Shuffle Services (Uniffle)

This article examines the critical role of shuffle in big‑data processing, compares Spark's native shuffle with the External Shuffle Service (ESS) and Remote Shuffle Service (RSS) solutions, introduces Uniffle's architecture and configuration, and shares practical deployment experiences and performance results within a 360 internal environment.

Big Data · External Shuffle Service · Remote Shuffle Service
0 likes · 15 min read
Efficient Ops
Apr 23, 2024 · Big Data

How to Plan, Configure, and Launch a Hadoop 3.3.5 Cluster on Three Nodes

This guide walks through planning a three‑node Hadoop 3.3.5 cluster, explains default and custom configuration files, details core‑site, hdfs‑site, yarn‑site, and mapred‑site settings, shows how to distribute configs, start HDFS and YARN, and perform basic file‑system tests.

Big Data · Cluster Setup · HDFS
0 likes · 11 min read
iQIYI Technical Product Team
Nov 17, 2023 · Big Data

Mixed Workload Co-location of Big Data and Online Services at iQIYI: Design, Implementation, and Results

iQIYI’s mixed‑workload system colocates Spark/Hive big‑data jobs with online video services by running YARN NodeManagers inside Kubernetes, using an Elastic YARN Operator, Koordinator‑driven CPU oversubscription, and remote shuffle, boosting online CPU utilization from ~9 % to over 40 % and saving tens of millions of RMB annually.

Big Data · Kubernetes · Mixed Workload
0 likes · 19 min read
DevOps
Jun 7, 2023 · Big Data

Deploying Apache Spark on YARN vs Kubernetes: Architecture, Benefits, and Comparison

This article explains how Apache Spark can be deployed using the traditional Hadoop YARN resource manager and the newer Kubernetes approach, detailing configuration steps, submission methods, and a comprehensive comparison of isolation, scalability, learning curve, logging, performance, and cost considerations.

Big Data · Deployment · Kubernetes
0 likes · 10 min read
High Availability Architecture
May 26, 2023 · Big Data

Amiya: Dynamic Overcommit Component for Bilibili Offline Big Data Cluster Resource Scheduling

This article introduces Amiya, a self‑developed overcommit component that dynamically increases YARN memory and vCore capacity on Bilibili's offline big‑data clusters, details its architecture and the key implementation of its overcommit, eviction and mixed‑deployment strategies, and evaluates its resource‑utilization impact.

Big Data · Cluster Management · Overcommit
0 likes · 22 min read
Bilibili Tech
May 23, 2023 · Big Data

Amiya: Dynamic Overcommit Component for Bilibili Offline Big Data Cluster

Amiya, a self‑developed dynamic over‑commit component for Bilibili’s offline big‑data cluster, inflates reported resources on under‑utilized nodes and adjusts them when load rises, adding roughly 683 TB of memory and 137 k vCores, boosting per‑node memory by 15 % and CPU usage by over 20 % while keeping eviction rates below 3 %.

Amiya · Big Data · Bilibili
0 likes · 22 min read
Rare Earth Juejin Tech Community
May 6, 2023 · Backend Development

Monorepo Overview, Evolution, Pros & Cons, Pitfalls, and Tool Selection

This article explains what a monorepo is, traces its evolution from single‑repo monoliths to multi‑repo and back to a single repository with many modules, compares its advantages and disadvantages, lists common pitfalls, and evaluates major tooling options such as Turborepo, Rush, Nx, Lerna, Yarn and pnpm for different project sizes.

Lerna · Monorepo · Nx
0 likes · 21 min read
Zhengcaiyun Tech (政采云技术)
Apr 18, 2023 · Big Data

Implementing Data Cost Governance: Quantifying Storage and Compute Expenses with Hive, Spark, and HDFS FsImage

This article explains how to perform task‑level data cost governance by collecting storage and compute metrics from Hive tables, Spark jobs, and HDFS FsImage files, then estimating monthly expenses using replication factors and resource‑usage rates, while providing practical SQL and shell examples.

Big Data · Data Cost Governance · HDFS
0 likes · 18 min read
TAL Education Technology
Apr 6, 2023 · Backend Development

Summary of npm, Yarn, and pnpm Package Managers

This article reviews the evolution of Node.js package managers, from npm 2's nested dependencies to the flat model of npm 3 and Yarn and pnpm's symlinked, content‑addressable store, highlighting their installation commands, advantages, drawbacks, and impact on disk usage and dependency management.

Dependency Management · Node.js · YARN
0 likes · 11 min read
ByteFE
Mar 6, 2023 · Frontend Development

Deep Dive into npm, Yarn, and pnpm Dependency Management

This article explains how npm, Yarn, and pnpm manage JavaScript dependencies, detailing installation processes, flat vs nested node_modules structures, lock files, and the hard-link mechanism that improves speed and saves disk space.

Dependency Management · Frontend Development · YARN
0 likes · 16 min read
TAL Education Technology
Mar 2, 2023 · Backend Development

Exploring pnpm: A High‑Performance Package Manager for Node.js

This article introduces pnpm, compares it with npm and yarn, explains the problems of nested node_modules such as ghost dependencies and split packages, and demonstrates pnpm’s link‑based architecture, advantages, and basic command usage for efficient JavaScript project management.

Monorepo · Node.js · YARN
0 likes · 6 min read
ByteFE
Nov 14, 2022 · Frontend Development

Evolution and Innovations of npm, Yarn, and pnpm Package Managers

This article examines the evolution of the three major JavaScript package managers (npm, Yarn, and pnpm), detailing their original designs, the problems they introduced such as nested node_modules, phantom dependencies and doppelgangers, and the innovative solutions like flattening, lock files, symbolic and hard links, and PnP mode that each tool brought to improve dependency management.

Package Management · YARN · node_modules
0 likes · 18 min read
Bilibili Tech
Oct 21, 2022 · Big Data

Kyuubi at Bilibili: Architecture, Enhancements, and Production Practices for Large‑Scale Data Processing

Bilibili adopted the open‑source Kyuubi proxy to replace its unstable STS layer, enabling multi‑tenant, multi‑engine (Spark, Presto, Flink) SQL/Scala processing with Hive Thrift compatibility, fine‑grained queue isolation, UI monitoring, stability safeguards, and Kubernetes/YARN deployment, while planning further cloud‑native extensions.

Big Data · Kubernetes · Kyuubi
0 likes · 20 min read
DataFunSummit
Sep 25, 2022 · Big Data

Practical Optimizations and Resource Management of Hadoop YARN at Xiaomi

This article shares Xiaomi's internal practices of Hadoop YARN, covering scheduling and resource optimization, elastic scheduling, node overcommit handling, federation architecture, metadata warehouse construction, and future plans to improve cluster utilization and cost efficiency.

Big Data · Hadoop · Performance Optimization
0 likes · 20 min read
Bilibili Tech
Jul 5, 2022 · Big Data

Multi‑Datacenter Architecture for Offline Big Data Processing at Bilibili

To overcome rapid data growth and on‑premise capacity limits, Bilibili adopted a scale‑out, unit‑based multi‑datacenter architecture that isolates failures, intelligently places jobs, replicates data via an enhanced DistCp service, routes reads with an IP‑aware HDFS router, and throttles cross‑site traffic, enabling stable offline big‑data processing of hundreds of petabytes while preserving throughput.

Big Data · HDFS · YARN
0 likes · 28 min read
DataFunSummit
Jul 1, 2022 · Big Data

Exploring and Implementing Elastic Scheduling for Xiaomi Hadoop YARN

Shilong Fei from Xiaomi Data Platform presents an in‑depth exploration of elastic scheduling for Hadoop YARN, covering background, design of resource pools, auto‑scaling architecture, challenges such as job stability and user transparency, achieved cost reductions, and future plans for further optimization.

Big Data · Hadoop · YARN
0 likes · 20 min read