
Software Localization and the Future of Big Data Platforms in China

The article examines why software localization is essential for China’s data technology, outlines the challenges and current state of domestic operating systems, databases and big‑data platforms, discusses migration and upgrade strategies, and introduces NetEase DataFun’s self‑developed big‑data platform with its features and support.

DataFunTalk

Long‑term reliance on foreign data‑technology vendors has driven China to pursue software localization, especially after incidents such as Cloudera ending CDH support and the Log4j vulnerability, which highlighted the need for autonomous, secure core technologies.

The necessity for localization stems from external pressures, security concerns, and government policies encouraging domestic control over IT systems, with examples ranging from domestic CPUs to databases like OceanBase, TiDB, and PostgreSQL forks.

Challenges include defining what constitutes true localization, mastering source code, and building an ecosystem that can modify and enhance software while maintaining compatibility with existing tools.

Current domestic developments show progress in operating systems (e.g., Kylin, UOS, openEuler), databases (OceanBase, TiDB, GaussDB), and enterprise software (WPS, Yonyou), yet adoption in critical sectors such as finance remains cautious because of their strict stability requirements.

Big‑data platforms are shifting from free, open‑source models to paid, subscription‑based offerings; migration paths include in‑place upgrades, migration upgrades, and rolling upgrades, each with trade‑offs in downtime, resource needs, and risk.
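To make the rolling-upgrade trade-off concrete, here is a minimal sketch in Python. It is not taken from any platform's tooling: `rolling_upgrade`, `upgrade_node`, and `batch_size` are hypothetical names, and the per-node upgrade step is assumed to be a callback that reports success or failure. The idea is only that nodes are upgraded in small batches and the process halts at the first failed batch, so the rest of the cluster keeps serving on the old version.

```python
from typing import Callable, List


def rolling_upgrade(
    nodes: List[str],
    upgrade_node: Callable[[str], bool],  # hypothetical per-node upgrade step
    batch_size: int = 1,
) -> List[str]:
    """Upgrade nodes batch by batch and stop after the first failed batch.

    Returns the nodes that were upgraded successfully.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    upgraded: List[str] = []
    for start in range(0, len(nodes), batch_size):
        batch = nodes[start:start + batch_size]
        # Attempt every node in the current batch and record each outcome.
        results = [(node, upgrade_node(node)) for node in batch]
        upgraded.extend(node for node, ok in results if ok)
        if not all(ok for _, ok in results):
            # Halt here: untouched nodes stay on the old version.
            break
    return upgraded
```

A smaller `batch_size` limits how much capacity is offline at once but lengthens the upgrade window; an in-place upgrade corresponds roughly to one batch containing every node, which is faster but takes the whole cluster down together.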

NetEase DataFun’s big‑data platform, built on a decade of experience, integrates Hadoop, Spark, Impala, and other components, adds security features (Kerberos, Ranger), supports domestic hardware and operating systems (Kunpeng, Kylin), and offers migration tools for CDH, HDP, and other ecosystems.

The platform provides comprehensive services: technical onboarding, operational support, rapid issue resolution, and migration assistance, including tools for Hive metadata and Oozie workflow migration.

A Q&A session addressed topics such as CDH migration considerations, security requirements for financial systems, the value of data middle platforms, differences between CDH with Cloudera Manager and HDP with Ambari, upgrade recommendations, and future plans for Kubernetes integration and automated operations.

Tags: big data, open source, security, China, platform migration, software localization