ByteDance Big Data Platform Security and Permission Governance Practices
This article outlines ByteDance's comprehensive big‑data security framework, detailing current challenges, fine‑grained permission control, asset protection, data deletion capabilities, and the governance principles that balance compliance with operational efficiency.
Introduction
This presentation by Xu Congyu, a product manager for ByteDance's data platform, introduces the company's big-data security and permission governance practice. It focuses on four main topics: the current state and challenges of the security system, fine-grained permission control, asset protection, and data deletion capabilities.
1. ByteDance Big Data Security System – Current State and Challenges
The security product suite includes data classification and grading, permission management (covering request, row/column permissions, expiration, and reclamation), risk control and audit (user and data risk audits, including behavior grading and abnormal activity detection), asset protection (encryption, decryption, and masking), and data destruction to meet legal compliance.
2. Governance Principles – Ensuring Compliance While Maintaining Efficiency
Governance balances external compliance pressure with internal operational pressure, adopting the principle of "ensure compliance + consider efficiency". The approach is illustrated through an upgraded permission model featuring three characteristics and atomic-level control.
3. Fine-Grained Permission Control and Governance
Key scenarios include column-level permission requests, table/column permissions with row restrictions, separate control of sensitive tables and columns, flexible authorization (by resource package, granted to subjects such as individuals, departments, application accounts, or user groups), and an intelligent approval model that grades each request's risk as low, medium, or high and then either approves it automatically or routes it to manual review.
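The risk-graded approval flow can be sketched roughly as follows. This is a minimal illustration, not ByteDance's actual model: the `PermissionRequest` fields, the sensitivity scale, the `same_department` signal, and the route names are all hypothetical, chosen only to show how a low/medium/high grade could decide between automatic approval and manual review.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class PermissionRequest:
    subject: str               # individual, department, application account, or user group
    table: str
    columns: Tuple[str, ...]   # column-level request
    sensitivity: int           # hypothetical grade of the most sensitive column, 1 (lowest) to 4
    same_department: bool      # hypothetical signal: requester sits in the data owner's department


def assess_risk(req: PermissionRequest) -> Risk:
    """Grade a request; the thresholds here are illustrative assumptions."""
    if req.sensitivity >= 3:
        return Risk.HIGH
    if req.sensitivity == 2 and not req.same_department:
        return Risk.MEDIUM
    return Risk.LOW


def route(req: PermissionRequest) -> str:
    """Low risk is approved automatically; anything else goes to manual review."""
    risk = assess_risk(req)
    if risk is Risk.LOW:
        return "auto_approve"
    if risk is Risk.MEDIUM:
        return "owner_review"
    return "owner_and_security_review"
```

For example, `route(PermissionRequest("alice", "orders", ("amount",), 1, False))` returns `"auto_approve"`, while the same request against a column graded 3 is sent to `"owner_and_security_review"`.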
4. Redundant Permission Reclamation
Redundant permissions are identified from access logs and authentication logs, taking dual-authentication systems and whitelisted resources into account. Reclamation follows a "minimum time" principle, so that a permission exists only for as long as the business cycle requires.
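A minimal sketch of log-based redundancy detection is shown below. It assumes that a grant counts as redundant only when neither the access log nor the authentication log shows recent use; the tuple layouts, the `idle_days` window, and the function name are hypothetical and do not come from the talk.

```python
from datetime import datetime, timedelta


def find_redundant_grants(grants, access_log, auth_log, whitelist, now, idle_days=90):
    """Return (user, resource) grants with no recent use in either log.

    grants:      iterable of (user, resource, granted_at)
    access_log:  iterable of (user, resource, timestamp)
    auth_log:    iterable of (user, resource, timestamp)
    whitelist:   set of resources that are never reclaimed
    """
    cutoff = now - timedelta(days=idle_days)

    # Most recent use per (user, resource) across BOTH log sources.
    last_used = {}
    for user, resource, ts in list(access_log) + list(auth_log):
        key = (user, resource)
        if key not in last_used or ts > last_used[key]:
            last_used[key] = ts

    redundant = []
    for user, resource, granted_at in grants:
        if resource in whitelist:
            continue  # whitelisted resources are exempt from reclamation
        if granted_at > cutoff:
            continue  # granted too recently to judge
        seen = last_used.get((user, resource))
        if seen is None or seen < cutoff:
            redundant.append((user, resource))
    return redundant
```

Checking both logs before reclaiming is the conservative choice: a permission exercised through only one of the two systems is still treated as in use.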
5. Asset Protection Capability
Asset protection spans the entire data lifecycle, employing static masking or encryption during integration and on-demand decryption/masking via API gateways. Encryption solutions include content encryption, transparent file-format encryption, HDFS encryption, and disk encryption, with a focus on content and file-format encryption for high consistency and availability.
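To make static masking at integration time concrete, here is a small sketch using only the Python standard library. The column policy format, the key handling, and the function names are assumptions for illustration; a production system would draw keys from a key management service and would likely use different masking rules per data grade.

```python
import hashlib
import hmac


def mask_middle(value: str, keep_head: int = 3, keep_tail: int = 4, fill: str = "*") -> str:
    """Hide the middle of a string, e.g. for phone numbers shown to analysts."""
    if len(value) <= keep_head + keep_tail:
        return fill * len(value)  # too short to reveal anything safely
    hidden = len(value) - keep_head - keep_tail
    return value[:keep_head] + fill * hidden + value[len(value) - keep_tail:]


def pseudonymize(value: str, key: bytes) -> str:
    """Keyed hash: the same input always maps to the same token, so joins still work."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_record(record: dict, policy: dict, key: bytes) -> dict:
    """Apply a per-column policy ('mask' or 'hash'); other columns pass through."""
    out = {}
    for column, value in record.items():
        rule = policy.get(column)
        if rule == "mask":
            out[column] = mask_middle(str(value))
        elif rule == "hash":
            out[column] = pseudonymize(str(value), key)
        else:
            out[column] = value
    return out
```

With this sketch, `mask_middle("13812345678")` yields `"138****5678"`. Masking is irreversible, whereas the encryption options described above allow authorized callers to decrypt on demand.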
6. Data Deletion Capability
Data deletion addresses privacy compliance by removing personal information within mandated timeframes, using both account-driven and batch deletion. The challenges include the I/O overhead of overwriting data, inefficiencies that differ between column-store and row-store formats, the load that massive data volumes place on disk I/O, network, and compute resources, and potential resource contention with other workloads. Optimizations involve the Bytelake system (which splits user and non-user data to accelerate deletion by up to 15×), Spark micro-batch processing, improved shuffle handling, enhanced HDFS stability, and ACID/MVCC support for consistent reads.
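The intuition behind splitting user and non-user data can be shown with a toy in-memory model. This is not the Bytelake design, whose internals the talk summary does not describe; the bucketing scheme, `NUM_BUCKETS`, and the function names are hypothetical. The point is only that when user rows are stored separately and grouped by user, a deletion rewrites the few buckets that hold the target users instead of the whole table.

```python
import zlib
from collections import defaultdict

NUM_BUCKETS = 8  # hypothetical bucket count


def bucket_of(user_id: str) -> int:
    """Stable hash so a given user always lands in the same bucket."""
    return zlib.crc32(user_id.encode("utf-8")) % NUM_BUCKETS


def write_split(rows):
    """Store rows with a user_id in per-user buckets; keep other rows apart."""
    user_buckets = defaultdict(list)
    non_user = []
    for row in rows:
        uid = row.get("user_id")
        if uid is None:
            non_user.append(row)
        else:
            user_buckets[bucket_of(uid)].append(row)
    return dict(user_buckets), non_user


def batch_delete(user_buckets, user_ids):
    """Delete all rows of the given users; return how many buckets were rewritten."""
    targets = set(user_ids)
    rewritten = 0
    for bucket in {bucket_of(u) for u in targets}:
        rows = user_buckets.get(bucket)
        if not rows:
            continue
        kept = [r for r in rows if r["user_id"] not in targets]
        if len(kept) != len(rows):
            user_buckets[bucket] = kept  # only this bucket is rewritten
            rewritten += 1
    return rewritten
```

Deleting one user here rewrites exactly one bucket and never touches the non-user rows, which is the kind of I/O saving the split is meant to deliver.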
7. Q&A Highlights
The answers cover how resource packages are defined (they can include row and column permissions), how the reduction in permission redundancy is calculated, and why reclaiming redundant permissions does not put data applications at risk: both logs are checked before reclamation, and users receive proactive reminders.
Conclusion
ByteDance consolidates these practices into the DataLeap suite, a one-stop data middle platform solution that helps external customers improve governance efficiency and reduce management costs.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.