Design and Implementation of a Big Data Permission Management System
This article describes a big data permission management system for secure, fine‑grained data access. It covers the background, the importance of data security, usage scenarios, problems, and goals, followed by the architectural design: the RBAC and ABAC models, metadata integration, data classification, and verification mechanisms.
1. Background Introduction
Permission control is one of the most important foundational capabilities of an application system and is usually divided into functional permissions and data permissions. Functional permissions control which actions a user can perform. Data permissions restrict which data objects a user can operate on, down to row‑level and field‑level controls, for example limiting a user to their own department's data or to certain fields of a business record.
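To make the row‑level and field‑level distinction concrete, here is a minimal Python sketch. The records, field names, and the `apply_data_permissions` helper are hypothetical and only illustrate the idea; they are not taken from the system described in this article.

```python
# Hypothetical business records; field names are made up for illustration.
ORDERS = [
    {"order_id": 1, "department": "sales", "amount": 120, "supplier_phone": "555-0101"},
    {"order_id": 2, "department": "finance", "amount": 300, "supplier_phone": "555-0102"},
    {"order_id": 3, "department": "sales", "amount": 75, "supplier_phone": "555-0103"},
]


def apply_data_permissions(rows, department, visible_fields):
    """Row level: keep only the caller's department.
    Field level: keep only the fields the caller may see."""
    return [
        {name: value for name, value in row.items() if name in visible_fields}
        for row in rows
        if row["department"] == department
    ]


# A sales user who may only see the order id and the amount.
visible = apply_data_permissions(ORDERS, "sales", {"order_id", "amount"})
```

In this sketch the finance order is dropped by the row‑level rule, and `supplier_phone` and `department` are dropped by the field‑level rule.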
2. Importance of Data Security
As business evolves, data has become the most important digital asset and core competitive advantage for enterprises, making data security increasingly critical. To strengthen internal data security and management, the data platform team follows the company’s data classification and grading standards and the principle of least privilege, applying permission controls across data analysis, data services, and data development scenarios.
3. Data Permission Scenarios
The need for data permission control originates from real usage scenarios, especially when data needs to be exposed. Current permission‑controlled scenarios mainly include:
Data analysis/query: tools such as the open‑source BI platform Metabase and the in‑house query system "Xiao Cai DA" require table‑ and field‑level permission control.
Data development: the internal data development platform IData needs permission control over source tables used by developers across business lines.
4. Problems Faced
The team first relied on Metabase’s built‑in permission features and on Apache Ranger for fine‑grained table permissions. Several issues emerged:
The control that Metabase and Ranger provide is still too coarse‑grained for these needs, and both rely on administrators to assign permissions, which makes the process cumbersome.
Permission management is scattered across multiple systems, increasing maintenance cost and requiring separate development for new applications.
Administrator‑centric permission assignment leads to high workload, especially during personnel or department changes, and raises the risk of over‑privileged access.
Increasing demand for fine‑grained permission management cannot be satisfied by the current setup.
Isolated permission systems cannot integrate with other internal security systems, hindering centralized authorization, multi‑platform usage, real‑time monitoring, and audit capabilities.
5. New Permission System Construction Goals
Based on the above problems, the new permission management system aims to:
Reduce management cost by allowing users to self‑service request data permissions under the principle of least privilege, with an approval workflow involving team leaders, data owners, and the security team.
Centralize permission management and provide unified authentication, enabling one‑time authorization across multiple data platforms and supporting post‑audit and traceability.
Support table‑level, field‑level, row‑level, Metabase dashboard, and report permissions while integrating metadata classification.
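The self‑service request flow in the first goal can be pictured as a request that moves through an ordered approval chain. The sketch below is only an illustration: the article names team leaders, data owners, and the security team as approvers but does not state their order, so the sequence in `APPROVAL_CHAIN`, the `PermissionRequest` class, and the resource name are all assumptions.

```python
from dataclasses import dataclass, field

# Assumed order; the article names these approvers but not their sequence.
APPROVAL_CHAIN = ("team_leader", "data_owner", "security_team")


@dataclass
class PermissionRequest:
    applicant: str
    resource: str
    approved_by: list = field(default_factory=list)

    @property
    def next_approver(self):
        """Role expected to act next, or None once the chain is complete."""
        if len(self.approved_by) < len(APPROVAL_CHAIN):
            return APPROVAL_CHAIN[len(self.approved_by)]
        return None

    @property
    def granted(self):
        """The permission is granted only after every approver has signed off."""
        return self.next_approver is None

    def approve(self, role):
        """Record an approval; reject approvals that arrive out of order."""
        if role != self.next_approver:
            raise ValueError(f"expected {self.next_approver!r}, got {role!r}")
        self.approved_by.append(role)


request = PermissionRequest("alice", "dw.orders")
request.approve("team_leader")
request.approve("data_owner")
pending_role = request.next_approver  # the security team still has to approve
request.approve("security_team")
```

Keeping the list of approvals on the request itself also gives a simple record for later audit and traceability.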
6. Permission System Design and Implementation
Permission Model
Two models were investigated: the widely known RBAC (Role‑Based Access Control) and the more flexible ABAC (Attribute‑Based Access Control). RBAC is used for functional permissions, while a customized ABAC‑inspired model is adopted for data permissions.
RBAC
RBAC attaches permissions to roles and assigns roles to users, so permissions are managed per role rather than per user. This simplifies permission management in large organizations. The system uses RBAC for functional permission control.
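A minimal sketch of an RBAC check in Python is shown here. The role names, permission strings, and the `has_permission` helper are hypothetical; the article does not describe the actual role model.

```python
# Hypothetical roles and permission names, for illustration only.
ROLE_PERMISSIONS = {
    "analyst": {"dashboard:view", "query:run"},
    "platform_admin": {"dashboard:view", "query:run", "policy:edit"},
}

USER_ROLES = {
    "alice": {"analyst"},
    "bob": {"analyst", "platform_admin"},
}


def has_permission(user, permission):
    """A user holds a permission if any of their roles grants it."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


alice_can_edit = has_permission("alice", "policy:edit")  # analyst role only
bob_can_edit = has_permission("bob", "policy:edit")      # also platform_admin
```

When a person changes teams, only their entry in `USER_ROLES` changes; the permissions attached to each role stay untouched.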
ABAC
ABAC is a highly flexible authorization model suitable for dynamic permission allocation, though more complex to implement. The system adopts ABAC concepts to design a data permission model that fits actual needs.
The data permission model consists of:
Authorization Subject: Users or user groups to which permissions are granted. User groups allow a group administrator to request permissions on behalf of all members.
Authorization Resource: Abstracted entities such as tables, fields, rows, dashboards, and reports.
Authorization Operation: Actions that can be performed on resources, e.g., read or write.
Authorization Environment: Contextual attributes like cluster environment or validity period.
A permission policy combines these four elements; policy evaluation matches a request against stored policies.
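The four elements and the matching step can be sketched as follows. This is an illustrative model, not the system's actual schema: the `Policy` and `Request` classes, the `user:`/`group:` subject prefixes, the `table:` resource prefix, and the default‑deny behavior are all assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Policy:
    subject: str       # authorization subject, e.g. "user:alice" or "group:bi-team"
    resource: str      # authorization resource, e.g. "table:dw.orders"
    operation: str     # authorization operation, e.g. "read" or "write"
    cluster: str       # environment: cluster the policy applies to
    valid_until: date  # environment: last day the policy is valid


@dataclass(frozen=True)
class Request:
    subjects: frozenset  # the user plus every group the user belongs to
    resource: str
    operation: str
    cluster: str
    on: date             # date on which the request is made


def matches(policy, request):
    """A policy matches only when all four elements agree with the request."""
    return (
        policy.subject in request.subjects
        and policy.resource == request.resource
        and policy.operation == request.operation
        and policy.cluster == request.cluster
        and request.on <= policy.valid_until
    )


def is_allowed(policies, request):
    """Assumed default deny: allow only if at least one stored policy matches."""
    return any(matches(policy, request) for policy in policies)


POLICIES = [Policy("group:bi-team", "table:dw.orders", "read", "prod", date(2030, 1, 1))]
alice = frozenset({"user:alice", "group:bi-team"})

read_ok = is_allowed(POLICIES, Request(alice, "table:dw.orders", "read", "prod", date(2025, 6, 1)))
write_ok = is_allowed(POLICIES, Request(alice, "table:dw.orders", "write", "prod", date(2025, 6, 1)))
expired_ok = is_allowed(POLICIES, Request(alice, "table:dw.orders", "read", "prod", date(2031, 1, 1)))
```

Here the read is granted through the group policy, while the write and the request made after the validity period fall through to the default deny.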
Metadata and Data Classification
The permission system relies on data‑warehouse metadata. Data is classified into high, medium, and low security levels, influencing how permissions are granted and how data is presented (e.g., masking or encryption for high‑security fields).
Security levels are assigned by the security team to upstream table fields; downstream fields inherit the highest security level from their upstream sources using a “high‑not‑low” rule.
Table security level is derived from its fields; tables containing any high‑security fields are marked high. High‑security tables require more extensive approval workflows.
After obtaining table permissions, users can view low‑security fields directly, while medium and high‑security fields are masked or encrypted.
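The classification rules above can be sketched in a few lines of Python. The `Level` enum, the helper names, the sample fields, and the masking format are hypothetical. The sketch also assumes that the table rule generalizes to "a table takes the level of its most sensitive field", which the article only states explicitly for high‑security fields.

```python
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def inherit_level(upstream_levels):
    """'High-not-low': a downstream field takes the highest upstream level.
    Assumes the field has at least one upstream source."""
    return max(upstream_levels)


def table_level(field_levels):
    """Assumed generalization: a table is as sensitive as its most sensitive field."""
    return max(field_levels.values())


def mask(value):
    """Placeholder masking: keep the first character and hide the rest."""
    text = str(value)
    return text[:1] + "*" * (len(text) - 1)


def present_row(row, field_levels):
    """Show low-security fields as-is and mask medium and high ones."""
    return {
        name: value if field_levels[name] == Level.LOW else mask(value)
        for name, value in row.items()
    }


FIELD_LEVELS = {
    "order_id": Level.LOW,
    "buyer_name": inherit_level([Level.LOW, Level.MEDIUM]),
    "id_card": inherit_level([Level.HIGH, Level.LOW]),
}

shown = present_row(
    {"order_id": 1001, "buyer_name": "Alice", "id_card": "330102"},
    FIELD_LEVELS,
)
```

Using an ordered enum keeps the "highest level wins" rule a plain `max`, both for field inheritance and for deriving the table level.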
Data Permission Verification
The system provides unified permission query and verification APIs for integration with other data‑platform applications. Different big‑data query engines implement verification differently. Metabase uses Presto, while the in‑house Xiao Cai DA uses Presto or StarRocks.
For Presto, a permission‑control plugin validates table and dashboard access, performs field‑level masking/encryption for medium/high security, and filters rows based on policies. StarRocks lacks a plugin framework, so client‑side SQL parsing is used to enforce table‑level permissions.
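The client‑side check for StarRocks can be illustrated with a deliberately naive sketch. A regular expression stands in for real SQL parsing here, which is an assumption made to keep the example short: it misses CTE names, quoted identifiers, and comments, and it would misread expressions such as EXTRACT(year FROM col). The function names and table names are hypothetical.

```python
import re

# Naive stand-in for a real SQL parser: captures the identifier that
# follows FROM or JOIN. Not suitable for production use.
TABLE_REF = re.compile(r"\b(?:from|join)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def referenced_tables(sql):
    """Return the lower-cased table names the query appears to read from."""
    return {name.lower() for name in TABLE_REF.findall(sql)}


def check_table_access(sql, granted_tables):
    """Raise PermissionError if the query touches a table the user was not granted."""
    granted = {table.lower() for table in granted_tables}
    missing = referenced_tables(sql) - granted
    if missing:
        raise PermissionError(f"no permission on: {', '.join(sorted(missing))}")


SQL = "SELECT o.id FROM dw.orders o JOIN dw.suppliers s ON o.supplier_id = s.id"

# Passes silently: both referenced tables have been granted.
check_table_access(SQL, {"dw.orders", "dw.suppliers"})
```

The check runs before the query is sent to the engine, so a denied query never reaches StarRocks. In a real client, the set of granted tables would presumably come from the unified permission query API rather than a hard‑coded set.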
7. Summary
In summary, the article briefly presents the background, objectives, design, and implementation of the Zhengcai Cloud big‑data permission system.
Zhengcai Cloud Technology (政采云技术)
ZCY Technology Team (Zero), based in Hangzhou, is a growth-oriented team passionate about technology and craftsmanship. With around 500 members, we are building comprehensive engineering, project management, and talent development systems. We are committed to innovation and creating a cloud service ecosystem for government and enterprise procurement. We look forward to your joining us.