How to Secure Data: Practical Guide to Masking and Encryption Strategies
This comprehensive guide explains why modern enterprises must shift from network‑centric protection to data‑centric security, detailing practical approaches for data masking and storage encryption, evaluating regulatory requirements, outlining solution selection, and providing step‑by‑step implementation practices to safeguard sensitive information.
1. Background
In the data era, governments and enterprises face a growing number of data‑leak incidents, such as a 2022 breach of automobile data. Attackers' goals have shifted from disrupting services to stealing information, which is driving a move from network‑ and endpoint‑centric protection to data‑centric security. China enacted the Data Security Law and the Personal Information Protection Law in 2021, marking a new era of information security.
2. Data Security Protection Measures
Effective data security requires protecting data throughout its lifecycle—collection, generation, usage, transmission, storage, sharing, disclosure, and destruction. This article focuses on two key practices: data‑storage encryption (for structured data) and data‑display masking.
2.1 Data Display Masking
What is data masking?
Data masking transforms or hides sensitive data according to defined rules so that the processed output no longer reveals the original sensitive values directly. For example, an 11‑digit phone number such as 13812345678 may be displayed as 138****5678, with the middle four digits replaced by asterisks.
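A minimal sketch of this kind of display rule is shown here. The function name `mask_phone` and its parameters are illustrative assumptions, not part of any particular library.

```python
def mask_phone(phone: str, keep_head: int = 3, keep_tail: int = 4,
               mask_char: str = "*") -> str:
    """Replace the middle of a phone number with mask characters."""
    hidden = len(phone) - keep_head - keep_tail
    if hidden <= 0:
        # Too short to keep both ends visible: mask everything.
        return mask_char * len(phone)
    tail_start = len(phone) - keep_tail
    return phone[:keep_head] + mask_char * hidden + phone[tail_start:]


print(mask_phone("13812345678"))  # 138****5678
```

Keeping the first three and last four digits is only one possible rule; how much to reveal is a policy decision for each data type.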
What problems does masking solve?
When sensitive data is displayed on web pages, apps, or mini‑programs, it can be exposed through shoulder surfing or arbitrary copying. For instance, an employee directory containing phone numbers and personal details could be scraped in bulk, leading to massive leaks if no masking is applied.
2.2 Data Storage Encryption
What is data encryption?
Historically, “data security” was equated with “storage security,” and data was treated as a static resource. The modern view treats data as dynamic and in need of protection at its source. Storage encryption writes sensitive fields to the database as ciphertext, so an attacker who obtains the database files still cannot read those fields without the keys.
3. Security Solution Research and Selection
Solution selection is critical and involves multiple stakeholders (architecture, security, DBA, CI/CD). The process includes defining sensitive data, referencing regulations, and evaluating technical options.
3.1 Sensitive Data Definition
Business data: personal sensitive information, UGC, and internally defined confidential data.
Basic support data: passwords, encryption keys, etc.
Standards such as the Financial Data Security Classification Guide (JR/T 0197‑2020) and the Basic Telecom Enterprise Data Classification Method (YD/T 3813‑2020) serve as references for classifying sensitive data.
3.1.1 Relevant Domestic Laws
The Personal Information Protection Law and Data Security Law define personal information (e.g., name, birthday, ID number, biometric data, address, phone number) and set protection requirements.
3.1.2 Comprehensive Considerations
Combining legal requirements with industry best practices leads to a classification that includes business data and basic support data.
3.1.3 Sensitive Data Tagging
After sensitive data is defined, it is scanned for and tagged. Scanning extracts rows from databases or data warehouses sequentially, at random, or in reverse order. Tagging is usually rule‑based (for example, regular expressions on values or matching on field names) and may require manual verification.
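As a rough illustration of rule‑based tagging, the sketch below combines field‑name matching with regular expressions over sampled values. The rule tables, the `tag_column` helper, and the 80% hit threshold are assumptions chosen for the example; a real rule set would be broader and tuned against false positives.

```python
import re

# Hypothetical rules: column-name hints and value patterns for two data types.
NAME_RULES = {
    "phone": re.compile(r"(phone|mobile|tel)", re.IGNORECASE),
    "id_number": re.compile(r"(id_?card|id_?no|identity)", re.IGNORECASE),
}
VALUE_RULES = {
    "phone": re.compile(r"^1[3-9]\d{9}$"),       # 11-digit mobile number
    "id_number": re.compile(r"^\d{17}[\dXx]$"),  # 18-character ID number
}


def tag_column(column_name, sample_values, min_ratio=0.8):
    """Return the set of sensitive-data tags suggested for one column."""
    tags = set()
    # Rule 1: the column name itself hints at sensitive content.
    for tag, pattern in NAME_RULES.items():
        if pattern.search(column_name):
            tags.add(tag)
    # Rule 2: most sampled values match a known sensitive format.
    values = [str(v) for v in sample_values if v is not None]
    for tag, pattern in VALUE_RULES.items():
        hits = sum(1 for v in values if pattern.match(v))
        if values and hits / len(values) >= min_ratio:
            tags.add(tag)
    return tags


print(tag_column("contact", ["13812345678", "13987654321"]))  # {'phone'}
```

Columns tagged only by name, or only by value, are natural candidates for the manual verification step.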
3.2 Data Masking Solution Research
Four main masking approaches are evaluated:
API‑level masking
Database view masking
Front‑end (JS) masking
Anonymization (de‑identification)
API‑Level Masking
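Masking is applied in the service layer before the response leaves the API. The approach is standardized and supports many data types, but every sensitive field must be configured, so omissions are possible.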
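One way to picture API‑level masking is a rule table applied to every response just before serialization. The sketch below is an assumption‑laden illustration: `MASK_RULES`, `mask_response`, and the `reveal` parameter (a stand‑in for an authorized plaintext request) are hypothetical names, not the article's actual implementation.

```python
from typing import Any, Callable, Dict, FrozenSet


def mask_middle(value: str, head: int, tail: int) -> str:
    """Keep `head` leading and `tail` trailing characters, mask the rest."""
    hidden = len(value) - head - tail
    if hidden <= 0:
        return "*" * len(value)
    return value[:head] + "*" * hidden + value[len(value) - tail:]


# Hypothetical configuration: field name -> masking function.
MASK_RULES: Dict[str, Callable[[str], str]] = {
    "phone": lambda v: mask_middle(v, 3, 4),
    "id_number": lambda v: mask_middle(v, 6, 4),
}


def mask_response(payload: Any, reveal: FrozenSet[str] = frozenset()) -> Any:
    """Recursively mask configured fields in a JSON-like structure.

    Fields listed in `reveal` are returned in plaintext, for callers that
    have been authorized to see them.
    """
    if isinstance(payload, dict):
        masked = {}
        for key, value in payload.items():
            rule = MASK_RULES.get(key)
            if rule and key not in reveal and isinstance(value, str):
                masked[key] = rule(value)
            else:
                masked[key] = mask_response(value, reveal)
        return masked
    if isinstance(payload, list):
        return [mask_response(item, reveal) for item in payload]
    return payload


response = {"user": {"name": "Li", "phone": "13812345678"},
            "contacts": [{"phone": "13900001111"}]}
print(mask_response(response))
```

Because masking depends entirely on the rule table, a field that is missing from it passes through unmasked, which is the omission risk noted above.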
Database View Masking
Views apply masking dynamically at query time. They are easy to maintain, but complex masking logic can add query overhead.
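The idea can be tried out with Python's built‑in `sqlite3` module. This is only a sketch: the `employee` table, the `employee_masked` view, and the choice of SQLite are assumptions for illustration, and a production database would have its own syntax and permission model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)"
)
conn.execute(
    "INSERT INTO employee (name, phone) VALUES (?, ?)", ("Li", "13812345678")
)

# The view exposes the same columns but masks the phone number at query time.
# substr(phone, -4) returns the last four characters in SQLite.
conn.execute(
    """
    CREATE VIEW employee_masked AS
    SELECT id,
           name,
           substr(phone, 1, 3) || '****' || substr(phone, -4) AS phone
    FROM employee
    """
)

row = conn.execute("SELECT name, phone FROM employee_masked").fetchone()
print(row)  # ('Li', '138****5678')
```

Applications would then be granted access to the view rather than the base table, so the masking expression runs on every query.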
Front‑End JS Masking
Low implementation cost, but it provides only pseudo‑masking: the plaintext still reaches the browser, so anyone who inspects the response can recover it.
Anonymization
Simple to implement, but reduces data utility and is suitable mainly for data sharing or publishing.
Optimal Masking Choice
Considering source‑level protection and the need for occasional plaintext access, API‑level masking is selected as the optimal solution.
3.3 Data Encryption Research
3.3.1 Encryption Algorithm Selection
Key considerations include international vs. domestic (GuoMi) algorithms, symmetric vs. asymmetric encryption, and choosing an algorithm with appropriate strength versus performance.
3.3.2 Key Management
Whether to add randomness to key generation.
Key rotation mechanisms.
Trusted KMS storage.
Hierarchical key structures (root, KEK, data keys).
Dynamic conversion and expiration handling.
Reference: WKMS, the Weimob key management system.
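To make the hierarchy and rotation ideas concrete, the sketch below derives a KEK from a root key and versioned data keys from the KEK using HKDF (RFC 5869) built on the standard library. This is a swapped‑in simplification: it uses key derivation, whereas a KMS typically generates random keys and wraps (encrypts) each level with the one above. All function names, labels, and the in‑code root key are assumptions; a real root key would stay inside the KMS or an HSM.

```python
import hashlib
import hmac


def hkdf_sha256(key: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256 and the default all-zero salt."""
    # Extract step: condense the input key into a pseudorandom key.
    prk = hmac.new(b"\x00" * 32, key, hashlib.sha256).digest()
    # Expand step: chain HMAC blocks until enough output is produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_kek(root_key: bytes, service: str) -> bytes:
    """Key-encryption key for one service, derived from the root key."""
    return hkdf_sha256(root_key, b"kek|" + service.encode("utf-8"))


def derive_data_key(kek: bytes, table: str, version: int) -> bytes:
    """Data key for one table; bumping `version` yields a rotated key."""
    return hkdf_sha256(kek, f"dek|{table}|v{version}".encode("utf-8"))


# Illustrative root key only; never hard-code a real one.
root_key = b"\x01" * 32
kek = derive_kek(root_key, "crm")
key_v1 = derive_data_key(kek, "customer", 1)
key_v2 = derive_data_key(kek, "customer", 2)
print(key_v1 != key_v2)  # True
```

Storing the key version alongside each ciphertext lets old records be decrypted with the old key while new writes use the rotated one.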
3.3.3 Encryption Deployment Options
Five technical options are compared:
CASB proxy gateway (deployed between client and application server).
Application‑level encryption via SDK (integrated into the service code).
Database encryption gateway (proxy between app server and DB).
Database plug‑in encryption (e.g., Oracle‑specific).
Transparent Data Encryption (TDE) at the DB layer.
After weighing each option's advantages and disadvantages, application‑level encryption was chosen for its flexibility.
3.3.4 Impact of Encryption
Encryption can affect data retrieval, sorting, and downstream synchronization. Mitigation strategies include adding auxiliary searchable fields and maintaining both plaintext and ciphertext during migration.
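One common form of auxiliary searchable field is a keyed hash of the plaintext (sometimes called a blind index) stored next to the ciphertext. The sketch below assumes this technique; the helper name `blind_index`, the normalization rule, and the sample key are illustrative. It supports exact‑match lookups only, not sorting or range queries.

```python
import hashlib
import hmac


def blind_index(value: str, index_key: bytes) -> str:
    """Deterministic keyed hash used as an exact-match search column."""
    # Normalize first so equivalent inputs map to the same index value.
    normalized = value.strip().lower()
    return hmac.new(index_key, normalized.encode("utf-8"),
                    hashlib.sha256).hexdigest()


# Illustrative key; in practice it would come from the KMS and be kept
# separate from the encryption key.
index_key = b"example-index-key"

# Each stored row keeps the ciphertext plus the index of the plaintext.
rows = [
    {"id": 1, "phone_enc": "<ciphertext>",
     "phone_idx": blind_index("13812345678", index_key)},
    {"id": 2, "phone_enc": "<ciphertext>",
     "phone_idx": blind_index("13900001111", index_key)},
]

# Lookup: hash the search term the same way and compare index values.
needle = blind_index(" 13812345678 ", index_key)
matches = [row["id"] for row in rows if row["phone_idx"] == needle]
print(matches)  # [1]
```

Because the index is deterministic, equal plaintexts produce equal index values, so this trade‑off should be weighed for low‑entropy fields.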
4. Practical Implementation Plans
4.1 Data Masking Implementation Steps
4.1.1 Masking Legacy Data
Coordinate with product and development teams to inventory affected modules, prioritize based on release schedule and business impact, and perform phased integration.
4.1.2 Masking Incremental Data
Pre‑stage: embed data‑security review in project approval. In‑stage: audit API definitions for sensitive fields and enforce masking rules via CI/CD. Post‑stage: monitor API traffic for sensitive data exposure and remediate gaps.
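For the post‑stage step, a simple monitor can scan response bodies for values that still look like plaintext phone numbers. The pattern and the `find_unmasked_phones` helper below are assumptions for illustration; a real monitor would cover more data types and would need handling for false positives such as order numbers.

```python
import re

# An 11-digit mobile number that is not part of a longer digit run.
PLAIN_PHONE = re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)")


def find_unmasked_phones(response_body: str) -> list:
    """Return phone-like values that appear unmasked in an API response."""
    return PLAIN_PHONE.findall(response_body)


body = '{"phone": "13812345678", "backup": "138****5678"}'
print(find_unmasked_phones(body))  # ['13812345678']
```

Any hit points to an API field that is missing a masking rule and can be fed back into the remediation queue.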
4.2 Data Encryption Implementation Steps
4.2.1 Encrypting Legacy Data
Follow a multi‑phase process: add encrypted columns, ensure length compatibility, double‑write data, verify via gray‑release, batch‑process existing records, monitor DB traffic, and finally deprecate plaintext columns.
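The phases can be walked through on a throwaway SQLite database. This is a sketch under stated assumptions: the `customer` table and `phone_enc` column are hypothetical, and `encrypt`/`decrypt` are placeholders that use base64, which is an encoding and NOT encryption; they stand in for whatever the encryption SDK provides.

```python
import base64
import sqlite3


def encrypt(plaintext: str) -> str:
    # PLACEHOLDER ONLY: base64 is not encryption. Swap in the real SDK call.
    return "enc:" + base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def decrypt(ciphertext: str) -> str:
    return base64.b64decode(ciphertext[4:]).decode("utf-8")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, phone TEXT)")
conn.executemany("INSERT INTO customer (phone) VALUES (?)",
                 [("13812345678",), ("13900001111",), ("13700002222",)])

# Phase 1: add the encrypted column, sized to hold the longer ciphertext.
conn.execute("ALTER TABLE customer ADD COLUMN phone_enc TEXT")


# Phase 2: double-write, so new records carry both plaintext and ciphertext.
def insert_customer(phone: str) -> None:
    conn.execute("INSERT INTO customer (phone, phone_enc) VALUES (?, ?)",
                 (phone, encrypt(phone)))


insert_customer("13600003333")


# Phase 3: backfill existing records in small batches.
def backfill(batch_size: int = 2) -> int:
    total = 0
    while True:
        rows = conn.execute(
            "SELECT id, phone FROM customer "
            "WHERE phone_enc IS NULL AND phone IS NOT NULL LIMIT ?",
            (batch_size,)).fetchall()
        if not rows:
            return total
        conn.executemany("UPDATE customer SET phone_enc = ? WHERE id = ?",
                         [(encrypt(phone), row_id) for row_id, phone in rows])
        conn.commit()
        total += len(rows)


migrated = backfill()

# Phase 4: verify that every ciphertext decrypts back to its plaintext
# before reads are switched over and the plaintext column is retired.
mismatches = [row_id for row_id, phone, enc in
              conn.execute("SELECT id, phone, phone_enc FROM customer")
              if enc is None or decrypt(enc) != phone]
print(migrated, mismatches)  # 3 []
```

Dropping the plaintext column is deliberately left out: it should happen only after gray‑release verification and traffic monitoring show that nothing still reads it.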
4.2.2 Encrypting Incremental Data
Pre‑stage: include encryption requirements in security reviews. In‑stage: audit DDL statements (CREATE/ALTER) for sensitive columns and enforce encryption via CI/CD checks. Post‑stage: regularly scan databases for newly added sensitive fields and enforce encryption.
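The in‑stage DDL audit could look roughly like the check below, which flags sensitive‑looking columns in CREATE or ALTER statements that do not follow an encrypted‑column naming convention. The `_enc` suffix, the name patterns, and the small list of column types are assumptions; a regex scan is a rough heuristic, and a production check would use a proper SQL parser.

```python
import re

# Hypothetical convention: encrypted columns end with "_enc".
ENCRYPTED_SUFFIX = "_enc"
SENSITIVE_NAME = re.compile(
    r"(phone|mobile|id_?card|id_?no|passw(or)?d|email)", re.IGNORECASE)
DDL_START = re.compile(r"^\s*(CREATE|ALTER)\s+TABLE\b", re.IGNORECASE)
# A column definition: an identifier followed by a common column type.
COLUMN_DEF = re.compile(
    r"[`\"]?(\w+)[`\"]?\s+(?:VARCHAR|CHAR|TEXT|INT|BIGINT)\b", re.IGNORECASE)


def audit_ddl(statement: str) -> list:
    """Return sensitive-looking columns that are not marked as encrypted."""
    if not DDL_START.match(statement):
        return []
    return [column for column in COLUMN_DEF.findall(statement)
            if SENSITIVE_NAME.search(column)
            and not column.lower().endswith(ENCRYPTED_SUFFIX)]


ddl = ("CREATE TABLE customer ("
       "id BIGINT, phone VARCHAR(20), phone_enc VARCHAR(128))")
print(audit_ddl(ddl))  # ['phone']
```

A CI/CD job could fail the build, or request a security review, whenever the returned list is non‑empty.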
5. Conclusion
Data security is a long‑term, complex undertaking involving many business lines, high implementation difficulty, and extensive integration. Masking and encryption are only two parts of the lifecycle; continuous improvement and additional safeguards are required to protect user privacy.
6. References
[1] NSFOCUS, “Privacy Attack and Defense in Big Data: Effective Masking of ID Numbers and Phone Numbers.”
[2] Lian Shi Network CG, “Understanding Ten Data‑Storage Encryption Techniques.”
[3] JR/T 0197‑2020, “Financial Data Security Classification Guide.”
[4] YD/T 3813‑2020, “Basic Telecom Enterprise Data Classification Method.”
[5] Eleven Liu, “Dynamic Web‑Query Sensitive Data Masking via Database Views.”
Weimob Technology Center
Official platform of the Weimob Technology Center