Fundamental Concepts and Best Practices for Database Design
This article presents a comprehensive guide to database design, covering the relationship between source documents and entities, primary and foreign keys, table properties, normalization standards, handling many‑to‑many relationships, key selection, data redundancy, ER diagram considerations, view usage, intermediate and temporary tables, integrity constraints, the "Three‑Few" principle, and practical techniques for improving database performance.
1. Relationship Between Original Documents and Entities
Original documents may map to entities in one‑to‑one, one‑to‑many, or many‑to‑one relationships; understanding this mapping helps design effective data‑entry interfaces.
Example: An employee’s resume may correspond to three basic tables—personal info, social relations, and work history—illustrating a one‑document‑to‑multiple‑entities scenario.
2. Primary Keys and Foreign Keys
Every entity should have a primary key (PK), and an entity that references another entity also needs a foreign key (FK). In ER diagrams, a leaf entity may omit its own PK, but it must have an FK that links it to its parent.
The pairing of PKs and FKs is the backbone of relational integrity, representing connections between entities.
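The PK/FK pairing can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a prescribed schema: the department and employee tables and their columns are hypothetical names chosen for the example, and SQLite is assumed only because it needs no server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical parent/child tables: each has a PK, the child also has an FK.
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)
);
""")

conn.execute("INSERT INTO department VALUES (1, 'Engineering')")
conn.execute("INSERT INTO employee VALUES (1, 'Ada', 1)")


def try_insert_orphan(conn):
    """Try to insert an employee whose department does not exist."""
    try:
        conn.execute("INSERT INTO employee VALUES (2, 'Bob', 99)")
    except sqlite3.IntegrityError:
        return False  # the FK constraint rejected the row
    return True


orphan_accepted = try_insert_orphan(conn)
employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

The second insert fails because no department 99 exists, so the child row cannot be linked to a parent.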
3. Characteristics of Base Tables
Base tables differ from intermediate or temporary tables by possessing four key properties:
Atomicity: Fields cannot be further decomposed.
Originality: Records store raw, foundational data.
Derivability: All output data can be derived from the data in base tables and code tables.
Stability: Structure is relatively stable and records are retained long‑term.
4. Normalization Standards
Designs should aim for the Third Normal Form (3NF) to eliminate redundancy, but practical performance considerations sometimes justify controlled denormalization (adding redundant fields) to trade space for speed.
Example: Adding a calculated "Amount" column (price × quantity) to a product table speeds up reporting despite violating 3NF.
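The trade-off can be sketched as follows, assuming SQLite through Python's sqlite3 module. The product table layout and the add_product helper are hypothetical; only the "Amount = price × quantity" idea comes from the text. The redundant column is filled at write time so that reports can sum it directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    price      REAL    NOT NULL,
    quantity   INTEGER NOT NULL,
    amount     REAL    NOT NULL  -- redundant: always price * quantity
)
""")


def add_product(conn, price, quantity):
    """Insert a row and store the derived amount alongside its inputs."""
    conn.execute(
        "INSERT INTO product (price, quantity, amount) VALUES (?, ?, ?)",
        (price, quantity, price * quantity),
    )


add_product(conn, 2.5, 4)
add_product(conn, 10.0, 3)

# Reporting query on the stored column: no multiplication at read time.
total_stored = conn.execute("SELECT SUM(amount) FROM product").fetchone()[0]
# The normalized equivalent recomputes the product for every row.
total_computed = conn.execute(
    "SELECT SUM(price * quantity) FROM product"
).fetchone()[0]
```

The cost of the redundant column is that every write path must keep amount in step with price and quantity; routing all writes through one helper, as here, is one way to limit that risk.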
5. Plain‑Language Explanation of the Three Normal Forms
1NF: Enforces atomic attributes.
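2NF: Each record is uniquely identified by its key, and every non‑key field depends on the whole key rather than on part of a composite key.
3NF: No non‑key field depends on another non‑key field, which rules out fields that can be derived from other fields.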
While a fully normalized design reduces redundancy, selective denormalization can improve query performance.
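As a rough illustration of the 3NF idea, the sketch below uses plain Python data structures. The rows and field names are made up for the example. In the flat layout, dept_name depends on dept_id rather than on the key emp_id, so the department name is repeated; splitting it out removes the repetition without losing information.

```python
# Hypothetical flat records: dept_name depends on dept_id, not on emp_id.
flat = [
    {"emp_id": 1, "emp_name": "Ada", "dept_id": 10, "dept_name": "Engineering"},
    {"emp_id": 2, "emp_name": "Bob", "dept_id": 10, "dept_name": "Engineering"},
    {"emp_id": 3, "emp_name": "Cy", "dept_id": 20, "dept_name": "Sales"},
]

# Decompose: employees keep only fields that depend on emp_id.
employees = [
    {"emp_id": r["emp_id"], "emp_name": r["emp_name"], "dept_id": r["dept_id"]}
    for r in flat
]
# Each department name is now stored once, keyed by dept_id.
departments = {r["dept_id"]: r["dept_name"] for r in flat}

# Joining the two relations on dept_id reproduces the original rows.
rejoined = [dict(e, dept_name=departments[e["dept_id"]]) for e in employees]
```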
6. Handling Many‑to‑Many Relationships
Many‑to‑many relationships should be resolved by introducing a junction (third) entity, converting the relationship into two one‑to‑many links and assigning appropriate foreign keys.
Example: In a library system, a "BorrowReturn" table links books and readers, storing borrow/return timestamps and foreign keys to both entities.
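The library example might look like the sketch below, again assuming SQLite via Python's sqlite3 module. The table is called borrow_return after the "BorrowReturn" entity in the text; the column names and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable FK checks in SQLite

conn.executescript("""
CREATE TABLE book   (book_id   INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE reader (reader_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);

-- Junction entity: one row per loan, with an FK to each side.
CREATE TABLE borrow_return (
    borrow_id   INTEGER PRIMARY KEY,
    book_id     INTEGER NOT NULL REFERENCES book(book_id),
    reader_id   INTEGER NOT NULL REFERENCES reader(reader_id),
    borrowed_at TEXT NOT NULL,
    returned_at TEXT              -- NULL while the book is still out
);

INSERT INTO book   VALUES (1, 'SQL Basics'), (2, 'Data Modeling');
INSERT INTO reader VALUES (1, 'Ada'), (2, 'Bob');
INSERT INTO borrow_return (book_id, reader_id, borrowed_at, returned_at) VALUES
    (1, 1, '2024-01-02', '2024-01-09'),
    (1, 2, '2024-01-10', NULL),
    (2, 1, '2024-01-05', NULL);
""")

# One book, many readers: follow the junction table from the book side.
readers_of_book_1 = [row[0] for row in conn.execute("""
    SELECT r.name FROM borrow_return br
    JOIN reader r ON r.reader_id = br.reader_id
    WHERE br.book_id = 1 ORDER BY r.name
""")]

# One reader, many books: follow it from the reader side.
books_of_reader_1 = [row[0] for row in conn.execute("""
    SELECT b.title FROM borrow_return br
    JOIN book b ON b.book_id = br.book_id
    WHERE br.reader_id = 1 ORDER BY b.title
""")]
```

Each side of the original many-to-many relationship is now an ordinary one-to-many link into borrow_return.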
7. Choosing Primary Key Values
PKs can be surrogate numeric identifiers generated automatically or meaningful business fields; surrogate keys are generally preferred, and composite keys should be kept minimal.
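One reason surrogate keys are often preferred can be sketched as follows. The customer and purchase tables are hypothetical, and SQLite's automatically assigned INTEGER PRIMARY KEY stands in for whatever key generator a given DBMS offers. The business field (email) stays unique but is not the key, so it can change without touching any referencing row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,   -- surrogate key, assigned by SQLite
    email       TEXT NOT NULL UNIQUE   -- business field, kept unique
);
CREATE TABLE purchase (
    purchase_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    item        TEXT NOT NULL
);
""")

# lastrowid returns the key generated for the new row.
ada_id = conn.execute(
    "INSERT INTO customer (email) VALUES (?)", ("ada@example.com",)
).lastrowid
conn.execute(
    "INSERT INTO purchase (customer_id, item) VALUES (?, ?)", (ada_id, "keyboard")
)

# The business value changes; the key and every FK pointing at it do not.
conn.execute(
    "UPDATE customer SET email = ? WHERE customer_id = ?",
    ("ada@example.org", ada_id),
)

row = conn.execute("""
    SELECT c.email, p.item FROM purchase p
    JOIN customer c ON c.customer_id = p.customer_id
""").fetchone()
```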
8. Recognizing Data Redundancy
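Repeating a PK as an FK in another table is not redundancy. Redundancy refers to repeated non‑key fields, which is low‑level redundancy and should be avoided. Derived fields (e.g., "Amount" from price and quantity) are high‑level redundancy, which can be introduced intentionally for performance.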
9. No Single Correct ER Diagram
ER diagrams have no unique answer; a good diagram is clear, concise, appropriately scoped, and free of low‑level redundancy.
10. Usefulness of Views
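Views are virtual tables built on base tables. They provide abstraction and security and can simplify complex processing; in some DBMSs (for example, with materialized views) they can also improve performance. Views defined on other views should generally be nested no more than three levels deep.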
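A small sketch of the abstraction and security aspect, assuming SQLite via Python's sqlite3 module. The employee table and the employee_public view are hypothetical names. The view exposes only the non-sensitive columns of its base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    salary REAL NOT NULL
);
INSERT INTO employee VALUES (1, 'Ada', 9000.0), (2, 'Bob', 7000.0);

-- The view stores no data of its own; it is a saved query on the base table.
CREATE VIEW employee_public AS
    SELECT emp_id, name FROM employee;
""")

cursor = conn.execute("SELECT * FROM employee_public ORDER BY emp_id")
view_columns = [col[0] for col in cursor.description]  # column names only
view_rows = cursor.fetchall()
```

Callers that query employee_public never see the salary column, and the base table can change behind the view as long as the two exposed columns remain available.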
11. Intermediate, Report, and Temporary Tables
Intermediate tables store aggregated data for reporting or data‑warehousing and may lack PK/FK constraints; temporary tables are session‑specific and managed by developers.
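The difference can be sketched in SQLite through Python's sqlite3 module. All table names here are made up for the example. The intermediate table holds a precomputed aggregate and, being created from a query, carries no PK; the temporary table exists only for the connection that created it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sale (
    sale_id INTEGER PRIMARY KEY,
    region  TEXT NOT NULL,
    amount  REAL NOT NULL
);
INSERT INTO sale (region, amount) VALUES
    ('north', 100.0), ('north', 50.0), ('south', 70.0);

-- Intermediate table: aggregated once, then read by reports.
CREATE TABLE sales_by_region AS
    SELECT region, SUM(amount) AS total FROM sale GROUP BY region;

-- Temporary table: visible only to this connection, dropped when it closes.
CREATE TEMP TABLE big_sales AS
    SELECT * FROM sale WHERE amount >= 70.0;
""")

summary = conn.execute(
    "SELECT region, total FROM sales_by_region ORDER BY region"
).fetchall()

# PRAGMA table_info reports a pk flag per column (index 5); none is set here.
summary_has_pk = any(
    row[5] for row in conn.execute("PRAGMA table_info(sales_by_region)")
)
temp_count = conn.execute("SELECT COUNT(*) FROM big_sales").fetchone()[0]
```

Because sales_by_region is a snapshot, it has to be rebuilt or refreshed whenever the underlying sale rows change.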
12. Integrity Constraints
Domain integrity – enforced with CHECK constraints.
Referential integrity – enforced with PK/FK and triggers.
User‑defined integrity – enforced with stored procedures and triggers.
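The three kinds of constraint can be sketched together, assuming SQLite via Python's sqlite3 module. The account and withdrawal tables and the 1000 limit are invented for the example. SQLite has no stored procedures, so the user-defined rule is shown with a trigger only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # referential checks are opt-in

conn.executescript("""
CREATE TABLE account (
    account_id INTEGER PRIMARY KEY,
    balance    REAL NOT NULL CHECK (balance >= 0)       -- domain integrity
);
CREATE TABLE withdrawal (
    withdrawal_id INTEGER PRIMARY KEY,
    account_id    INTEGER NOT NULL
                  REFERENCES account(account_id),       -- referential integrity
    amount        REAL NOT NULL CHECK (amount > 0)
);
-- User-defined integrity: a business rule expressed as a trigger.
CREATE TRIGGER withdrawal_limit
BEFORE INSERT ON withdrawal
WHEN NEW.amount > 1000
BEGIN
    SELECT RAISE(ABORT, 'withdrawal exceeds limit');
END;
INSERT INTO account VALUES (1, 500.0);
""")


def rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.DatabaseError:  # IntegrityError is a subclass of this
        return True
    return False


check_blocked = rejected("INSERT INTO account VALUES (2, -1.0)")
fk_blocked = rejected(
    "INSERT INTO withdrawal (account_id, amount) VALUES (99, 10.0)"
)
trigger_blocked = rejected(
    "INSERT INTO withdrawal (account_id, amount) VALUES (1, 5000.0)"
)
valid_blocked = rejected(
    "INSERT INTO withdrawal (account_id, amount) VALUES (1, 100.0)"
)
```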
13. The "Three‑Few" Principle to Avoid Patchwork Designs
Minimize the number of tables.
Minimize the number of columns in composite primary keys.
Minimize the total number of columns per table.
Applying these rules encourages a concise, well‑integrated data model.
14. Techniques for Improving Database Performance
Denormalize strategically and prefer stored procedures over triggers.
Offload massive computations to external programs (e.g., C++) before loading results.
Horizontally partition tables with >10 million rows; vertically split tables with >80 columns.
Tune DBMS parameters (buffer pools, etc.) and apply SQL best‑practice optimizations.
Implement efficient algorithms in application code.
Overall, performance gains require simultaneous optimization at the system, design, and implementation levels.
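Horizontal partitioning, one of the techniques listed above, can be sketched at the application level as follows. This is a toy illustration under stated assumptions: the orders_N tables, the shard count of 4, and the modulo routing rule are all hypothetical, and many DBMSs provide built-in partitioning that would replace this hand-written routing.

```python
import sqlite3

SHARD_COUNT = 4  # hypothetical number of horizontal partitions


def shard_table(order_id):
    """Pick the partition for a row from its key (simple modulo routing)."""
    return f"orders_{order_id % SHARD_COUNT}"


conn = sqlite3.connect(":memory:")
for i in range(SHARD_COUNT):
    # Every partition has the same structure; only the rows differ.
    conn.execute(
        f"CREATE TABLE orders_{i} (order_id INTEGER PRIMARY KEY, total REAL NOT NULL)"
    )


def insert_order(conn, order_id, total):
    """Write the row into the partition chosen by its key."""
    conn.execute(
        f"INSERT INTO {shard_table(order_id)} VALUES (?, ?)", (order_id, total)
    )


def find_order(conn, order_id):
    """Look up a row by key, touching only the partition that can hold it."""
    return conn.execute(
        f"SELECT order_id, total FROM {shard_table(order_id)} WHERE order_id = ?",
        (order_id,),
    ).fetchone()


for oid in range(1, 9):
    insert_order(conn, oid, oid * 10.0)

rows_in_first_partition = conn.execute(
    "SELECT COUNT(*) FROM orders_0"
).fetchone()[0]
```

A lookup by key scans one small table instead of one large one; queries that do not filter on the key still have to visit every partition.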