
Scaling Strategies: Hardware Expansion, AKF Partitioning, and Distributed ID Generation

This article explains why scaling is necessary, outlines hardware and component expansion strategies, introduces the AKF partitioning principle for horizontal and vertical scaling, discusses challenges after splitting, and reviews database clustering and distributed ID generation techniques such as UUID and Snowflake, highlighting their advantages and drawbacks.

Architecture Digest

Nov 9, 2021


Why Scale

Put simply, performance optimisation has an upper limit. For a high-traffic application you can protect servers with rate limiting, resource isolation, and similar measures, but the ceiling remains. Beyond that point you have to add capacity, for example with stronger CPUs and more memory.

Scaling Strategies

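Scaling can be divided into two types: whole-machine scaling, where complete servers (CPU, memory, and storage together) are bought from a vendor, and component-level scaling, where individual parts such as memory, disks, or CPUs are added or upgraded.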

Whole‑Machine Hardware

Whole-machine scaling relies on professional server vendors (IBM, Inspur, Dell, HP) that provide well-matched hardware and stability, much like buying a pre-assembled PC rather than building one yourself.

Component Scaling

Companies with strong in-house engineering often buy individual components to reduce cost and tailor machines to the workload: more CPU for compute-intensive work, more memory for I/O-intensive work, and more disk for storage-heavy work.

Components include:

CPU : Intel, AMD, frequency, core count, etc.

Network Card : 100 Mbps → 1 Gbps → 10 Gbps.

Memory : ECC (error-correcting code) memory.

Disk : SCSI HDD, HHD (hybrid hard drive), SATA SSD, PCIe SSD, NVMe SSD.

AKF Partitioning Principle

The AKF scale cube describes three axes of scaling. X-axis scaling clones the whole service so that requests are distributed across multiple machines, but data synchronisation becomes harder as the number of machines grows, so replication cannot continue indefinitely. Y-axis scaling therefore splits the system by business function, so that hot-spot services are isolated and only those are scaled.

After the business split, a single hot service may still exceed capacity. It can be partitioned further along the Z axis by deploying it, together with its data, in multiple regions (e.g., Hubei, Beijing, Shanghai) so that users are served by the nearest server.
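The three kinds of split can be sketched as a small routing table. This is only an illustration: the service names, regions, and host names are hypothetical, and a real deployment would rely on a load balancer or service registry rather than an in-memory dictionary.

```python
from itertools import cycle

# Hypothetical topology: service (functional split) -> region (regional split)
# -> identical replicas (horizontal clones).
TOPOLOGY = {
    "orders": {
        "beijing": ["orders-bj-1", "orders-bj-2"],
        "shanghai": ["orders-sh-1"],
    },
    "payments": {
        "beijing": ["pay-bj-1"],
        "shanghai": ["pay-sh-1", "pay-sh-2"],
    },
}

# One round-robin iterator per (service, region) pair spreads load over clones.
_round_robin = {
    (service, region): cycle(replicas)
    for service, regions in TOPOLOGY.items()
    for region, replicas in regions.items()
}


def route(service: str, region: str) -> str:
    """Pick a host: by service first, then by region, then round-robin."""
    if service not in TOPOLOGY:
        raise KeyError(f"unknown service: {service}")
    if region not in TOPOLOGY[service]:
        raise KeyError(f"service {service} is not deployed in {region}")
    return next(_round_robin[(service, region)])


print(route("orders", "beijing"))  # orders-bj-1
print(route("orders", "beijing"))  # orders-bj-2
```

Adding a replica to one list scales only that service in that region, which is the point of isolating hot spots before cloning them.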

Problems After Splitting and Scaling

As business grows, systems become large and are split into independent yet interconnected projects (trading, finance, production, logistics, website, etc.). Distributed architectures introduce issues such as data sharing, interface calls, data‑persistence avalanches, high concurrency, and data consistency.

Data Sharing Issue : Synchronising data across services; solutions include data centres or database clusters.

Interface Call Issue : Services must call each other across the network; solutions include Remote Procedure Call (RPC) frameworks such as Java RMI and Dubbo.

Persistence Avalanche : Database sharding, resource isolation, Redis persistence strategies (RDB, AOF).

High Concurrency : Cache problems (breakdown, penetration, avalanche) and data‑loop handling across multiple client platforms.

Data Consistency : Ensuring identical data (e.g., price) across servers, often using distributed locks.
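Two of the cache problems named above have simple, commonly used mitigations, sketched here under stated assumptions. Cache penetration (repeated lookups for keys that do not exist) is reduced by caching the miss itself, and cache avalanche (many keys expiring together) is reduced by adding random jitter to the expiry time. The `CacheAside` class, its parameters, and the in-memory dictionary standing in for a cache server are all hypothetical; cache breakdown on a single hot key would additionally need a per-key lock, which is not shown.

```python
import random
import time

_MISSING = object()  # sentinel cached for keys the database does not have


class CacheAside:
    def __init__(self, load_from_db, base_ttl=60.0, jitter=10.0, now=time.monotonic):
        self._load = load_from_db      # callable returning None for absent rows
        self._base_ttl = base_ttl
        self._jitter = jitter
        self._now = now
        self._store = {}               # key -> (value, expires_at)
        self.db_calls = 0              # counts how often the database is hit

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None and entry[1] > self._now():
            value = entry[0]
            return None if value is _MISSING else value

        self.db_calls += 1
        value = self._load(key)
        # A randomised TTL spreads expiry times so keys do not expire together.
        ttl = self._base_ttl + random.uniform(0, self._jitter)
        # Caching the miss keeps repeated lookups of absent keys off the database.
        cached = _MISSING if value is None else value
        self._store[key] = (cached, self._now() + ttl)
        return value


db = {"sku-1": 9.99}
cache = CacheAside(db.get)
cache.get("sku-1")
cache.get("no-such-sku")
cache.get("no-such-sku")
print(cache.db_calls)  # 2
```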

Database Scaling: Clustering

A distributed system is not the same as a cluster. Distribution splits one task into subtasks that run on different nodes, which shortens the execution time of a single task; clustering runs the same service on several nodes, which increases the number of tasks completed per unit time. When a single database cannot meet demand, a cluster (master-master or master-slave) distributes read/write load across multiple servers.
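As a minimal sketch of how a master-slave cluster spreads load, the router below sends writes to the master and reads to a randomly chosen slave. The class name, host names, and the "starts with SELECT" heuristic are assumptions made for illustration; production setups usually delegate this to a database proxy or driver.

```python
import random
from typing import Sequence


class ReadWriteRouter:
    """Send writes to the master and spread reads across the slaves."""

    def __init__(self, master: str, slaves: Sequence[str]):
        self.master = master
        self.slaves = list(slaves)

    def pick(self, sql: str) -> str:
        # Treat statements that start with SELECT as reads; everything else
        # (INSERT, UPDATE, DELETE, DDL) goes to the master.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self.slaves:
            return random.choice(self.slaves)
        return self.master


router = ReadWriteRouter("db-master", ["db-slave-1", "db-slave-2"])
print(router.pick("INSERT INTO orders VALUES (1)"))  # db-master
print(router.pick("SELECT * FROM orders"))           # one of the slaves
```

With asynchronous replication a slave can lag behind the master, so a read issued immediately after a write may need to go to the master as well.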

Distributed ID

Large distributed systems need globally unique identifiers. A single database's auto-increment column is a poor fit: the values are predictable, and one database issuing every ID becomes a bottleneck once data is spread across several databases. A replacement scheme has to balance uniqueness, ordering, and security.

Distributed ID Requirements

Global uniqueness.

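Trend-increasing: IDs generally grow over time, so a B-tree index such as the clustered primary key in MySQL InnoDB can append new rows instead of inserting at random positions.

Monotonic increase: each new ID is larger than the previous one, which use cases such as versioning or incremental synchronisation depend on.

Security (non-sequential to prevent data scraping). This conflicts with the two ordering requirements, so a single scheme usually cannot satisfy all of them at once.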

Additional operational requirements: low average and 99.9th-percentile latency, five-nines (99.999%) or six-nines (99.9999%) availability, and high QPS.
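To make the availability targets concrete, the arithmetic below converts a number of nines into the downtime allowed per year. The helper name is hypothetical and the year is taken as 365 days.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability with `nines` nines."""
    unavailability = 10 ** -nines   # e.g. 5 nines -> 0.00001
    return MINUTES_PER_YEAR * unavailability


for n in (5, 6):
    minutes = downtime_minutes_per_year(n)
    print(f"{n} nines: {minutes:.3f} minutes ({minutes * 60:.1f} seconds) per year")
```

Five nines leaves roughly five minutes of downtime a year and six nines roughly half a minute, which is why an ID service that every write depends on is held to such a strict target.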

Distributed ID Generation Strategies

Common strategies include UUID, Snowflake, and generators built on Redis or ZooKeeper. Below are brief notes on UUID and Snowflake.

UUID Generation Algorithm

UUID (Universally Unique Identifier) is a 128‑bit value represented as 32 hexadecimal digits in 5 groups (8‑4‑4‑4‑12). Example: 550e8400-e29b-41d4-a716-446655440000. There are five UUID versions defined in RFC 4122.
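The format described above can be checked with Python's standard `uuid` module. The snippet generates a random (version 4) UUID locally and also parses the example value from the text; the variable names are illustrative only.

```python
import uuid

u = uuid.uuid4()   # version 4: generated locally from random bits
text = str(u)

print(text)
print(len(text))                           # 36 characters: 32 hex digits + 4 hyphens
print([len(g) for g in text.split("-")])   # [8, 4, 4, 4, 12]
print(len(u.bytes) * 8)                    # 128 bits

# The version number is stored in the first digit of the third group.
example = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")
print(example.version)                     # 4
```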

Advantages :

Very high performance: generated locally without network overhead.

Disadvantages :

Large size (16 bytes, 36‑character string) makes storage costly.

Potential security risk: MAC‑address‑based UUIDs can expose hardware information.

Unsuitable as primary keys in databases like MySQL due to length and lack of order, leading to index bloat and performance degradation.
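The size and privacy drawbacks can be seen directly with Python's `uuid` module. In this sketch the MAC address is a made-up value passed in explicitly; by default `uuid.uuid1()` uses an address obtained from the host.

```python
import uuid

# Hypothetical MAC address, supplied explicitly for the demonstration.
FAKE_MAC = 0x001A2B3C4D5E

u1 = uuid.uuid1(node=FAKE_MAC)   # version 1: timestamp plus node (MAC) field

# Anyone holding the ID can read the node field back out of it.
leaked = f"{u1.node:012x}"
print(leaked)                     # 001a2b3c4d5e
print(str(u1)[-12:])              # the same 12 hex digits end the string form

# Storage cost compared with a 64-bit integer key.
print(len(u1.bytes))              # 16 bytes in binary form
print(len(str(u1)))               # 36 characters in text form
print(len((2**63 - 1).to_bytes(8, "big")))  # 8 bytes for a 64-bit integer
```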

Snowflake Generation Algorithm

Snowflake packs several fields into one 64-bit integer: 1 unused sign bit, a 41-bit millisecond timestamp, a 10-bit machine ID, and a 12-bit sequence. The 41-bit timestamp covers about 69 years (2^41 ms) counted from a chosen epoch; the 10-bit machine ID can represent up to 1024 machines (or be split into a 5-bit datacenter ID and a 5-bit worker ID); the 12-bit sequence allows 4096 IDs per millisecond per machine, for a theoretical peak of about 4.1 million IDs per second per machine.
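The figures above follow from the bit widths, as this short calculation shows. The `compose` and `decompose` helpers are hypothetical names used to illustrate how the fields share one integer; the timestamp passed in is assumed to be milliseconds since the chosen epoch.

```python
TIMESTAMP_BITS = 41
MACHINE_BITS = 10
SEQUENCE_BITS = 12

MS_PER_YEAR = 365 * 24 * 60 * 60 * 1000

years = (1 << TIMESTAMP_BITS) / MS_PER_YEAR   # about 69.7
machines = 1 << MACHINE_BITS                   # 1024
ids_per_ms = 1 << SEQUENCE_BITS                # 4096
ids_per_second = ids_per_ms * 1000             # 4,096,000 per machine


def compose(timestamp_ms: int, machine_id: int, sequence: int) -> int:
    """Pack the three fields, timestamp in the highest bits."""
    return (
        (timestamp_ms << (MACHINE_BITS + SEQUENCE_BITS))
        | (machine_id << SEQUENCE_BITS)
        | sequence
    )


def decompose(snowflake_id: int):
    """Split an ID back into (timestamp_ms, machine_id, sequence)."""
    sequence = snowflake_id & (ids_per_ms - 1)
    machine_id = (snowflake_id >> SEQUENCE_BITS) & (machines - 1)
    timestamp_ms = snowflake_id >> (MACHINE_BITS + SEQUENCE_BITS)
    return timestamp_ms, machine_id, sequence


print(round(years, 1), machines, ids_per_second)
print(decompose(compose(123456789, 5, 42)))    # (123456789, 5, 42)
```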

Advantages:

Timestamp in high bits makes IDs trend‑increasing.

Independent of external services; high stability and performance.

Flexible bit allocation to suit business needs.

Disadvantages:

Strong dependency on system clock; clock rollback can cause duplicate IDs or service outage.
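A minimal generator sketch makes the clock dependency visible. It is an illustration under assumptions, not a reference implementation: the epoch value, the class names, and the choice to raise an error on rollback (rather than wait or switch to a spare machine ID) are all choices made for this example.

```python
import threading
import time

MACHINE_BITS = 10
SEQUENCE_BITS = 12
SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1
EPOCH_MS = 1_600_000_000_000  # assumed custom epoch; any fixed past instant works


class ClockMovedBackwards(RuntimeError):
    pass


class SnowflakeGenerator:
    def __init__(self, machine_id: int, clock=None):
        if not 0 <= machine_id < (1 << MACHINE_BITS):
            raise ValueError("machine_id out of range")
        self.machine_id = machine_id
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Clock rollback: refuse to issue IDs rather than risk duplicates.
                raise ClockMovedBackwards(f"clock moved back {self._last_ms - now} ms")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & SEQUENCE_MASK
                if self._sequence == 0:
                    # All 4096 values used in this millisecond: wait for the next.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS))
                | (self.machine_id << SEQUENCE_BITS)
                | self._sequence
            )


gen = SnowflakeGenerator(machine_id=3)
print(gen.next_id() < gen.next_id())  # True: IDs from one generator increase
```

Raising an error trades availability for correctness, which is the "service outage" side of the drawback; silently continuing would be the "duplicate IDs" side.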

Elastic Scaling

Elastic scaling automatically expands resources according to a schedule and releases them later, addressing predictable peak‑valley demand and improving resource utilisation.
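Schedule-driven scaling can be sketched as a lookup from the current hour to a desired instance count. The `ScheduledWindow` type, the hours, and the instance counts below are hypothetical; a real platform would feed the result to its own provisioning API.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ScheduledWindow:
    start_hour: int   # inclusive, 0-23
    end_hour: int     # exclusive
    instances: int    # capacity wanted while the window is active


def desired_instances(hour: int, windows: Iterable[ScheduledWindow], baseline: int) -> int:
    """Return the capacity for `hour`: the largest active window, never below baseline."""
    active = [w.instances for w in windows if w.start_hour <= hour < w.end_hour]
    return max([baseline] + active)


windows = [ScheduledWindow(9, 12, 8), ScheduledWindow(19, 22, 12)]
print([desired_instances(h, windows, baseline=2) for h in (8, 10, 15, 20, 22)])
# [2, 8, 2, 12, 2]
```

Outside the scheduled windows the count falls back to the baseline, which is the "release them later" half of the behaviour.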

It addresses two problems of fixed provisioning: virtual machines are not very elastic (provisioning is slow and requires coordination across departments), and sizing capacity for peak load drives up IT cost.
