
Overview of IBM GPFS (General Parallel File System) Architecture and Features

The article provides a comprehensive overview of IBM's General Parallel File System (GPFS), detailing its physical and logical architecture, key components such as NSD and quorum mechanisms, scalability, load balancing, and fault‑tolerance features for parallel and serial applications.


GPFS (General Parallel File System) is IBM's first shared file system; it is a parallel disk file system that allows all nodes in a resource group to access the entire file system concurrently, supporting both parallel and serial applications through a unified naming interface.

GPFS File System Architecture

The physical architecture consists of three layers: the storage layer, the GPFS server layer, and the client layer. Storage devices are presented to server nodes via a SAN environment; the servers format these disks into the GPFS parallel file system format, and clients connect over Ethernet using a private file‑system protocol to achieve concurrent I/O.

At the lowest level are the physical disks; the SAN allocates LUNs to NSD (Network Shared Disk) servers, which format them as NSD disks. These NSD disks are then assembled by the GPFS service layer into a GPFS file system that is shared with all clients via the private protocol.
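For illustration, NSD disks are typically defined through a stanza file passed to the `mmcrnsd` command. A minimal sketch of such a file is shown below; the device paths, NSD name, and server hostnames are hypothetical:

```text
# nsd.stanza — example NSD definition (hostnames and devices are placeholders)
%nsd: device=/dev/sdb
  nsd=nsd1
  servers=nsdserver1,nsdserver2
  usage=dataAndMetadata
  failureGroup=1
```

The `servers` list orders the NSD servers that handle I/O for this disk, and `failureGroup` assigns the disk to a failure group, both of which are discussed later in this article.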

Alternatively, the service and client layers can be merged so that each NSD server acts both as a service provider and a client, resulting in a flatter architecture with shallower I/O depth and better performance, albeit requiring more SAN resources.

GPFS Logical Architecture

Parallel read/write is realized by the core daemon mmfsd, which spawns subprocesses to manage configuration, the file system, and inode information. Applications issue standard OS file I/O calls, which the operating system maps onto GPFS-managed inodes.

Each GPFS client maintains a local inode map, while GPFS maintains a global inode map, ensuring consistency across the cluster without requiring any special application modifications.

Logical Objects in GPFS

Network Shared Disk (NSD): Storage devices allocated to servers and presented as GPFS‑managed physical disks, shared among all cluster nodes.

GPFS File System: Created on top of NSD disks, containing metadata and address‑space tables like any other file system.

Service Cluster Nodes and Client Nodes: Service nodes provide GPFS services to client nodes; the software installation is identical for both, with roles assigned during cluster configuration.

Quorum Node and Tiebreaker Disk: Resources used to determine cluster health when communication failures occur; either a node or a disk can serve as the arbitration resource.

GPFS Cluster Quorum Mechanism

Data integrity is ensured by safety mechanisms and an availability‑arbitration system that offers three quorum types: the built‑in file system descriptor quorum (non‑configurable), Node Quorum, and Tiebreaker Disk Quorum. Only one of the latter two can be selected, based on the environment and a reliability analysis.

Node Quorum Mechanism

Multiple nodes are designated as quorum nodes; the cluster is considered healthy while a majority of them are online, otherwise the file system is shut down. Up to 128 quorum nodes are supported; following the 2N+1 rule, a cluster with 2N+1 quorum nodes tolerates N quorum‑node failures.
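The majority rule can be sketched in a few lines of Python; this is purely illustrative, and the function name is ours, not part of GPFS:

```python
def node_quorum_ok(quorum_nodes_total: int, quorum_nodes_online: int) -> bool:
    """Cluster stays up while a strict majority of quorum nodes is online."""
    return quorum_nodes_online > quorum_nodes_total // 2

# With 2N+1 quorum nodes, up to N failures are tolerated:
assert node_quorum_ok(5, 3)      # 5 quorum nodes, 2 failed -> still healthy
assert not node_quorum_ok(5, 2)  # 3 failed -> majority lost, shut down
```

This is why an odd number of quorum nodes is the natural choice: adding one more node to an even-sized set raises the failure tolerance, while adding a node to an odd-sized set does not.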

Tiebreaker Disk Mechanism

Specific disks are monitored as tiebreaker resources; if a majority of them remain online, the cluster stays operational, otherwise it is shut down. A maximum of two tiebreaker disks can be configured.

For example, with five service nodes, two can be chosen as quorum hosts and 2N+1 disks as tiebreakers, allowing the system to tolerate N failed disks and one failed node. Node quorum is preferable for larger clusters, while disk quorum suits smaller ones.
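The tiebreaker condition can be sketched the same way; again the function name is ours and this is an illustration of the rule, not GPFS code:

```python
def tiebreaker_quorum_ok(quorum_nodes_online: int,
                         tiebreaker_disks_total: int,
                         tiebreaker_disks_online: int) -> bool:
    """At least one quorum node must be up AND able to see a majority
    of the tiebreaker disks (GPFS allows one to three of them)."""
    return (quorum_nodes_online >= 1
            and tiebreaker_disks_online > tiebreaker_disks_total // 2)

# Three tiebreaker disks (2N+1 with N=1): one failed disk is tolerated.
assert tiebreaker_quorum_ok(1, 3, 2)      # 1 node up, 2 of 3 disks -> healthy
assert not tiebreaker_quorum_ok(1, 3, 1)  # only 1 of 3 disks -> shut down
```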

GPFS Failure Group

A Failure Group consists of network‑shared disks that share the same physical path; groups can replicate data or logs to ensure that a single physical disk failure does not cause data loss.
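The placement idea behind failure groups can be sketched as follows: replicas are placed on disks in distinct failure groups, so no single physical path failure removes every copy. This is an illustrative sketch of the principle, not GPFS's actual allocator:

```python
def place_replicas(failure_group_of: dict[str, int], copies: int) -> list[str]:
    """Pick one disk from each distinct failure group, so that a single
    physical-path failure cannot take out every replica."""
    chosen: list[str] = []
    groups_used: set[int] = set()
    for disk, group in failure_group_of.items():
        if group not in groups_used:
            chosen.append(disk)
            groups_used.add(group)
        if len(chosen) == copies:
            break
    return chosen

# nsd1/nsd2 share one physical path (group 1), nsd3/nsd4 another (group 2).
disks = {"nsd1": 1, "nsd2": 1, "nsd3": 2, "nsd4": 2}
replicas = place_replicas(disks, copies=2)  # e.g. ["nsd1", "nsd3"]
```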

GPFS File System Scalability

GPFS clusters support online addition or removal of nodes without impacting running tasks, and the file system itself can be expanded or shrunk online without affecting concurrent workloads.

GPFS File System Load Balancing

The design aims to distribute I/O load evenly across service nodes. Each NSD disk can be configured with an ordered list of service nodes; the order determines which node handles I/O for that NSD, thereby balancing the load.

For instance, NSD1 and NSD2 may be ordered as "node1, node2, node3, node4", while NSD3 and NSD4 may be ordered as "node4, node3, node2, node1"; client I/O is directed to the first available node in each NSD's list, achieving balanced traffic.

Content sourced from the “AIX Expert Club” public account, with modifications.


Tags: scalability, high availability, storage architecture, parallel file system, GPFS
Written by

Architects' Tech Alliance

Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
