
Understanding NFS File Handles, Export Operations, and Their Use in OverlayFS and Syscalls

This article explains how NFS represents files with file handles, details the generation and decoding mechanisms via export_operations, examines the exportfs API implementation, and explores practical applications such as overlayfs integration and the name_to_handle_at/open_by_handle_at syscalls, providing code examples and kernel‑level insights.

Coolpad Technology Team

May 31, 2022

1. What does NFS use to represent a file?

NFS clients tell the server which file to operate on by sending a file handle (fh) instead of a path. This avoids two problems: a path can stop referring to the same object, and a path can be long (up to PATH_MAX bytes).

NFS exported directories may be renamed or replaced locally; a path‑based request would then point to the wrong object, while a file handle uniquely identifies the inode and its generation.

Linux defines PATH_MAX as 4096 bytes, so sending a full path with every request would be wasteful.

To solve these problems NFS introduces the file handle concept, which by default contains the inode number (ino) and inode generation (igeneration).

The inode number is unique within a filesystem but can be reused; the generation distinguishes a reused inode from the original.

This makes the file handle a fixed‑size, one‑to‑one identifier for a file.
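The role of the generation can be pictured with a small userspace model. This is only a sketch: simple_fh, simple_inode, make_fh, and fh_matches are hypothetical names invented for illustration and are not kernel types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the default handle payload (not a kernel type). */
struct simple_fh {
    uint32_t ino;        /* inode number, unique only while the inode lives */
    uint32_t generation; /* changes when the inode number is reused */
};

/* Hypothetical model of an inode's identity (not a kernel type). */
struct simple_inode {
    uint32_t ino;
    uint32_t generation;
};

/* Build a handle from the inode's current identity. */
static struct simple_fh make_fh(const struct simple_inode *inode)
{
    struct simple_fh fh;

    assert(inode != NULL);
    fh.ino = inode->ino;
    fh.generation = inode->generation;
    return fh;
}

/* A handle is still valid only if both number and generation match. */
static bool fh_matches(const struct simple_fh *fh,
                       const struct simple_inode *inode)
{
    assert(fh != NULL && inode != NULL);
    return fh->ino == inode->ino && fh->generation == inode->generation;
}
```

If inode 42 is deleted and its number is later reused with a new generation, fh_matches() rejects a handle taken before the deletion, which is the situation an NFS server reports as a stale handle.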

2. How is a file handle generated?

The local filesystem stores the inode number and generation; some filesystems add extra information (e.g., btrfs adds the object ID of the subvolume root).

Export operations (export_operations) implement the generation and parsing of file handles; the next sections describe the relevant interfaces.

Convenient APIs that wrap export_operations are provided in fs/exportfs/expfs.c.

3. What capabilities does export_operations provide?

To support NFS daemons, a filesystem must implement its own export_operations. The relevant interfaces are:

3.1 encode_fh

int (*encode_fh)(struct inode *inode, __u32 *fh,
    int *max_len, struct inode *parent);

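encode_fh stores inode and parent information in the fid and returns a fid_type describing the encoding. max_len gives the buffer size in 4-byte words on entry and is updated to the length actually used; if the buffer is too small, the function returns FILEID_INVALID (255).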

Filesystems can implement their own encode_fh or use the helper export_encode_fh(), which stores inode number and generation (and optionally parent info) and returns FILEID_INO32_GEN_PARENT or FILEID_INO32_GEN.

Each encoding type is declared in enum fid_type, e.g., btrfs defines FILEID_BTRFS_WITH_PARENT_ROOT = 0x4f.
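The behaviour described for the default helper can be approximated in userspace as follows. This is a simplified model, not the kernel source: the model_* names are hypothetical, and only the numeric values (1, 2, 0xff) are meant to mirror the kernel's FILEID_INO32_GEN, FILEID_INO32_GEN_PARENT, and FILEID_INVALID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values mirror the kernel's enum fid_type; the names are hypothetical. */
enum model_fid_type {
    MODEL_FILEID_INO32_GEN        = 1,
    MODEL_FILEID_INO32_GEN_PARENT = 2,
    MODEL_FILEID_INVALID          = 0xff,
};

struct model_inode {
    uint32_t ino;
    uint32_t generation;
};

struct model_fid {
    uint32_t ino, gen;               /* always filled in */
    uint32_t parent_ino, parent_gen; /* filled in only when a parent is given */
};

/*
 * Encode inode (and optionally parent) into fid.
 * *max_len is the buffer size in 32-bit words on entry and the number
 * of words used (or required, on failure) on return.
 */
static int model_encode_fh(const struct model_inode *inode,
                           struct model_fid *fid, int *max_len,
                           const struct model_inode *parent)
{
    int needed = parent ? 4 : 2;

    assert(inode != NULL && fid != NULL && max_len != NULL);
    if (*max_len < needed) {
        *max_len = needed; /* tell the caller how much space is required */
        return MODEL_FILEID_INVALID;
    }
    fid->ino = inode->ino;
    fid->gen = inode->generation;
    if (parent) {
        fid->parent_ino = parent->ino;
        fid->parent_gen = parent->generation;
    }
    *max_len = needed;
    return parent ? MODEL_FILEID_INO32_GEN_PARENT : MODEL_FILEID_INO32_GEN;
}
```

The returned type travels with the handle, so the decoder knows whether the parent fields are present.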

3.2 fh_to_dentry / fh_to_parent

struct dentry *(*fh_to_dentry)(struct super_block *sb, struct fid *fid,
    int fh_len, int fh_type);

struct dentry *(*fh_to_parent)(struct super_block *sb, struct fid *fid,
    int fh_len, int fh_type);

fh_to_dentry resolves a fid to a dentry; fh_to_parent resolves the stored parent information. The decode path uses these functions together with get_name/get_parent.

3.3 get_name / get_parent

int (*get_name)(struct dentry *parent, char *name, struct dentry *child);

struct dentry *(*get_parent)(struct dentry *child);

get_name retrieves the child name from its parent; get_parent returns the parent dentry of a child. If a filesystem does not provide get_name, exportfs walks the parent directory to find the matching inode.
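The fallback of scanning the parent directory for a matching inode can be imitated from userspace. The sketch below is an analogue built on opendir/readdir/fstatat, not the kernel implementation; find_name_by_ino is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Scan dir_path for an entry whose inode number equals target_ino and
 * copy its name into name. Returns 0 on success, -1 if nothing matches
 * or the directory cannot be read.
 */
static int find_name_by_ino(const char *dir_path, ino_t target_ino,
                            char *name, size_t name_len)
{
    DIR *dir;
    struct dirent *de;
    struct stat st;
    int ret = -1;

    assert(dir_path != NULL && name != NULL && name_len > 0);
    dir = opendir(dir_path);
    if (!dir)
        return -1;
    while ((de = readdir(dir)) != NULL) {
        /* "." and ".." never name the child we are looking for. */
        if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, ".."))
            continue;
        /* Look at the entry itself, without following symlinks. */
        if (fstatat(dirfd(dir), de->d_name, &st, AT_SYMLINK_NOFOLLOW) != 0)
            continue;
        if (st.st_ino == target_ino && strlen(de->d_name) < name_len) {
            snprintf(name, name_len, "%s", de->d_name);
            ret = 0;
            break;
        }
    }
    closedir(dir);
    return ret;
}
```

The cost is linear in the number of directory entries, which is one reason a filesystem may prefer to supply its own get_name.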

3.4 Other operations

Operations such as commit_metadata and get_uuid are unrelated to file‑handle generation and are omitted here.

4. Exportfs API analysis

The exportfs layer in fs/exportfs/expfs.c wraps export_operations with simple encode/decode helpers.

4.1 Encode part

extern int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid,
    int *max_len, struct inode *parent);

extern int exportfs_encode_fh(struct dentry *dentry, struct fid *fid,
    int *max_len, int connectable);

exportfs_encode_inode_fh encodes an inode (and optionally its parent) using either the filesystem's encode_fh or the default helper. exportfs_encode_fh works with a dentry and can include parent information when connectable is non-zero (used by NFS subtree checking).

4.2 Decode part

extern struct dentry *exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid,
    int fh_len, int fileid_type,
    int (*acceptable)(void *, struct dentry *), void *context);

extern struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid,
    int fh_len, int fileid_type,
    int (*acceptable)(void *, struct dentry *), void *context);

exportfs_decode_fh_raw returns detailed error codes; exportfs_decode_fh wraps it and maps every failure other than -ENOMEM to -ESTALE, which is simpler for callers.
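The difference between the two entry points comes down to an error policy. A minimal model of that policy, assuming the wrapper keeps -ENOMEM and collapses every other failure to -ESTALE, might look like this; map_decode_error is a hypothetical name and operates on plain errno values rather than dentry pointers.

```c
#include <assert.h>
#include <errno.h>

/*
 * Model of the wrapper's error policy: raw_err is 0 on success or a
 * negative errno from the raw decoder. Out-of-memory is passed through
 * because it is transient; everything else is reported as a stale handle.
 */
static int map_decode_error(int raw_err)
{
    assert(raw_err <= 0);
    if (raw_err == 0 || raw_err == -ENOMEM)
        return raw_err;
    return -ESTALE;
}
```

A caller that needs to tell a permission failure from a genuinely stale handle would use the raw variant instead.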

4.2.1 Concept: disconnected dentry

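A dentry obtained from a file handle rather than from a path lookup may be "disconnected", that is, not yet linked into the dentry tree below the filesystem root. Such dentries carry the DCACHE_DISCONNECTED flag and are freed when their refcount drops to zero.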

4.2.2 Concept: dentry alias

Hard‑linked files share the same inode but have multiple dentries linked via i_dentry.
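The userspace view of aliases is simply two names that report the same inode. The sketch below creates a file, hard-links it, and compares the stat results; same_inode_after_link is a hypothetical helper, and the example assumes the target directory is writable and supports hard links.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create path_a, hard-link it as path_b, and check whether both names
 * resolve to the same inode. Returns 1 if they do, 0 if they do not,
 * and -1 on error. Both names are removed again before returning.
 */
static int same_inode_after_link(const char *path_a, const char *path_b)
{
    struct stat sa, sb;
    int fd, ret = -1;

    assert(path_a != NULL && path_b != NULL);
    fd = open(path_a, O_CREAT | O_WRONLY | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    close(fd);
    unlink(path_b); /* ignore failure: the link name may not exist yet */
    if (link(path_a, path_b) == 0 &&
        stat(path_a, &sa) == 0 && stat(path_b, &sb) == 0)
        ret = (sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino);
    unlink(path_a);
    unlink(path_b);
    return ret;
}
```

Because a file handle encodes the inode and not a name, decoding it cannot know which of these names the client originally used; this is why the decode path searches the aliases for an acceptable one.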

4.2.3 Function: exportfs_decode_fh_raw

struct dentry *exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid,
    int fh_len, int fileid_type,
    int (*acceptable)(void *, struct dentry *), void *context)
{
    const struct export_operations *nop = mnt->mnt_sb->s_export_op;
    struct dentry *result, *alias;
    char nbuf[NAME_MAX+1];
    int err;
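    /* A filesystem without export support cannot decode handles. */
    if (!nop || !nop->fh_to_dentry)
        return ERR_PTR(-ESTALE);
    /* Get any dentry for the handle from the filesystem (often disconnected). */
    result = nop->fh_to_dentry(mnt->mnt_sb, fid, fh_len, fileid_type);
    if (IS_ERR_OR_NULL(result))
        return result;
    /* No acceptance check requested: a disconnected dentry is good enough. */
    if (!acceptable)
        return result;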
    if (d_is_dir(result)) {
        if (result->d_flags & DCACHE_DISCONNECTED) {
            err = reconnect_path(mnt, result, nbuf);
            if (err)
                goto err_result;
        }
        if (!acceptable(context, result)) {
            err = -EACCES;
            goto err_result;
        }
        return result;
    } else {
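        struct dentry *target_dir, *nresult;
        /* Accept the dentry itself or any alias of it that passes the check. */
        alias = find_acceptable_alias(result, acceptable, context);
        if (alias)
            return alias;
        /* Otherwise decode the parent from the handle; give up if that fails. */
        err = -ESTALE;
        if (!nop->fh_to_parent)
            goto err_result;
        target_dir = nop->fh_to_parent(mnt->mnt_sb, fid, fh_len, fileid_type);
        if (!target_dir)
            goto err_result;
        if (IS_ERR(target_dir)) {
            err = PTR_ERR(target_dir);
            goto err_result;
        }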
        err = reconnect_path(mnt, target_dir, nbuf);
        if (err) {
            dput(target_dir);
            goto err_result;
        }
        err = exportfs_get_name(mnt, target_dir, nbuf, result);
        if (err) {
            dput(target_dir);
            goto err_result;
        }
        inode_lock(target_dir->d_inode);
        nresult = lookup_one_len(nbuf, target_dir, strlen(nbuf));
        if (!IS_ERR(nresult) && nresult->d_inode != result->d_inode) {
            dput(nresult);
            nresult = ERR_PTR(-ESTALE);
        }
        inode_unlock(target_dir->d_inode);
        dput(target_dir);
        if (IS_ERR(nresult)) {
            err = PTR_ERR(nresult);
            goto err_result;
        }
        dput(result);
        result = nresult;
        alias = find_acceptable_alias(result, acceptable, context);
        if (!alias) {
            err = -EACCES;
            goto err_result;
        }
        return alias;
    }
err_result:
    dput(result);
    return ERR_PTR(err);
}
EXPORT_SYMBOL_GPL(exportfs_decode_fh_raw);

4.2.4 Function: reconnect_path

static int reconnect_path(struct vfsmount *mnt, struct dentry *target_dir, char *nbuf)
{
    struct dentry *dentry, *parent;
    dentry = dget(target_dir);
    while (dentry->d_flags & DCACHE_DISCONNECTED) {
        BUG_ON(dentry == mnt->mnt_sb->s_root);
        if (IS_ROOT(dentry))
            parent = reconnect_one(mnt, dentry, nbuf);
        else
            parent = dget_parent(dentry);
        if (!parent)
            break;
        dput(dentry);
        if (IS_ERR(parent))
            return PTR_ERR(parent);
        dentry = parent;
    }
    dput(dentry);
    clear_disconnected(target_dir);
    return 0;
}

4.2.5 Function: reconnect_one

static struct dentry *reconnect_one(struct vfsmount *mnt,
        struct dentry *dentry, char *nbuf)
{
    struct dentry *parent;
    struct dentry *tmp;
    int err;
    parent = ERR_PTR(-EACCES);
    inode_lock(dentry->d_inode);
    if (mnt->mnt_sb->s_export_op->get_parent)
        parent = mnt->mnt_sb->s_export_op->get_parent(dentry);
    inode_unlock(dentry->d_inode);
    if (IS_ERR(parent)) {
        dprintk("%s: get_parent of %ld failed, err %d\n",
            __func__, dentry->d_inode->i_ino, PTR_ERR(parent));
        return parent;
    }
    err = exportfs_get_name(mnt, parent, nbuf, dentry);
    if (err == -ENOENT)
        goto out_reconnected;
    if (err)
        goto out_err;
    tmp = lookup_one_len_unlocked(nbuf, parent, strlen(nbuf));
    if (IS_ERR(tmp)) {
        err = PTR_ERR(tmp);
        goto out_err;
    }
    if (tmp != dentry) {
        /* Renamed since exportfs_get_name(); a rename implies it was looked up and connected. */
        dput(tmp);
        goto out_reconnected;
    }
    dput(tmp);
    if (IS_ROOT(dentry)) {
        err = -ESTALE;
        goto out_err;
    }
    return parent;
out_err:
    dput(parent);
    return ERR_PTR(err);
out_reconnected:
    dput(parent);
    if (!dentry_connected(dentry))
        return ERR_PTR(-ESTALE);
    return NULL;
}

5. Other use cases of file handles

5.1 overlayfs

Overlayfs is a union filesystem that merges lower (read‑only) and upper (read‑write) layers. File handles are used to associate files across layers and to resolve hard‑link breakage.

When a lower‑layer file is modified, it is copied up to the upper layer, creating a new inode; the original file handle (including the lower‑layer UUID) is stored so that other hard‑linked files can locate the correct upper‑layer copy.

Overlayfs also maintains an index directory in a workdir, storing file‑handle‑derived filenames for fast lookup of upper‑layer copies.
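Deriving a filename from a handle amounts to turning its bytes into printable characters. The sketch below hex-encodes a byte buffer, which is the general idea behind the index entry names; fh_to_index_name is a hypothetical helper, and the exact byte layout overlayfs encodes (header fields, UUID, and so on) is not modeled here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hex-encode len bytes of a file handle into out as a NUL-terminated
 * name. out must hold at least 2 * len + 1 bytes. Returns the length
 * of the resulting name.
 */
static size_t fh_to_index_name(const unsigned char *fh, size_t len, char *out)
{
    static const char digits[] = "0123456789abcdef";
    size_t i;

    assert(fh != NULL && out != NULL);
    for (i = 0; i < len; i++) {
        out[2 * i]     = digits[fh[i] >> 4];   /* high nibble */
        out[2 * i + 1] = digits[fh[i] & 0x0f]; /* low nibble */
    }
    out[2 * len] = '\0';
    return 2 * len;
}
```

Since the name is a pure function of the handle, looking up the index entry for a given lower file is an ordinary directory lookup by name.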

5.2 syscalls & ioctl

The kernel file fs/fhandle.c provides two syscalls: name_to_handle_at, which returns a file handle (struct file_handle) for a path, and open_by_handle_at, which opens a file given such a handle. Together they split the work of openat into two steps, so a userspace file server can give a handle to a client and open the file later on demand.
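From userspace, obtaining a handle could look roughly like the sketch below. It uses the glibc wrapper for name_to_handle_at and allocates the maximum handle size up front; get_handle is a hypothetical wrapper name, and the call can fail (for example with EOPNOTSUPP) on filesystems that do not support exporting handles.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>

/*
 * Ask the kernel for the file handle of path. On success returns 0,
 * stores a malloc'ed handle in *out (the caller frees it) and the mount
 * ID in *mount_id. On failure returns a negative errno value.
 */
static int get_handle(const char *path, struct file_handle **out,
                      int *mount_id)
{
    struct file_handle *fh;

    assert(path != NULL && out != NULL && mount_id != NULL);
    fh = malloc(sizeof(*fh) + MAX_HANDLE_SZ);
    if (!fh)
        return -ENOMEM;
    fh->handle_bytes = MAX_HANDLE_SZ; /* size of the f_handle buffer */
    if (name_to_handle_at(AT_FDCWD, path, fh, mount_id, 0) < 0) {
        int err = errno;

        free(fh);
        return -err;
    }
    *out = fh;
    return 0;
}
```

The resulting handle can later be passed to open_by_handle_at together with a file descriptor for any object on the same mounted filesystem. That call requires the CAP_DAC_READ_SEARCH capability, so it typically runs only in privileged server processes.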

XFS uses the same idea for backup, restore, and hierarchical storage management, exposing handle-based ioctl commands such as XFS_IOC_FD_TO_HANDLE_32, XFS_IOC_PATH_TO_HANDLE_32, and XFS_IOC_OPEN_BY_HANDLE_32 (the _32 suffix marks the 32-bit compat variants).
