Mach-O File Format: Dynamic and Static Library Attribution and API Scanning Solutions

This article introduces the Mach-O executable format, explains how its structure can be leveraged to attribute dynamic and static libraries at runtime and during build, and presents two practical projects—library attribution and fast API scanning—complete with implementation details and code snippets.

HomeTech

Mach-O (Mach Object) is the executable file format used on macOS, iOS and iPadOS, analogous to Windows PE. It supports CPU architectures such as x86_64, armv7 and arm64. Understanding its structure and loading process helps developers analyze app startup, function hooking, lazy loading of dynamic libraries, and other runtime behaviors.
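As a rough illustration of what "understanding its structure" starts with, the sketch below reads the fixed-size header at the start of a thin 64-bit Mach-O image and reports its architecture. The names prefixed with demo_ are hypothetical; the struct mirrors the field layout of struct mach_header_64 and the constants are the values from <mach-o/loader.h> and <mach/machine.h>, redeclared here so the snippet compiles without Apple headers. It assumes a little-endian host and does not handle universal (fat) binaries.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Values as defined in <mach-o/loader.h> and <mach/machine.h>. */
#define DEMO_MH_MAGIC_64      0xfeedfacfu
#define DEMO_CPU_TYPE_X86_64  0x01000007
#define DEMO_CPU_TYPE_ARM64   0x0100000c

/* Same field layout as struct mach_header_64 (32 bytes). */
typedef struct {
    uint32_t magic;
    int32_t  cputype;
    int32_t  cpusubtype;
    uint32_t filetype;
    uint32_t ncmds;       /* number of load commands that follow */
    uint32_t sizeofcmds;  /* total size of those load commands */
    uint32_t flags;
    uint32_t reserved;
} demo_mach_header_64;

/* Returns an architecture label, or NULL if the buffer does not start
   with a thin 64-bit Mach-O header (assumes a little-endian host). */
const char *demo_macho_arch(const uint8_t *data, size_t len) {
    demo_mach_header_64 h;
    if (len < sizeof(h)) return NULL;
    memcpy(&h, data, sizeof(h));
    if (h.magic != DEMO_MH_MAGIC_64) return NULL;
    switch (h.cputype) {
        case DEMO_CPU_TYPE_X86_64: return "x86_64";
        case DEMO_CPU_TYPE_ARM64:  return "arm64";
        default:                   return "unknown";
    }
}
```

The load commands counted by ncmds begin immediately after this header, which is why every parser in the projects below starts by reading it.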

The article outlines several typical use cases: crash symbolication, API hooking, lazy loading, app launch optimization, binary size reduction, and method-call-chain analysis. It then focuses on two concrete practice projects.

Practice Project 1 – Dynamic and Static Library Attribution

Background: modern apps contain many dynamic and static libraries, and performance data collected from the app must be routed to the responsible developers. By parsing Mach‑O structures and runtime information, a lightweight attribution scheme is built.

Dynamic library attribution uses the object's isa pointer to locate its class, then finds the owning binary with NSBundle *bundle = [NSBundle bundleForClass:objClass]; . For stack traces, each address is resolved with dladdr: Dl_info info; dladdr((void *)address, &info); NSString *dliFname = [NSString stringWithUTF8String:info.dli_fname]; . The dli_fname field holds the path of the image that contains the address, which identifies the dynamic library.

Static libraries are linked into the main executable, so the image path alone cannot tell them apart. Static library attribution traditionally relies on dSYM symbol files, which incurs storage cost and requires three-way interaction. The proposed solution instead works at build time: it parses the Mach-O header, load commands, and the relevant segment and section (the __TEXT segment and the __objc_classlist section) and combines them with the Link Map to build a class-address map and a text-address map. At runtime, the isa pointer's offset (adjusted for ASLR) is matched against the class map, and stack-address offsets are matched against the text map, so attribution takes less than 1 ms.
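The runtime half of this scheme can be sketched as a range lookup. The following is a minimal illustration, not the article's implementation: demo_range, demo_attribute, and the map layout (sorted, non-overlapping offset ranges, each tagged with a library name) are assumptions made for the example. Subtracting the image's runtime load address turns a pointer into an offset that does not depend on the ASLR slide, so it can be compared with offsets computed at build time.

```c
#include <stdint.h>
#include <stddef.h>

/* One entry of a hypothetical build-time map: an offset range inside the
   main executable and the static library that owns it. */
typedef struct {
    uint64_t    start;    /* inclusive offset from the image base */
    uint64_t    end;      /* exclusive offset from the image base */
    const char *library;
} demo_range;

/* Binary search over entries sorted by start and non-overlapping.
   Returns the owning library name, or NULL if no range contains the address. */
const char *demo_attribute(const demo_range *map, size_t count,
                           uint64_t runtime_addr, uint64_t image_base) {
    if (runtime_addr < image_base) return NULL;
    uint64_t offset = runtime_addr - image_base;   /* removes the ASLR slide */
    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (offset < map[mid].start)      hi = mid;
        else if (offset >= map[mid].end)  lo = mid + 1;
        else                              return map[mid].library;
    }
    return NULL;
}
```

The same function would serve both maps under these assumptions: pass a class pointer with the class map, or a stack address with the text map. A logarithmic lookup over a small in-memory table is consistent with the sub-millisecond figure quoted above, though the article does not say which search strategy it uses.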

Key code snippets:

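    // Dynamic library: owning bundle of a class
    NSBundle *bundle = [NSBundle bundleForClass:objClass];

    // Dynamic library: image path for a stack address (dladdr returns 0 on failure)
    Dl_info info = {0};
    NSString *dliFname = nil;
    if (dladdr((void *)callStackAddress, &info) != 0 && info.dli_fname != NULL) {
        dliFname = [NSString stringWithUTF8String:info.dli_fname];
    }

    // Static library: read the 64-bit Mach-O header, then walk its load commands
    struct mach_header_64 mhHeader;
    [fileData getBytes:&mhHeader range:NSMakeRange(0, sizeof(mhHeader))];
    for (uint32_t i = 0; i < mhHeader.ncmds; i++) { /* parse load commands */ }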
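The load-command loop left as a comment above could look roughly like the following. This is a sketch under stated assumptions rather than the article's code: demo_find_segment and demo_segment_prefix are hypothetical names, the struct reproduces only the leading fields of struct segment_command_64, the constant is the value of LC_SEGMENT_64 from <mach-o/loader.h>, and a little-endian host reading a thin 64-bit image is assumed.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define DEMO_LC_SEGMENT_64 0x19u  /* LC_SEGMENT_64 in <mach-o/loader.h> */
#define DEMO_HEADER_SIZE   32u    /* sizeof(struct mach_header_64) */

/* Leading fields of struct segment_command_64. */
typedef struct {
    uint32_t cmd;
    uint32_t cmdsize;
    char     segname[16];
    uint64_t vmaddr;
    uint64_t vmsize;
    uint64_t fileoff;
    uint64_t filesize;
} demo_segment_prefix;

/* Walks the load commands and reports the address range of the named
   segment. Returns 1 on success, 0 if not found or the data is malformed. */
int demo_find_segment(const uint8_t *data, size_t len, const char *name,
                      uint64_t *vmaddr, uint64_t *vmsize) {
    if (len < DEMO_HEADER_SIZE) return 0;
    uint32_t ncmds;
    memcpy(&ncmds, data + 16, sizeof(ncmds));      /* mach_header_64.ncmds */
    size_t off = DEMO_HEADER_SIZE;
    for (uint32_t i = 0; i < ncmds; i++) {
        uint32_t cmd, cmdsize;
        if (len - off < 8) return 0;
        memcpy(&cmd, data + off, 4);
        memcpy(&cmdsize, data + off + 4, 4);
        if (cmdsize < 8 || cmdsize > len - off) return 0;  /* malformed */
        if (cmd == DEMO_LC_SEGMENT_64 && cmdsize >= sizeof(demo_segment_prefix)) {
            demo_segment_prefix seg;
            memcpy(&seg, data + off, sizeof(seg));
            if (strncmp(seg.segname, name, sizeof(seg.segname)) == 0) {
                *vmaddr = seg.vmaddr;
                *vmsize = seg.vmsize;
                return 1;
            }
        }
        off += cmdsize;   /* every load command states its own size */
    }
    return 0;
}
```

Calling demo_find_segment(bytes, length, "__TEXT", &addr, &size) would give the range against which text-map offsets are computed. Locating __objc_classlist needs one more step not shown here: iterating the section records that follow each segment command.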

The generated ClassMap and TextMap are embedded into the IPA (<1 KB) with negligible impact on app size.

Practice Project 2 – Fast API Scanning Based on Mach‑O

Background: traditional syntax‑tree scanning cannot handle black‑box SDKs and is slow (minutes for tens of thousands of lines). The new approach scans the Mach‑O binary directly.

The process parses __objc_classrefs to collect external class references, __objc_selrefs for method references, and disassembles code sections using Capstone to locate bl (branch‑with‑link) instructions. By comparing the target address of bl with the method address stored in method_addr , the tool identifies API call sites.
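To make the bl comparison concrete, the sketch below decodes the branch target by hand. The article uses Capstone for disassembly, and Capstone reports this operand itself; the manual decoding here is only meant to show the arithmetic. demo_decode_bl is a hypothetical helper name. In the A64 encoding, BL has the top six bits 100101 followed by a signed 26-bit word offset relative to the instruction's own address.

```c
#include <stdint.h>

/* Returns 1 and stores the call target if insn is an ARM64 BL
   (bits 31..26 == 100101); returns 0 for any other instruction. */
int demo_decode_bl(uint32_t insn, uint64_t pc, uint64_t *target) {
    if ((insn & 0xFC000000u) != 0x94000000u) return 0;
    int64_t imm26 = (int64_t)(insn & 0x03FFFFFFu);
    if (imm26 & 0x02000000) imm26 -= 0x04000000;  /* sign-extend 26 bits */
    *target = pc + (uint64_t)(imm26 * 4);         /* offset counts 4-byte words */
    return 1;
}
```

A scanner built this way would compare each decoded target with the method addresses it collected and record the instruction's pc as a call site on a match.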

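During scanning, the tool also handles address reconstruction for ARM64, where a full address is typically split across two instructions: an adrp loads the 4 KB page base (the high part) into a register such as x8, and a following add or ldr supplies the low 12-bit offset within that page. The tool combines the two to compute the full address before comparison.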
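The reconstruction step can be illustrated with an adrp followed by an add immediate. This is a hand-decoding sketch with hypothetical helper names; it covers only the unshifted 64-bit ADD immediate form, not ldr, and it assumes the two instructions are adjacent in the pair being examined. A Capstone-based scanner would read the same fields from the instruction's operand details instead.

```c
#include <stdint.h>

/* ADRP: bit 31 = 1 and bits 28..24 = 10000; the signed 21-bit page count
   is split into immlo (bits 30..29) and immhi (bits 23..5). */
int demo_decode_adrp(uint32_t insn, uint64_t pc, unsigned *rd, uint64_t *page) {
    if ((insn & 0x9F000000u) != 0x90000000u) return 0;
    int64_t imm = (int64_t)((((insn >> 5) & 0x7FFFFu) << 2) | ((insn >> 29) & 0x3u));
    if (imm & 0x100000) imm -= 0x200000;              /* sign-extend 21 bits */
    *rd = insn & 0x1Fu;
    *page = (pc & ~(uint64_t)0xFFF) + (uint64_t)(imm * 4096);
    return 1;
}

/* 64-bit ADD (immediate) without shift: ADD Xd, Xn, #imm12. */
int demo_decode_add_imm(uint32_t insn, unsigned *rn, uint64_t *imm12) {
    if ((insn & 0xFFC00000u) != 0x91000000u) return 0;
    *rn = (insn >> 5) & 0x1Fu;
    *imm12 = (insn >> 10) & 0xFFFu;
    return 1;
}

/* Combines the pair only when the ADD reads the register the ADRP wrote. */
int demo_resolve_pair(uint32_t adrp, uint32_t add, uint64_t adrp_pc, uint64_t *addr) {
    unsigned page_reg, rn;
    uint64_t page, low;
    if (!demo_decode_adrp(adrp, adrp_pc, &page_reg, &page)) return 0;
    if (!demo_decode_add_imm(add, &rn, &low)) return 0;
    if (rn != page_reg) return 0;
    *addr = page + low;
    return 1;
}
```

For example, 0xB0000008 (adrp x8, one page forward) at address 0x100004010 followed by 0x910AA108 (add x8, x8, #0x2a8) resolves to 0x1000052A8.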

Key code snippets:

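    // Read the 64-bit Mach-O header
    struct mach_header_64 mhHeader;
    [fileData getBytes:&mhHeader range:NSMakeRange(0, sizeof(mhHeader))];

    // Disassemble a code section with Capstone
    csh cs_handle = 0;
    cs_err err = cs_open(CS_ARCH_ARM64, CS_MODE_ARM, &cs_handle);
    if (err != CS_ERR_OK) { /* handle the error */ }
    cs_option(cs_handle, CS_OPT_DETAIL, CS_OPT_ON);  // enable operand details
    cs_insn *insn = NULL;
    size_t count = cs_disasm(cs_handle, (const uint8_t *)code, size, address, 0, &insn);
    /* inspect insn[0] .. insn[count - 1], then release: */
    cs_free(insn, count);
    cs_close(&cs_handle);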
    // Match a class's offset within the main executable against a library's range
    for (int i = 0; i < callStackCount; i++) {
        uintptr_t classInBundle = (uintptr_t)objClass - mainExecuteAddress;
        if (classInBundle >= start && classInBundle <= end) {
            return libraryName;
        }
    }

The scanning completes within 2 seconds for 30 k lines of compiled code, offering a 20× speed improvement over syntax‑tree methods and supporting Objective‑C, Swift and C APIs.

Conclusion: By leveraging Mach-O's structure, the presented solutions provide low-overhead, real-time library attribution and high-performance API scanning, applicable to any iOS/macOS binary without source code, and pave the way for further extensions such as block or constant attribution, API usage compliance checks, and security hardening.

Tags: iOS, Mach-O, Reverse Engineering, Binary Analysis, API scanning, static library attribution
Written by

HomeTech
