
Understanding PHP Array Internals: Buckets, Hash Tables, and Packed Arrays

This article explains the internal implementation of PHP arrays, detailing the _Bucket and _zend_array structures, hash table mechanics, packed versus hash arrays, and how PHP maintains element order through an auxiliary index table.

php中文网 Courses

In the Zend Engine, PHP arrays are implemented by two C structs, _Bucket and _zend_array . Elements are located through a hash function, which gives O(1) lookup on average.

In memory, an index array of integers sits immediately before the bucket array, so its slots are addressed with negative offsets from the first bucket. Misreading this index array is a common pitfall.
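The layout can be sketched as a single allocation in which the index slots come first and the element pointer is set just past them. This is a simplified illustration, not the Zend source: Elem, Table, EMPTY_SLOT, index_slot, table_alloc and table_free are hypothetical names, and the real bucket is larger than the two-field Elem used here.

```c
#include <stdint.h>
#include <stdlib.h>

#define EMPTY_SLOT UINT32_MAX   /* hypothetical "no element" marker */

/* Simplified stand-in for a bucket. */
typedef struct {
    uint32_t h;
    int32_t  value;
} Elem;

typedef struct {
    Elem    *data;       /* first element; index slots sit just before it */
    uint32_t hash_size;  /* number of index slots */
} Table;

/* Returns the index slot at a negative offset in [-hash_size, -1]. */
uint32_t *index_slot(const Table *t, int32_t offset) {
    return (uint32_t *)t->data + offset;
}

/* One block: [index slots][elements]. Returns 0 on success, -1 on failure. */
int table_alloc(Table *t, uint32_t hash_size, uint32_t capacity) {
    size_t index_bytes = (size_t)hash_size * sizeof(uint32_t);
    char *block = malloc(index_bytes + (size_t)capacity * sizeof(Elem));
    if (block == NULL) {
        return -1;
    }
    t->data = (Elem *)(block + index_bytes);
    t->hash_size = hash_size;
    for (int32_t i = 1; i <= (int32_t)hash_size; i++) {
        *index_slot(t, -i) = EMPTY_SLOT;   /* every slot starts empty */
    }
    return 0;
}

/* Frees the block by stepping back to its true start. */
void table_free(Table *t) {
    free((uint32_t *)t->data - t->hash_size);
    t->data = NULL;
}
```

Keeping both parts in one block means a single pointer reaches the elements (non-negative offsets) and the index slots (negative offsets).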

Consider the example array $c = array('x'=>1,'y'=>2,'z'=>3,'a'=>0); .

[Figure: bucket layout of $c]

When the array is a packed_array , the index array stays at its minimum size of 2 slots and is not consulted during lookup: each bucket's key pointer is NULL , and h is simply the element's integer index, which is also its position in the bucket array.

[Figure: bucket structure of the packed array $a = array(1,2,3) ]

The Bucket struct is defined as:

<code>typedef struct _Bucket {
    zval              val;   // the element's value (a zval is 16 bytes)
    zend_ulong        h;     // hash of the string key, or the integer key itself
    zend_string      *key;   // the string key; NULL for integer keys
} Bucket;</code>

For packed arrays, h equals the element's integer key, which is also its position in the bucket array, so no hash computation is needed.

Next, consider a sparse packed array, $b = array(1=>'a',3=>'b',5=>'c'); .

[Figure: internal layout of $b]

Because the array has no element at index 0 (nor at 2 or 4), those slots are marked invalid: their zval type is IS_UNDEF . The value 'a' is stored as a zval that points to a zend_string , whose header carries reference-counting metadata.
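The packed lookup described above can be sketched as a direct array access followed by a check for an invalid slot. The types below are simplified stand-ins rather than the Zend definitions: Slot, SlotType, PackedArray and packed_get are hypothetical names, and SLOT_UNDEF plays the role of IS_UNDEF .

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a zval's type tag. */
typedef enum { SLOT_UNDEF = 0, SLOT_STRING } SlotType;

typedef struct {
    SlotType    type;
    const char *str;   /* valid only when type == SLOT_STRING */
} Slot;

typedef struct {
    Slot    *slots;
    uint32_t used;     /* slots in use, holes included */
} PackedArray;

/* The integer key is the position: no hashing, no index array.
 * Returns NULL for out-of-range keys and for holes. */
const char *packed_get(const PackedArray *a, uint32_t idx) {
    if (idx >= a->used) {
        return NULL;
    }
    const Slot *s = &a->slots[idx];
    return s->type == SLOT_UNDEF ? NULL : s->str;
}
```

For an array shaped like $b , slots 0, 2 and 4 would be tagged SLOT_UNDEF and slots 1, 3 and 5 would hold 'a', 'b' and 'c'.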

Returning to the original example $c = array('x'=>1,'y'=>2,'z'=>3,'a'=>0); :

[Figure: bucket structure of $c]

The large h value in each bucket is computed from the string key with the time33 hash algorithm (also known as DJBX33A); this value is the basis of the hash table.
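A minimal sketch of time33 follows: start from 5381, then multiply by 33 and add each byte of the key. The function name time33 and its signature are chosen for this illustration; the engine's own routine is more heavily optimized and may post-process the result, which this sketch does not attempt to reproduce.

```c
#include <stddef.h>
#include <stdint.h>

/* time33 (DJBX33A): h = h * 33 + byte, seeded with 5381.
 * Unsigned arithmetic wraps on overflow, which is intended. */
uint64_t time33(const char *key, size_t len) {
    uint64_t h = 5381;
    for (size_t i = 0; i < len; i++) {
        h = h * 33 + (unsigned char)key[i];
    }
    return h;
}
```

For the one-character key 'x' (byte value 120) this gives 5381 * 33 + 120 = 177693.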

A hash table consists of an element array and a hash function that maps each key to a slot; the simplest mapping takes the hash code modulo the table size (e.g., size 8).
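When the table size is a power of two, the modulo can be replaced by a bitwise operation. The sketch below ORs the hash with a negative mask, which yields a negative offset suited to an index array stored before the elements. The name slot_offset is hypothetical, and the exact mask the engine uses for a given table size may differ between PHP versions.

```c
#include <stdint.h>

/* table_size must be a power of two.
 * Returns an offset in [-table_size, -1], equal to (h % table_size) - table_size.
 * The final cast assumes two's-complement conversion, as on mainstream compilers. */
int32_t slot_offset(uint32_t h, uint32_t table_size) {
    uint32_t mask = (uint32_t)0 - table_size;   /* 0xFFFFFFF8 for size 8 */
    return (int32_t)(h | mask);
}
```

With a table size of 8, a hash of 10 maps to offset -6, the same slot as 10 % 8 = 2 counted from the start of the index array.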

Because direct hashing would produce unordered positions, PHP adds an auxiliary index table (itself an array of integers) that records the position of each element in the element array, preserving insertion order.

When a hash collision occurs (different keys mapping to the same slot), PHP resolves it by chaining: the colliding entries are linked into a list.

Normally val.u2.next holds the invalid-index marker ( -1 ); when a collision occurs, the new entry's val.u2.next stores the bucket-array position of the entry that previously headed the slot's chain.
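The index table and the chaining rule can be combined in a small sketch. It is a toy fixed-size table, not the Zend implementation: OrderedTable, Entry, NO_ENTRY, table_init, table_add and table_find are hypothetical names, the table never grows, the hash is a 32-bit time33, and duplicate keys are not checked.

```c
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 8u           /* assumed power of two */
#define NO_ENTRY   UINT32_MAX   /* empty slot / end of chain */

typedef struct {
    const char *key;    /* borrowed pointer, not copied */
    int         value;
    uint32_t    h;      /* cached hash of key */
    uint32_t    next;   /* previous head of this slot's chain, or NO_ENTRY */
} Entry;

typedef struct {
    uint32_t index[TABLE_SIZE];    /* slot -> position in entries[] */
    Entry    entries[TABLE_SIZE];  /* elements kept in insertion order */
    uint32_t used;
} OrderedTable;

static uint32_t hash_key(const char *s) {
    uint32_t h = 5381u;
    while (*s) {
        h = h * 33u + (unsigned char)*s++;
    }
    return h;
}

void table_init(OrderedTable *t) {
    for (uint32_t i = 0; i < TABLE_SIZE; i++) {
        t->index[i] = NO_ENTRY;
    }
    t->used = 0;
}

/* Appends a new key at the end of entries[]; returns -1 when full. */
int table_add(OrderedTable *t, const char *key, int value) {
    if (t->used == TABLE_SIZE) {
        return -1;
    }
    uint32_t h = hash_key(key);
    uint32_t slot = h & (TABLE_SIZE - 1u);
    Entry *e = &t->entries[t->used];
    e->key = key;
    e->value = value;
    e->h = h;
    e->next = t->index[slot];      /* link to the old chain head, if any */
    t->index[slot] = t->used++;    /* the new entry becomes the head */
    return 0;
}

/* Walks the slot's chain, comparing the cached hash before the key. */
const Entry *table_find(const OrderedTable *t, const char *key) {
    uint32_t h = hash_key(key);
    uint32_t i = t->index[h & (TABLE_SIZE - 1u)];
    while (i != NO_ENTRY) {
        const Entry *e = &t->entries[i];
        if (e->h == h && strcmp(e->key, key) == 0) {
            return e;
        }
        i = e->next;
    }
    return NULL;
}
```

Inserting 'x', 'y', 'z', 'a' in that order leaves entries[] in exactly that order, so iterating it preserves insertion order. In this toy table 'y' and 'a' happen to share a slot, so the entry for 'a' (position 3) records position 1 in next , while the other three entries keep NO_ENTRY .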

Related articles: PHP8底层内核源码-数组(一) (PHP 8 Core Source Code: Arrays, Part 1); PHP8底层内核源码-数组(二) (PHP 8 Core Source Code: Arrays, Part 2)