Databases 10 min read

Understanding MySQL Hash Index: Arrays, Dictionaries, Linked Lists, and Hash Tables

This article explains MySQL's hash index by introducing fundamental data structures such as arrays, dictionaries, linked lists, and hash tables, discussing their advantages and disadvantages, and showing how they are used in MySQL with practical code examples.

Aikesheng Open Source Community
Aikesheng Open Source Community
Aikesheng Open Source Community
Understanding MySQL Hash Index: Arrays, Dictionaries, Linked Lists, and Hash Tables

MySQL uses B+ trees as the default index structure, but it also supports hash indexes, which are built on hash tables; understanding hash tables is essential for grasping MySQL hash indexes.

1. Array – An array is a linear, sequential storage structure indexed by numeric positions. MySQL can store JSON arrays, as shown in the example:

mysql> select @a as "array", json_length(@a) as "array_size";
+-------------------------------------------+------------+
| array                                     | array_size |
+-------------------------------------------+------------+
| [10, 20, 30, 40, 50, 60, 70, 80, 90,100]| 10         |
+-------------------------------------------+------------+
1 row in set (0.00 sec)

Arrays can be multidimensional; a two‑dimensional JSON array example is provided.

mysql> select json_pretty(@a)\G
*************************** 1. row ***************************
json_pretty(@a): [
  [
    "mysql",
    "db2"
  ],
  [
    "oracle",
    "mongodb",
    "sql server",
    "redis"
  ],
  [
    "memcached",
    "dble",
    "postgresql",
    "mariadb"
  ]
]
1 row in set (0.01 sec)

Advantages of arrays: O(1) random read via index. Disadvantages: costly insert/delete due to shifting, and need for contiguous memory.

2. Dictionary – Similar to an array but uses arbitrary string keys. MySQL can store a JSON object as a dictionary:

mysql> set @a='{"10":"mysql","20":"db2","30":"oracle","40":"mongodb"}';
Query OK, 0 rows affected (0.00 sec)

mysql> select json_keys(@a);
+--------------------------+
| json_keys(@a)            |
+--------------------------+
| ["10","20","30","40"] |
+--------------------------+
1 row in set (0.00 sec)

mysql> set @x1=json_extract(@a,'$."10"');
Query OK, 0 rows affected (0.01 sec)
... (similar commands for @x2, @x3, @x4) ...

mysql> select @x1 "dict['10']", @x2 "dict['20']", @x3 "dict['30']", @x4 "dict['40']";
+------------+------------+------------+------------+
| dict['10'] | dict['20'] | dict['30'] | dict['40'] |
+------------+------------+------------+------------+
| "mysql"   | "db2"     | "oracle"  | "mongodb" |
+------------+------------+------------+------------+
1 row in set (0.00 sec)

Advantages of dictionaries: direct key‑value access; disadvantages are similar to arrays regarding memory layout.

3. Linked List – A linear structure where each node stores a value and a pointer to the next node, allowing non‑contiguous storage. MySQL’s B+‑tree leaf nodes are essentially linked lists.

Advantages: no need for contiguous memory and O(1) insert/delete by pointer manipulation. Disadvantage: sequential search required for lookup, leading to O(N) time.

4. Hash Table (散列表) – Stores key‑value pairs using an array indexed by a hash function. The hash function maps a key to an array index; collisions are handled by chaining (linking a list at each bucket) or other methods.

Typical issues:

Large values stored directly in the array can waste memory.

Insertion can be inefficient if values are stored in the array.

Hash collisions occur when different keys produce the same index.

One common solution is the chaining method, where each bucket points to a linked list of entries, combining fast array lookup with flexible list insertion.

Choosing a good hash function is crucial; it should be fast (≈O(1)) and produce few collisions. MySQL’s hash partitioning uses the password function as an example.

When chains become long, they can be replaced by balanced trees (e.g., AVL) to improve lookup speed.

Summary – MySQL hash indexes are built on hash tables: the indexed column becomes the key, the hash function computes a bucket, and the bucket points to the row(s). Understanding arrays, dictionaries, linked lists, and hash tables helps grasp InnoDB adaptive hash indexes and hash joins.

MySQLdata structureshash tabledatabase indexingHash Index
Aikesheng Open Source Community
Written by

Aikesheng Open Source Community

The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.