Understanding Threads, Processes, GIL, and Multiprocessing in Python
This article explains the fundamental differences between threads and processes, the role of Python's Global Interpreter Lock (GIL), and provides a comprehensive guide to using the multiprocessing module and concurrent.futures for parallel execution, including code examples and synchronization primitives.
Thread and Process Differences
Processes are the smallest unit of resource allocation, while threads are the smallest unit of CPU scheduling. A process has its own virtual address space and resources recorded in a PCB, whereas threads within the same process share the address space and resources.
Address space and resources: processes are independent; threads share the same process space.
Communication: inter‑process communication (IPC) vs. direct memory access for threads.
Scheduling and switching: thread context switches are much faster than process switches.
In modern operating systems, the thread is the basic unit of concurrent execution.
Comparison Table

| Dimension | Multiprocess | Multithread | Summary |
| --- | --- | --- | --- |
| Data sharing, synchronization | Sharing is complex; synchronization is simple | Sharing is simple; synchronization is complex | Each has pros and cons |
| Memory, CPU | High memory footprint, complex switching, low CPU utilization | Low memory footprint, simple switching, high CPU utilization | Threads win |
| Creation, destruction, switching | Complex, slow | Simple, fast | Threads win |
| Programming, debugging | Simple to program and debug | Complex to program and debug | Processes win |
| Reliability | Processes do not affect each other | A thread crash terminates the whole process | Processes win |
| Distribution | Suitable for multi-node deployment, easy to scale | Limited to multiple cores on one machine | Processes win |
Python Global Interpreter Lock (GIL)
The GIL ensures that only one thread executes Python bytecode at a time in CPython, simplifying the interpreter implementation but limiting true parallelism on multi‑core machines.
Execution steps under the GIL:
Acquire GIL
Switch to a thread
Run until a bytecode-tick limit is reached or the thread voluntarily yields (e.g., time.sleep(0))
Put the thread to sleep
Release GIL
Repeat
Before Python 3.2 the GIL was released only on blocking I/O or after 100 bytecode ticks. Since Python 3.2 a waiting thread instead requests the lock after a timeout (the switch interval, 5 ms by default), which fixed the heavy lock contention the old scheme caused on multi-core systems.
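In CPython the switch interval is exposed through sys.getswitchinterval and sys.setswitchinterval (values in seconds), so the 5 ms figure can be inspected and tuned at runtime:

```python
import sys

# The switch interval is the timeout after which a waiting thread may
# force the running thread to drop the GIL; the default is 0.005 s.
print(sys.getswitchinterval())
sys.setswitchinterval(0.001)  # request more frequent thread switching
print(sys.getswitchinterval())
```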
Typical mitigation strategies:
Upgrade to a newer Python version with an improved GIL.
Use multiprocessing instead of multithreading.
Set CPU affinity for threads.
Use GIL‑free interpreters such as Jython or IronPython.
Restrict multithreading to I/O‑bound workloads.
Employ coroutines (asyncio) for efficient single‑threaded concurrency.
Write performance‑critical parts in C/C++ (or Cython) and release the GIL there (e.g., a Cython with nogil block).
Python Multiprocessing Package
Because of the GIL, the multiprocessing package provides a process‑based parallelism API that mirrors the threading module.
Background of Multiprocessing
On Unix‑like systems the fork() system call clones the current process. Windows lacks fork(), so multiprocessing instead spawns a fresh interpreter process, offering the same API on every platform.
<code>import os

print('Process (%s) start...' % os.getpid())
# Only works on Unix/Linux/Mac:
pid = os.fork()
if pid == 0:
    print('I am child process (%s) and my parent is %s.' % (os.getpid(), os.getppid()))
else:
    print('I (%s) just created a child process (%s).' % (os.getpid(), pid))
</code>
Common Components
Process: represents a single process. Constructor: Process([group, target, name, args, kwargs]). Key methods: start(), run(), terminate(), join(), is_alive().
Pool: manages a pool of worker processes. Constructor: Pool([processes, initializer, initargs, maxtasksperchild, context]). Methods include apply(), apply_async(), map(), map_async(), close(), join(), terminate().
Queue, JoinableQueue: inter-process communication queues.
Value and Array: shared-memory objects based on ctypes.
Pipe: creates a two-way communication channel, returning a pair (conn1, conn2) of connection objects.
Manager: runs a server process that hosts shared objects (list, dict, Namespace, etc.).
Process Example
<code>from multiprocessing import Process
import os

def run_proc(name):
    print('Run child process %s (%s)...' % (name, os.getpid()))

if __name__ == '__main__':
    print('Parent process %s.' % os.getpid())
    p = Process(target=run_proc, args=('test',))
    print('Child process will start.')
    p.start()
    p.join()
    print('Child process end.')
</code>
Pool Example
<code>from multiprocessing import Pool

def test(i):
    print(i)

if __name__ == '__main__':
    pool = Pool(8)
    pool.map(test, range(100))
    pool.close()
    pool.join()
</code>
Queue Example
<code>from multiprocessing import Process, Queue
import os, time, random

def write(q):
    print('Process to write: %s' % os.getpid())
    for v in ['A', 'B', 'C']:
        print('Put %s to queue...' % v)
        q.put(v)
        time.sleep(random.random())

def read(q):
    print('Process to read: %s' % os.getpid())
    while True:
        v = q.get(True)
        print('Get %s from queue.' % v)

if __name__ == '__main__':
    q = Queue()
    pw = Process(target=write, args=(q,))
    pr = Process(target=read, args=(q,))
    pw.start()
    pr.start()
    pw.join()
    # read() loops forever, so the reader has to be terminated:
    pr.terminate()
</code>
Value and Array Example
<code>import multiprocessing

def f(n, a):
    n.value = 3.14
    a[0] = 5

if __name__ == '__main__':
    num = multiprocessing.Value('d', 0.0)        # 'd': double
    arr = multiprocessing.Array('i', range(10))  # 'i': signed int
    p = multiprocessing.Process(target=f, args=(num, arr))
    p.start()
    p.join()
    print(num.value)  # 3.14
    print(arr[:])     # [5, 1, 2, ..., 9]
</code>
Pipe Example
<code>from multiprocessing import Process, Pipe
import time

def child(conn):
    time.sleep(1)
    conn.send('Did you eat?')
    print('From parent:', conn.recv())
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=child, args=(child_conn,))
    p.start()
    print('From child:', parent_conn.recv())
    parent_conn.send('Yes')
    p.join()
</code>
Manager Example
<code>import multiprocessing

def f(x, arr, l, d, n):
    x.value = 3.14
    arr[0] = 5
    l.append('Hello')
    d[1] = 2
    n.a = 10

if __name__ == '__main__':
    server = multiprocessing.Manager()
    x = server.Value('d', 0.0)
    arr = server.Array('i', range(10))
    l = server.list()
    d = server.dict()
    n = server.Namespace()
    proc = multiprocessing.Process(target=f, args=(x, arr, l, d, n))
    proc.start()
    proc.join()
    print(x.value)
    print(arr)
    print(l)
    print(d)
    print(n)
</code>
Synchronization Primitives
Lock: mutual exclusion.
RLock: re‑entrant lock.
Semaphore: allows a limited number of concurrent accesses.
Condition: advanced lock with wait/notify.
Event: simple flag for signaling.
Lock Example
<code>from multiprocessing import Process, Lock

def task(lock, num):
    lock.acquire()
    print('Hello Num: %s' % num)
    lock.release()

if __name__ == '__main__':
    lock = Lock()
    for i in range(20):
        Process(target=task, args=(lock, i)).start()
</code>
Semaphore Example
<code>from multiprocessing import Process, Semaphore
import time, random

def go_wc(sem, user):
    sem.acquire()
    print('%s occupies a stall' % user)
    time.sleep(random.randint(0, 3))
    sem.release()
    print(user, 'OK')

if __name__ == '__main__':
    sem = Semaphore(2)  # at most two processes inside at once
    procs = []
    for i in range(5):
        p = Process(target=go_wc, args=(sem, 'user%s' % i))
        p.start()
        procs.append(p)
    for p in procs:
        p.join()
</code>
Python Concurrency with concurrent.futures
The concurrent.futures module (available since Python 3.2) provides high‑level abstractions ThreadPoolExecutor and ProcessPoolExecutor for asynchronous execution.
Executor Overview
ThreadPoolExecutor(max_workers): runs callables in a pool of threads.
ProcessPoolExecutor(max_workers=None): runs callables in a pool of processes (defaults to the number of CPUs).
submit(fn, *args, **kwargs): schedules a callable and returns a Future.
map(fn, *iterables, timeout=None): returns an iterator of results that preserves input order.
shutdown(wait=True): releases resources; called automatically when the executor is used in a with block.
Future Overview
result(timeout=None): returns the callable's return value, or raises TimeoutError / CancelledError.
exception(timeout=None): returns the exception raised by the callable, if any.
add_done_callback(fn): registers a callback executed when the future completes.
as_completed(fs, timeout=None): a module-level function that yields futures as they finish.
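add_done_callback can be sketched as follows (collected and on_done are illustrative names): each callback runs when its future finishes and receives the finished future itself, so the submitting code never blocks on results.

```python
from concurrent.futures import ThreadPoolExecutor

collected = []

def on_done(fut):
    # called with the finished Future; result() does not block here
    collected.append(fut.result())

with ThreadPoolExecutor(max_workers=2) as executor:
    for i in range(4):
        executor.submit(pow, i, 2).add_done_callback(on_done)
# the with block waits for all tasks, so every callback has run by now
print(sorted(collected))  # [0, 1, 4, 9]
```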
Example with ThreadPoolExecutor
<code>from concurrent import futures
import time, random

def test(num):
    time.sleep(random.randint(1, 5))
    return time.ctime(), num

with futures.ThreadPoolExecutor(max_workers=5) as executor:
    futures_list = [executor.submit(test, i) for i in range(5)]
    for f in futures.as_completed(futures_list):
        print(f.result())
</code>
These tools enable developers to write clear, maintainable parallel code without dealing directly with low‑level thread or process management.