
Introduction to Parallel Programming and Python Parallel Libraries

This article introduces parallel programming concepts, memory architectures, execution models, Python threading versus multiprocessing performance, and reviews several Python parallel libraries such as Ray, Dask, Dispy, ipyparallel, and Joblib for building scalable concurrent applications.


Parallel Programming Introduction

Parallel programming is a method where multiple threads or processes execute different tasks simultaneously, improving performance and throughput by leveraging multi‑core processors.

Advantages:

Improved performance and throughput

Utilization of multi‑core advantage

Better resource management

Distributed computation

Disadvantages:

Increased program complexity

Need to handle synchronization and deadlock

Higher debugging cost

Resource contention issues
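The trade-offs above can be seen in a minimal sketch using the standard-library `concurrent.futures` module, which hides most of the synchronization complexity behind a pool abstraction (the `work` function is a hypothetical stand-in for a real task):

```python
from concurrent.futures import ThreadPoolExecutor

def work(task_id):
    # Stand-in for a unit of work; a real task would do I/O or computation.
    return task_id * task_id

# Run four tasks concurrently on a small thread pool.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(work, range(4)))

print(results)  # [0, 1, 4, 9]
```

The pool handles thread creation, scheduling, and teardown, which avoids much of the debugging cost listed above, at the price of less control over individual threads.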

Parallel Computing Memory Architectures

Two main memory architectures:

Shared memory: multiple processors share a single memory space, reducing data transfer time.

Distributed memory: each processor has its own memory, requiring network communication.

Flynn's taxonomy classifies computers by their instruction and data streams:

SISD (Single Instruction, Single Data)

SIMD (Single Instruction, Multiple Data)

MISD (Multiple Instruction, Single Data)

MIMD (Multiple Instruction, Multiple Data)

SISD

SISD describes a classic single‑processor system where one instruction operates on one data item at a time; execution is sequential.

MISD

MISD involves multiple instruction streams operating on the same data stream, useful for special cases like encryption, but rarely used in practice.

SIMD

SIMD uses one control unit to drive multiple processors that perform the same operation on different data elements, enabling spatial parallelism.
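In Python, NumPy's vectorized operations illustrate the SIMD idea (assuming NumPy is installed): one operation is applied across many data elements at once, and under the hood NumPy may compile down to actual CPU SIMD instructions.

```python
import numpy as np

data = np.arange(8)

# One "instruction" (multiply by 2) applied to every element at once,
# instead of a Python-level loop over individual items.
doubled = data * 2

print(doubled)  # [ 0  2  4  6  8 10 12 14]
```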

MIMD

MIMD consists of multiple independent processors that can execute different instructions on different data, offering the strongest computational power.

Parallel Programming Memory Management

Performance is limited if memory cannot supply instructions and data fast enough. Two models:

Shared memory systems with equal access to a large virtual address space.

Distributed memory models where each processor’s memory is private.

Parallel Programming Models

Models define how software accesses memory and decomposes tasks.

Shared Memory Model

All tasks share one memory space; synchronization primitives such as locks and semaphores control access.
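A minimal sketch of lock-based synchronization with the standard-library `threading` module: without the lock, the read-modify-write of the shared counter could interleave between threads and lose updates.

```python
import threading

counter = 0
lock = threading.Lock()

def add(n):
    global counter
    for _ in range(n):
        # The lock makes the read-modify-write of the shared counter atomic.
        with lock:
            counter += 1

threads = [threading.Thread(target=add, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```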

Multithreaded Model

A single processor can run multiple threads that operate on shared memory, requiring careful synchronization.

Message‑Passing Model

Used mainly in distributed memory systems; tasks may reside on multiple physical machines.

Data Parallel Model

Multiple tasks operate on different partitions of the same data structure, often with local memory copies.

Python Threads and Processes

Python supports both multithreading and multiprocessing. Threads are the smallest execution unit; processes contain at least one thread. Scheduling is handled by the OS.

Timing analysis (elapsed time, smaller is better):

|                 | CPU-bound | I/O-bound | Network-bound |
|-----------------|-----------|-----------|---------------|
| Serial          | 94        | 22        | 7             |
| Multithreading  | 101       | 24        | 1             |
| Multiprocessing | 53        | 12        | 1             |

Multithreading shows little advantage for CPU-bound work in CPython, because the Global Interpreter Lock (GIL) allows only one thread to execute Python bytecode at a time; context switching can even make it slower than serial execution. It does help in I/O-bound scenarios, where threads release the GIL while waiting. Multiprocessing sidesteps the GIL, so it generally outperforms multithreading for CPU-bound tasks and also scales well for I/O-bound workloads, though each process consumes more memory and incurs startup overhead.

Python Parallel Programming Libraries

Ray

https://ray.io

Ray can distribute any Python task across machines, not limited to machine‑learning workloads.

Dask

https://www.dask.org/

Dask uses a centralized scheduler to scatter tasks across a cluster.
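A minimal Dask sketch (assumes `dask` is installed): `delayed` builds a lazy task graph, and the scheduler decides where each task runs when `.compute()` is called.

```python
from dask import delayed

@delayed
def inc(x):
    return x + 1

@delayed
def total(values):
    return sum(values)

# Nothing executes until .compute(); until then we only build a task graph.
graph = total([inc(i) for i in range(5)])
result = graph.compute()
print(result)  # 15
```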

Dispy

https://dispy.org/

Dispy runs computations in parallel across many machines, suitable for data‑parallel scenarios.

ipyparallel

https://github.com/ipython/ipyparallel

ipyparallel executes Jupyter notebook code across a cluster, distributing function calls evenly.

Joblib

https://github.com/joblib/joblib

Joblib provides lightweight pipelines and can share memory‑mapped arrays between processes.
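A minimal Joblib sketch (assumes `joblib` is installed): `delayed` captures a function call, and `Parallel` executes the captured calls across worker processes.

```python
from joblib import Parallel, delayed

def square(x):
    return x * x

# delayed(square)(i) records the call; Parallel(n_jobs=2) runs them in parallel.
squares = Parallel(n_jobs=2)(delayed(square)(i) for i in range(5))
print(squares)  # [0, 1, 4, 9, 16]
```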

Python Programming Learning Circle
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
