Fundamentals 22 min read

Understanding Python Modules, Packages, and the Import System

This article explains the concepts of Python modules and packages, the mechanics of import statements, the search path and caching behavior, the roles of finders and loaders, and how the import system has evolved through PEP 302 and PEP 420 to support regular and namespace packages.

Python Programming Learning Circle
Python Programming Learning Circle
Python Programming Learning Circle
Understanding Python Modules, Packages, and the Import System

1. Introduction

Python defines importing as a way for one module to gain access to another module's code. In everyday use we write statements such as from xxx import xxx or import xx , and many packages contain an __init__.py file that makes the directory a package.

2. Modules and Packages

2.1 What is a module?

A module is a .py file that logically groups variables, functions, and classes. The file name (e.g., module_name.py ) is the physical representation, while the module name (e.g., module_name ) is the logical representation. When a module is executed, the global variable __name__ holds its name, and the module code runs only once at import time.

2.2 What is a package?

A package is a special kind of module that provides a hierarchical namespace. It can be thought of as a directory containing modules and possibly sub‑packages. All packages are modules, but not all modules are packages. A module becomes a package when it defines the attribute __path__ , typically by containing an __init__.py file.

There are two kinds of packages:

Regular packages – directories with an __init__.py file. Importing a regular package executes its __init__.py , which can define objects that become part of the package namespace.

Namespace packages – collections of directories (or zip files) that share the same package name but may reside in different locations. They do not require an __init__.py file and are identified by the presence of a __path__ attribute that is an iterable of locations.

2.3 Import system

The most common way to import is the import statement, but you can also use importlib.import_module() or the built‑in __import__() function. The import statement first searches for the module, then binds the resulting object to a name in the current namespace.

3. Module/Package Location

3.1 Absolute and relative imports

Python supports absolute imports (recommended in Python 3) and relative imports (used in older code). Absolute imports can raise ImportError if the target module cannot be found; adding the local directory to sys.path or marking it as a source root in an IDE resolves the issue.

3.2 Module search path

When importing a module, Python first checks built‑in modules, then searches the directories listed in sys.path . The initial value of sys.path comes from the script’s directory, the PYTHONPATH environment variable, and the default installation directories.

Warning: Do not name your own modules the same as standard‑library modules, otherwise the local module will shadow the standard one.

3.3 Extending the search path with .pth files

You can add additional directories without modifying sys.path by creating a .pth file that lists absolute paths, one per line, and placing it in a location such as site-packages (Windows) or a directory returned by site.getsitepackages() (Linux).

<code>import site
print(site.getsitepackages())
# Output
['/Users/gray/anaconda3/anaconda3/envs/python-develop/lib/python3.7/site-packages']
</code>

4. Deep Dive into Import Search

The import search populates sys.modules with module objects. If a module is already cached, the cached object is returned; otherwise the search proceeds through meta‑path finders.

4.1 Cache

Before searching, Python checks sys.modules . If the module name is present, its value is used; if the value is None , a ModuleNotFoundError is raised. Deleting a key from sys.modules forces a fresh search, but the recommended way to reload a module is importlib.reload() .

4.2 Finder and Loader

If the cache misses, the import protocol uses a finder to locate a module and returns a module spec . An object that implements both finder and loader interfaces is called an importer . Python ships with built‑in finders for built‑in modules, frozen modules, and path‑based modules.

4.3 Import hooks

Import hooks allow extending the import mechanism. Meta hooks are inserted into sys.meta_path and run before the cache check, while import‑path hooks are added to sys.path_hooks and operate on each entry of sys.path .

4.4 Meta‑path

When a module is not found in the cache, Python iterates over sys.meta_path . Each finder’s find_spec(fullname, path, target=None) method is called; if none return a spec, a ModuleNotFoundError is raised.

5. Import Loading Mechanism

The following pseudo‑code illustrates the loading phase:

<code>module = None
if spec.loader is not None and hasattr(spec.loader, 'create_module'):
    module = spec.loader.create_module(spec)
if module is None:
    module = ModuleType(spec.name)
_init_module_attrs(spec, module)
if spec.loader is None:
    if spec.submodule_search_locations is not None:
        # namespace package
        sys.modules[spec.name] = module
    else:
        raise ImportError
elif not hasattr(spec.loader, 'exec_module'):
    module = spec.loader.load_module(spec.name)
else:
    sys.modules[spec.name] = module
    try:
        spec.loader.exec_module(module)
    except BaseException:
        try:
            del sys.modules[spec.name]
        except KeyError:
            pass
        raise
return sys.modules[spec.name]
</code>

Key points:

The module is cached in sys.modules before execution to avoid recursive imports.

If loading fails, the failed entry is removed from the cache.

The loader’s exec_module() method actually executes the module code.

5.1 Loader objects

A loader is an instance of importlib.abc.Loader . It must provide exec_module() (and optionally create_module() ) to execute the module’s code. If it cannot, it should raise ImportError .

5.2 Module spec objects

A ModuleSpec carries information such as the module’s name, loader, origin, and submodule_search_locations . It is attached to the module as __spec__ and guides the loading process.

5.3 Module attributes set during import

Attribute

Description

__name__

Fully qualified module name.

__loader__

Loader object used for introspection.

__package__

Package name used for relative imports.

__spec__

ModuleSpec instance.

__path__

Iterable of locations for packages; empty for regular modules.

__file__

Path to the source file (optional for built‑ins).

__cached__

Path to the compiled byte‑code file.

6. Path‑Based Finder

The default PathBasedFinder searches each entry in sys.path using sys.path_hooks to obtain a PathEntryFinder . Results are cached in sys.path_importer_cache to avoid repeated work.

6.1 Path entry finder protocol

A PathEntryFinder implements find_spec(fullname, target=None) . For namespace‑package portions it returns a spec with loader=None and submodule_search_locations set to the list of locations that constitute the portion.

Original source: CSDN article

PythonModulespackagesImport Systemsys.pathNamespace PackagesPEP302
Python Programming Learning Circle
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.