Fundamentals 10 min read

Handling Various Compressed File Formats in Python (zip, tar.gz, rar, 7z)

This article explains how to work with common compressed file formats such as zip, tar.gz, rar, and 7z in Python, detailing the relevant modules, usage patterns, and providing complete code examples for creating and extracting archives on both Windows and Linux.

Python Programming Learning Circle
Python Programming Learning Circle
Python Programming Learning Circle
Handling Various Compressed File Formats in Python (zip, tar.gz, rar, 7z)

In daily work, Python is often used to process text files and also to handle various compressed file formats.

Common compression formats include:

rar – widely used on Windows, typically with WinRAR.

tar – Linux packaging tool that only archives without compression.

gz (gzip) – compresses a single file; combined with tar to first archive then compress.

tgz – tar archive compressed with gzip.

zip – can archive and compress multiple files, using algorithms similar to gzip.

7z – format supported by 7‑Zip with high compression efficiency.

zip files

The zipfile module provides ZipFile and ZipInfo classes for creating and reading zip archives.

Example code:

import os
import zipfile

# compress
def make_zip(source_dir, output_filename):
    zipf = zipfile.ZipFile(output_filename, 'w')
    pre_len = len(os.path.dirname(source_dir))
    for parent, dirnames, filenames in os.walk(source_dir):
        for filename in filenames:
            print(filename)
            pathfile = os.path.join(parent, filename)
            arcname = pathfile[pre_len:].strip(os.path.sep)  # relative path
            zipf.write(pathfile, arcname)
        print()
    zipf.close()

# decompress
def un_zip(file_name):
    """unzip zip file"""
    zip_file = zipfile.ZipFile(file_name)
    if os.path.isdir(file_name + "_files"):
        pass
    else:
        os.mkdir(file_name + "_files")
    for names in zip_file.namelist():
        zip_file.extract(names, file_name + "_files/")
    zip_file.close()

if __name__ == '__main__':
    make_zip(r"E:python_samplelibstest_tar_fileslibs", "test.zip")
    un_zip("test.zip")

tar.gz files

The tarfile module can read and write tar archives, optionally compressed with gzip, bzip2, or lzma. The mode string follows the pattern filemode[:compression] (e.g., 'r:gz' for reading a gzip‑compressed archive). Common modes are listed in the table below.

Mode

Action

'r' or 'r:*'

Open and read with transparent compression (recommended).

'r:'

Open and read without compression.

'r:gz'

Open and read using gzip compression.

'r:bz2'

Open and read using bzip2 compression.

'r:xz'

Open and read using lzma compression.

'x'

Create a tarfile without compression (fails if file exists).

'x:gz'

Create a gzip‑compressed tarfile.

'w'

Open for uncompressed writing.

'w:gz'

Open for gzip‑compressed writing.

Example code:

import os
import tarfile
import gzip

# pack entire directory (empty sub‑directories are included)
def make_targz(output_filename, source_dir):
    with tarfile.open(output_filename, "w:gz") as tar:
        tar.add(source_dir, arcname=os.path.basename(source_dir))

# pack files one by one (empty sub‑directories are omitted)
def make_targz_one_by_one(output_filename, source_dir):
    tar = tarfile.open(output_filename, "w:gz")
    for root, dir, files in os.walk(source_dir):
        for file in files:
            pathfile = os.path.join(root, file)
            tar.add(pathfile)
    tar.close()

def un_gz(file_name):
    """ungz zip file"""
    f_name = file_name.replace(".gz", "")
    g_file = gzip.GzipFile(file_name)
    open(f_name, "wb+").write(g_file.read())
    g_file.close()

def un_tar(file_name):
    # untar zip file
    tar = tarfile.open(file_name)
    names = tar.getnames()
    if not os.path.isdir(file_name + "_files"):
        os.mkdir(file_name + "_files")
    for name in names:
        tar.extract(name, file_name + "_files/")
    tar.close()

if __name__ == '__main__':
    make_targz('test.tar.gz', "E:python_samplelibs")
    make_targz_one_by_one('test01.tgz', "E:python_samplelibs")
    un_gz("test.tar.gz")
    un_tar("test.tar")

rar files

Extraction of .rar archives can be done with the rarfile library, which requires the external unrar library. Installation steps differ for Windows and Linux.

Windows installation

Download the official UnRARDLL.exe from RARLab and install (default path C:\Program Files (x86)\UnrarDLL).

Add an environment variable UNRAR_LIB_PATH pointing to the DLL (e.g., C:\Program Files (x86)\UnrarDLL\UnRAR.dll for 32‑bit).

Run pip install unrar and the Python code will work.

Linux installation

Download the source tarball from RARLab, extract, compile, and install to produce libunrar.so .

Set UNRAR_LIB_PATH in /etc/profile and source the file.

Run pip install unrar .

Example code:

import rarfile

def unrar(rar_file, dir_name):
    # rarfile needs unrar support; on Linux install unrar, on Windows add unrar to PATH
    rarobj = rarfile.RarFile(rar_file.decode('utf-8'))
    rarobj.extractall(dir_name.decode('utf-8'))

7z files

For creating and extracting .7z archives, use the py7zr library.

Example code:

import py7zr

# extract
with py7zr.SevenZipFile("Archive.7z", 'r') as archive:
    archive.extractall(path="/tmp")

# compress
with py7zr.SevenZipFile("Archive.7z", 'w') as archive:
    archive.writeall("target/")

All code examples are ready to run on both Windows and Linux after the appropriate libraries are installed.

file compressionziptarzipfilerar7ztarfile
Python Programming Learning Circle
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.