Decrypting Password‑Protected Zip Files with Python
This article explains how to use Python's built‑in zipfile module to brute‑force and extract password‑protected zip archives (the third‑party rarfile library plays the same role for RAR archives), how to handle Chinese filename encoding issues, and how to generate candidate passwords of any length with itertools.
Background
The author received an encrypted zip archive containing old photos and needed to unlock it. The goal is to try candidate passwords and extract the files using Python.
Choosing the Tools
zipfile: built‑in Python module for zip archives, no installation required.
rarfile: third‑party library for RAR archives; install it (for example with pip install rarfile) and refer to its documentation at https://rarfile.readthedocs.io/api.html.
Both modules provide an extractall method for extracting archives. The rest of this article uses zipfile.
Basic Extraction Example
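<code>import zipfile

try:
    # Create the ZipFile object
    with zipfile.ZipFile('test_file.zip') as zfile:
        # Extract every member into the current directory
        zfile.extractall(path='./')
    print('File extracted successfully')
except (OSError, zipfile.BadZipFile) as exc:
    # Missing or unreadable file, or not a valid zip archive
    print('Failed!', exc)
</code>
Extracting a Password‑Protected Zip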
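Before supplying a password, it can help to confirm which members are actually encrypted. The following is a minimal sketch, assuming the standard ZipInfo.flag_bits field, in which bit 0 marks an encrypted member. The helper name encrypted_members is hypothetical and not part of the original article.

```python
import zipfile

def encrypted_members(zip_file):
    """Return the names of members whose encryption flag (bit 0) is set."""
    with zipfile.ZipFile(zip_file) as zfile:
        return [info.filename
                for info in zfile.infolist()
                if info.flag_bits & 0x1]
```

An empty list suggests the archive is not password‑protected, in which case extractall works without pwd.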
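Assuming the password is 1234, pass it to extractall as a bytes object through the pwd parameter:
<code>import zipfile
import zlib

try:
    with zipfile.ZipFile('511.zip') as zfile:
        # pwd must be bytes, not str
        zfile.extractall(path='./', pwd=b"1234")
    print('File extracted successfully')
except (RuntimeError, zipfile.BadZipFile, zlib.error) as exc:
    # A wrong password normally raises RuntimeError; corrupt data raises the others
    print('Failed!', exc)
</code>
Fixing Chinese Filename Garbling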
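Garbled names can also be repaired after the fact, without touching the standard library. The following is a minimal sketch, assuming the archive stored its names in GBK and zipfile decoded them as cp437. The helper name fix_gbk_name is hypothetical.

```python
def fix_gbk_name(garbled_name):
    """Undo a cp437 decode and re-decode the original bytes as GBK."""
    try:
        return garbled_name.encode('cp437').decode('gbk')
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The name was not cp437-decoded GBK; leave it unchanged
        return garbled_name

# Simulate what happens to a GBK-encoded name that is decoded as cp437
garbled = '照片.txt'.encode('gbk').decode('cp437')
print(fix_gbk_name(garbled))  # 照片.txt
```

In practice the helper could be applied to each ZipInfo.filename before writing the member to disk under the repaired name.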
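Files with Chinese names can come out garbled: zipfile decodes any name that is not flagged as UTF‑8 as cp437, while archives created on Chinese‑language Windows usually store names in GBK. The fix used here edits the standard library file Lib/zipfile.py so that such names are decoded as GBK instead. Keep in mind that this patch affects every program using that interpreter and is lost when Python is upgraded; on Python 3.11 and later, passing metadata_encoding='gbk' to zipfile.ZipFile gives the same result without patching.
<code># Locate the line
fname_str = fname.decode("cp437")
# Replace it with
fname_str = fname.decode("gbk")
</code>
Similarly, modify the other occurrence:
<code>filename = filename.decode('cp437')
# Replace it with
filename = filename.decode('gbk')
</code>
Brute‑Force Password Search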
A naïve approach uses four nested loops to generate every 4‑character string over a given character set:
<code>def get_pwds(my_password_str):
    # One loop per password position; characters may repeat
    for c1 in my_password_str:
        for c2 in my_password_str:
            for c3 in my_password_str:
                for c4 in my_password_str:
                    yield c1 + c2 + c3 + c4
</code>
Running this against the zip file quickly finds the correct password (aaaf in the example).
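The step of running candidates against the archive is not shown above, so here is one possible sketch. It is an assumption about how the test could be done, not the author's code. It opens the archive once and checks each candidate by reading only the smallest non‑empty member, which is cheaper than extracting everything on every attempt. The name find_password is hypothetical.

```python
import zipfile
import zlib

def find_password(zip_file, candidates):
    """Return the first candidate that decrypts the archive, or None."""
    with zipfile.ZipFile(zip_file) as zfile:
        members = [info for info in zfile.infolist()
                   if not info.is_dir() and info.file_size > 0]
        if not members:
            return None
        # Reading the smallest member keeps each attempt cheap
        smallest = min(members, key=lambda info: info.file_size)
        for candidate in candidates:
            try:
                # read() decrypts, decompresses, and verifies the CRC
                zfile.read(smallest, pwd=candidate.encode('utf-8'))
            except (RuntimeError, zipfile.BadZipFile, zlib.error):
                continue  # wrong password or a false match on the check byte
            return candidate
    return None
```

With the generator above, the call would look like find_password('511.zip', get_pwds('abcdef')).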
Improved Approach with itertools
Python's itertools.product generates the same fixed‑length candidates without hand‑written nested loops. Note that itertools.permutations is not a substitute: it never reuses a character, so it would skip passwords such as aaaf.
<code>import itertools

my_pwdstr = 'abcdefghijklmnopqrstuvwxyz0123456789'
for x in itertools.product(my_pwdstr, repeat=4):
    pwd = ''.join(x)
    # test pwd
</code>
The function signature is itertools.product(*iterables, repeat=1), where repeat sets the length of each candidate when a single character set is passed.
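The difference between itertools.product and itertools.permutations matters for passwords that repeat a character, such as aaaf. A small sketch on a two‑character alphabet makes it visible; the variable names are illustrative only.

```python
import itertools

chars = 'af'
with_repeats = [''.join(p) for p in itertools.product(chars, repeat=2)]
without_repeats = [''.join(p) for p in itertools.permutations(chars, 2)]

print(with_repeats)     # ['aa', 'af', 'fa', 'ff']
print(without_repeats)  # ['af', 'fa']

# Candidate counts for 36 characters and length 4
print(36 ** 4)            # 1679616 strings when characters may repeat
print(36 * 35 * 34 * 33)  # 1413720 strings when they may not
```

Only product covers the full search space, so it is the safer choice when nothing is known about the password.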
Full Script with Dynamic Password Length
<code>import itertools
import zipfile

def ext_file(pwd):
    """Try to extract the archive with pwd; return True on success."""
    try:
        with zipfile.ZipFile('test_chinese.zip') as zfile:
            zfile.extractall(path='./', pwd=pwd.encode('utf-8'))
        print('File extracted successfully')
        return True
    except Exception as e:
        # A wrong password usually raises RuntimeError, but a false match on
        # the header check can surface as a CRC or decompression error
        print('Failed!', e)
        return False

def get_pwds(my_password_str, nums):
    """Yield every string of length nums over my_password_str."""
    for x in itertools.product(my_password_str, repeat=nums):
        yield ''.join(x)

if __name__ == '__main__':
    my_password_str = "abcdefghijklmnopqrstuvwxyz0123456789"
    for pwd in get_pwds(my_password_str, 4):
        print('Testing password:', pwd)
        if ext_file(pwd):
            print('Decryption successful, password is', pwd)
            break
</code>
This version lets you change the password length (nums) without modifying the core logic.
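When the length itself is unknown, one option is to try every length in a range, shortest first. The following is a minimal sketch; the function name candidates_up_to and its parameters are hypothetical.

```python
import itertools

def candidates_up_to(chars, max_len, min_len=1):
    """Yield every string over chars with length min_len..max_len, shortest first."""
    for length in range(min_len, max_len + 1):
        for combo in itertools.product(chars, repeat=length):
            yield ''.join(combo)

print(list(candidates_up_to('ab', 2)))
# ['a', 'b', 'aa', 'ab', 'ba', 'bb']
```

The search space grows by a factor of len(chars) with each extra character, so keeping max_len small matters.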
Further Ideas
Use a password dictionary file instead of generating passwords programmatically.
Speed up cracking with multithreading or multiprocessing, assigning each process a portion of the dictionary.
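Both ideas can be sketched together: read candidates from a dictionary file, then split them into one slice per worker. This is a sketch under assumptions, with hypothetical helper names and a hypothetical one‑password‑per‑line file format. The worker processes themselves are left out.

```python
def read_wordlist(path, encoding='utf-8'):
    """Yield non-empty lines from a dictionary file, one password per line."""
    with open(path, encoding=encoding, errors='ignore') as handle:
        for line in handle:
            word = line.rstrip('\r\n')
            if word:
                yield word

def partition(words, workers):
    """Split a list into `workers` interleaved slices, one per process."""
    return [words[i::workers] for i in range(workers)]

print(partition(['a', 'b', 'c', 'd', 'e'], 2))
# [['a', 'c', 'e'], ['b', 'd']]
```

Each slice could then be handed to a separate process, for example through multiprocessing.Pool, with every process running its own password‑testing loop.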
Real‑World Outcome
After several days of trial, the author finally succeeded: the password turned out to be the first letter of a surname followed by 123789 , illustrating that simple passwords are often used.