Python Scripts for File Management, Data Processing, Automation, and More
This article collects practical Python snippets for file and directory management, data processing, network requests, automation, document handling, image manipulation, system monitoring, visualization, data cleaning, logging, and web scraping, each with a short explanation and a ready-to-run example.
File and Directory Management
Batch rename files in a directory based on a rule:
import os
for filename in os.listdir('.'):
if filename.endswith(".txt"):
os.rename(filename, "new_" + filename)Find large files exceeding a size threshold:
import os
for root, dirs, files in os.walk('.'):
    for file in files:
        path = os.path.join(root, file)
        if os.path.getsize(path) > 1e6:  # larger than 1 MB
            print(path)

Copy only the folder structure without file contents:
import shutil
# The destination must not already exist; a copy_function that does nothing skips file contents.
shutil.copytree('source_folder', 'destination_folder', copy_function=lambda src, dst: None)

Data Processing and Analysis
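If shutil is unavailable or you want finer control, the structure-only copy above can also be sketched with os.walk and os.makedirs. In this sketch the source and destination are temporary demo directories, and the sample tree (a/b plus one file) is hypothetical:

```python
import os
import tempfile

# Demo stand-ins for real source/destination paths.
src = tempfile.mkdtemp()
dst = tempfile.mkdtemp()

# Build a small sample tree: src/a/b plus one file that should NOT be copied.
os.makedirs(os.path.join(src, 'a', 'b'))
open(os.path.join(src, 'a', 'note.txt'), 'w').close()

# Recreate every directory from src under dst, skipping files entirely.
for root, dirs, files in os.walk(src):
    for d in dirs:
        rel = os.path.relpath(os.path.join(root, d), src)
        os.makedirs(os.path.join(dst, rel), exist_ok=True)

copied = sorted(
    os.path.relpath(os.path.join(root, d), dst)
    for root, dirs, _ in os.walk(dst)
    for d in dirs
)
print(copied)
```

Unlike copytree, this variant works even when the destination directory already exists.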
Merge multiple CSV files into one:
import pandas as pd
df_list = [pd.read_csv(f) for f in ['file1.csv', 'file2.csv']]
combined_df = pd.concat(df_list, ignore_index=True)
combined_df.to_csv('combined.csv', index=False)

Convert an Excel file to CSV:
import pandas as pd
excel_data = pd.read_excel('data.xlsx')
excel_data.to_csv('data.csv', index=False)

Remove duplicate lines from a text file:
seen = set()
with open('input.txt') as f_in, open('output.txt', 'w') as f_out:
    for line in f_in:
        if line not in seen:
            f_out.write(line)
            seen.add(line)

Read and write JSON data:
import json
data = {'key': 'value'}
with open('data.json', 'w') as f:
    json.dump(data, f)

with open('data.json') as f:
    loaded_data = json.load(f)

Network Requests and API Interaction
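One more note on the JSON snippet above: json.dump writes compact output by default, and the indent and sort_keys parameters make the file readable and diff-friendly. A small round-trip sketch (data_pretty.json and the sample dict are arbitrary demo values):

```python
import json

data = {'name': 'example', 'tags': ['a', 'b'], 'count': 2}

# indent=2 pretty-prints; sort_keys gives a stable key order across runs.
with open('data_pretty.json', 'w') as f:
    json.dump(data, f, indent=2, sort_keys=True)

with open('data_pretty.json') as f:
    loaded = json.load(f)

print(loaded == data)  # prints True: the round trip preserves the structure
```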
Send a GET request and print the JSON response:
import requests
response = requests.get('https://api.example.com/data')
response.raise_for_status()  # fail fast on HTTP errors before parsing
print(response.json())

Download a file from the internet and save it locally:
import requests
url = 'https://example.com/file.zip'
r = requests.get(url)
with open('file.zip', 'wb') as f:
    f.write(r.content)

Automation Tasks
Schedule a recurring job using APScheduler:
from apscheduler.schedulers.blocking import BlockingScheduler
sched = BlockingScheduler()
def job():
    print("Task executed!")

sched.add_job(job, 'interval', minutes=1)
sched.start()

Send an email automatically via SMTP:
import smtplib
from email.mime.text import MIMEText
msg = MIMEText('This is the body of the email.')
msg['Subject'] = 'Subject line'
msg['From'] = 'sender@example.com'    # placeholder address
msg['To'] = 'recipient@example.com'   # placeholder address

with smtplib.SMTP('smtp.example.com', 587) as server:
    server.starttls()  # most servers require TLS before login
    server.login('user', 'pass')
    server.sendmail('sender@example.com', ['recipient@example.com'], msg.as_string())

Document Processing
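The SMTP snippet above can't run without a mail server, but its message-building half can be exercised on its own. The addresses below are placeholders:

```python
from email.mime.text import MIMEText

# Build the message exactly as in the SMTP snippet; no server needed.
msg = MIMEText('This is the body of the email.')
msg['Subject'] = 'Subject line'
msg['From'] = 'sender@example.com'
msg['To'] = 'recipient@example.com'

raw = msg.as_string()
print('Subject: Subject line' in raw)  # prints True
```

Once the headers look right, hand the message to the SMTP client (sendmail with raw, or send_message with msg directly).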
Create a simple Word document:
from docx import Document
doc = Document()
doc.add_paragraph('Hello World!')
doc.save('hello.docx')

Merge multiple PDF files into one:
from PyPDF2 import PdfMerger  # PdfFileMerger was renamed to PdfMerger in PyPDF2 3.0

merger = PdfMerger()
for pdf in ['file1.pdf', 'file2.pdf']:
    merger.append(pdf)
merger.write("merged.pdf")
merger.close()

Image Processing
Resize an image using Pillow:
from PIL import Image
img = Image.open('image.jpg')
resized_img = img.resize((800, 600))
resized_img.save('resized_image.jpg')

Text Analysis
Count word frequencies in a text file:
from collections import Counter
with open('text.txt') as f:
    words = f.read().split()

word_counts = Counter(words)
print(word_counts.most_common(10))

System Information
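The word-count snippet above counts raw tokens, so "The" and "the," land in different buckets. A hypothetical normalizing variant lowercases each token and strips surrounding punctuation first (the sample sentence is made up):

```python
import string
from collections import Counter

text = "The cat saw the dog. The dog ran!"

# Lowercase and strip leading/trailing punctuation before counting.
words = [w.strip(string.punctuation).lower() for w in text.split()]
counts = Counter(words)
print(counts.most_common(2))  # [('the', 3), ('dog', 2)]
```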
Get current CPU usage percentage:
import psutil
print(psutil.cpu_percent(interval=1))

Data Visualization
Plot a simple line chart with Matplotlib:
import matplotlib.pyplot as plt
x = [1, 2, 3, 4]
y = [1, 4, 9, 16]
plt.plot(x, y)
plt.show()

Data Cleaning
Remove empty rows from a CSV file:
import pandas as pd
df = pd.read_csv('dirty_data.csv')
df.dropna(how='all', inplace=True)  # drop only rows where every column is empty
df.to_csv('clean_data.csv', index=False)

Logging
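A caveat on the cleaning snippet above: dropna() with no arguments drops any row containing even one missing value, which is stricter than "remove empty rows" suggests, while how='all' drops only fully empty rows. A quick in-memory sketch of the difference (the two-column frame is demo data):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [1.0, np.nan, np.nan],
    'b': [2.0, 5.0, np.nan],
})

# Default: any missing value disqualifies the row.
print(len(df.dropna()))           # prints 1
# how='all': only the all-NaN row is dropped.
print(len(df.dropna(how='all')))  # prints 2
```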
Record log messages to a file using the logging module:
import logging
logging.basicConfig(filename='app.log', level=logging.INFO)
logging.info('This is an info message.')

Web Scraping
Fetch a webpage and extract all H1 titles with BeautifulSoup:
import requests
from bs4 import BeautifulSoup
url = 'http://example.com'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
titles = soup.find_all('h1')
for title in titles:
    print(title.text.strip())
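For environments where BeautifulSoup isn't installed, the stdlib html.parser can handle simple extractions like the H1 grab above. A minimal sketch over a literal HTML string (the markup is demo data):

```python
from html.parser import HTMLParser

# Collect the text inside every <h1> element.
class H1Collector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == 'h1':
            self.in_h1 = True
            self.titles.append('')

    def handle_endtag(self, tag):
        if tag == 'h1':
            self.in_h1 = False

    def handle_data(self, data):
        if self.in_h1:
            self.titles[-1] += data

html = '<html><body><h1>First</h1><p>text</p><h1>Second</h1></body></html>'
parser = H1Collector()
parser.feed(html)
print([t.strip() for t in parser.titles])  # ['First', 'Second']
```

For anything beyond trivial markup, though, BeautifulSoup's tolerant parsing is worth the dependency.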