Comprehensive Guide to Python File Operations, CSV Handling, and Data Serialization
This article provides a thorough tutorial on Python file I/O, covering opening and closing files, mode flags, absolute and relative paths, reading and writing techniques, file copying, CSV read/write, in‑memory streams, sys module redirection, and serialization with JSON and pickle, including code examples and best practices.
1. Opening and Closing Files
The built‑in open() function opens a file and returns a file handle (e.g., f1 ). You can specify the file path, mode (default 'r' for read), and encoding (default depends on OS).
<code>f1 = open(r'd:\test_file.txt', mode='r', encoding='utf-8')
content = f1.read()
print(content)
f1.close()</code>Using the context manager automatically closes the file:
<code>with open(r'd:\test_file.txt', mode='r', encoding='utf-8') as f1:
    content = f1.read()
    print(content)</code>open() is a built‑in function that ultimately calls the operating‑system API.
The file handle (e.g., f1 , fh , file_handler ) is used for all subsequent operations.
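encoding defaults to the platform's locale encoding (for example, GBK on Simplified Chinese Windows and usually UTF‑8 on Linux/macOS), so pass encoding='utf-8' explicitly for portable code.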
mode defaults to 'r' (read‑only).
f1.close() explicitly closes the handle.
Using with open() avoids manual closing.
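The closing behavior described above can be checked through the handle's closed attribute. The following is a minimal sketch, assuming a scratch file in a temporary directory (the path and the simulated error are hypothetical, not part of the original examples):

```python
import os
import tempfile

# Hypothetical scratch file; the location is illustrative only.
path = os.path.join(tempfile.mkdtemp(), 'test_file.txt')

with open(path, mode='w', encoding='utf-8') as f1:
    f1.write('hello')

closed_after_with = f1.closed      # the handle is closed on leaving the block

try:
    with open(path, mode='r', encoding='utf-8') as f2:
        content = f2.read()
        raise ValueError('simulated failure while the file is open')
except ValueError:
    pass

closed_after_error = f2.closed     # closed as well, despite the exception
print(closed_after_with, closed_after_error, content)
```

With a manual f1.close(), the same guarantee would need an explicit try/finally.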
Absolute and Relative Paths
Absolute path: full location, e.g., C:/Users/chris/AppData/Local/Programs/Python/Python37/python.exe .
Relative path: starts from the current directory, e.g., test.txt , ./test.txt , ../test.txt , demo/test.txt .
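Three ways to write Windows paths:
Escaped backslashes \\ – file = open('C:\\Users\\chris\\Desktop\\file.txt')
Raw string r'...' – file = open(r'C:\Users\chris\Desktop\file.txt')
Forward slash (recommended) – file = open('C:/Users/chris/Desktop/file.txt')
A plain 'C:\Users\...' string with single backslashes is a syntax error, because Python reads \U as the start of a Unicode escape.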
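Paths can also be built and inspected in code rather than typed as literals. This is a small sketch, assuming the standard os.path and pathlib modules; the demo/test.txt path is only an illustration and does not need to exist:

```python
import os
from pathlib import Path

relative = os.path.join('demo', 'test.txt')   # hypothetical relative path
absolute = os.path.abspath(relative)          # resolved against the current directory

print(os.path.isabs(relative))   # False
print(os.path.isabs(absolute))   # True

# pathlib joins path parts with the / operator and uses the right separator per OS
p = Path('demo') / 'test.txt'
print(p.name, p.suffix, p.parent)
```

Building paths this way avoids the backslash-escaping problem entirely, since no separator is typed by hand.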
Common File Modes
Text modes:
<code>r # read‑only (default, file must exist)
w # write‑only (creates or truncates)
a # append‑only (creates if missing)
</code>Binary modes (use b ):
<code>rb # read binary
wb # write binary
ab # append binary
</code>Read‑write modes with + :
<code>r+ # read/write, pointer starts at beginning
w+ # write/read, creates/truncates then reads
a+ # append/read, creates if missing
</code>2. Reading and Writing Files
Reading
Assume a file read.txt contains several lines.
<code>f1 = open('read.txt', encoding='utf-8')
content = f1.read()
print(content, type(content))
f1.close()
</code>read(n) reads n characters (text mode) or bytes (binary mode):
<code>f1 = open('read.txt', encoding='utf-8')
content = f1.read(6)
print(content) # prints first 6 characters
f1.close()
</code>readline() reads a single line; readlines() returns a list of all lines, which can consume a lot of memory for large files.
<code>f1 = open('read.txt', encoding='utf-8')
print(f1.readline().strip())
print(f1.readline())
f1.close()
</code>Iterating over the file object reads line by line, which is memory‑efficient:
<code>f1 = open('read.txt', encoding='utf-8')
for line in f1:
    print(line.strip())
f1.close()
</code>Writing
Opening a file in 'w' creates it if missing or truncates it if it exists.
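The difference between truncating and appending is easy to see by writing to the same file several times. This is a minimal sketch, assuming a scratch file in a temporary directory (the path and the written strings are hypothetical):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'write.txt')  # hypothetical scratch file

with open(path, mode='w', encoding='utf-8') as f:
    f.write('first')

with open(path, mode='w', encoding='utf-8') as f:     # 'w' discards 'first'
    f.write('second')

with open(path, mode='a', encoding='utf-8') as f:     # 'a' keeps what is there
    f.write(' third')

with open(path, encoding='utf-8') as f:
    result = f.read()

print(result)  # second third
```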
<code>f1 = open('write.txt', encoding='utf-8', mode='w')
f1.write('Lucy is awesome')
f1.close()
</code>Binary write ( 'wb' ) is used for non‑text data such as images:
<code>f1 = open('image.png', mode='rb')
content = f1.read()
f1.close()
f2 = open('copy.png', mode='wb')
f2.write(content)
f2.close()
</code>File Pointer Positioning
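tell() returns the current offset; seek(offset, whence) moves the pointer (whence: 0 = start, 1 = current position, 2 = end). In text mode, a nonzero offset is only allowed when seeking from the start, so open the file in binary mode to seek relative to the current position or the end.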
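The three whence values can be tried out on an in-memory stream first. This is a minimal sketch, assuming io.BytesIO as a stand-in for a file opened in binary mode; the ten sample bytes are hypothetical:

```python
import io

f = io.BytesIO(b'0123456789')   # behaves like a file opened with mode='rb'

first = f.read(3)               # b'012'
pos = f.tell()                  # 3

f.seek(2, 0)                    # whence=0: absolute position 2
from_start = f.read(2)          # b'23', pointer is now at 4

f.seek(1, 1)                    # whence=1: 4 + 1 = 5
from_current = f.read(2)        # b'56'

f.seek(-4, 2)                   # whence=2: 10 - 4 = 6
from_end = f.read()             # b'6789'

print(first, pos, from_start, from_current, from_end)
f.close()
```

The named constants io.SEEK_SET, io.SEEK_CUR, and io.SEEK_END can be used in place of 0, 1, and 2.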
<code>f = open('test.txt', 'rb')  # binary mode: text mode rejects the relative seeks below
print(f.read(10))
print(f.tell())
f.seek(2, 0) # from start, skip 2 bytes
print(f.read())
f.seek(1, 1) # from current, skip 1 byte
print(f.read())
f.seek(-4, 2) # from end, go back 4 bytes
print(f.read())
f.close()
</code>3. Implementing a File‑Copy Function
<code>import os

file_name = input('Enter a file path:')
if os.path.isfile(file_name):
    # Insert '.bak' before the extension: data.txt -> data.bak.txt
    base, ext = os.path.splitext(file_name)
    new_file_name = base + '.bak' + ext
    with open(file_name, 'rb') as old_file, open(new_file_name, 'wb') as new_file:
        while True:
            content = old_file.read(1024)  # copy in 1 KB chunks
            if not content:                # empty bytes means end of file
                break
            new_file.write(content)
else:
    print('The file you entered does not exist')
</code>4. CSV File Read/Write
CSV stores tabular data as plain text with commas separating fields and newlines separating rows.
<code>name,age,score
zhangsan,18,98
lisi,20,99
wangwu,17,90
jerry,19,95
</code>Writing CSV with the csv module:
<code>import csv
file = open('test.csv', 'w', newline='')  # newline='' prevents blank rows on Windows
writer = csv.writer(file)
writer.writerow(['name', 'age', 'score'])
writer.writerows([
    ['zhangsan', '18', '98'],
    ['lisi', '20', '99'],
    ['wangwu', '17', '90'],
    ['jerry', '19', '95']
])
file.close()
</code>Reading CSV:
<code>import csv
file = open('test.csv', 'r', newline='')
reader = csv.reader(file)
for row in reader:
    print(row)  # each row is a list of strings
file.close()
</code>5. Writing Data to Memory
StringIO works like a file object for text data stored in memory:
<code>from io import StringIO
f = StringIO()
f.write('hello\r\n')
f.write('good')
print(f.getvalue())
f.close()
</code>BytesIO does the same for binary data:
<code>from io import BytesIO
f = BytesIO()
f.write('你好\r\n'.encode('utf-8'))
f.write('中国'.encode('utf-8'))
print(f.getvalue())
f.close()
</code>6. Using the sys Module
sys.stdin is the standard input stream; input() reads one line from it and strips the trailing newline.
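The relationship between sys.stdin and input() can be observed by temporarily swapping the stream for an in-memory one. This is a sketch under the assumption that canned text stands in for keyboard input (the two sample lines are hypothetical):

```python
import io
import sys

original_stdin = sys.stdin
sys.stdin = io.StringIO('first line\nsecond line\n')  # hypothetical canned input
try:
    via_readline = sys.stdin.readline()   # keeps the trailing newline
    via_input = input()                   # same stream, newline stripped
finally:
    sys.stdin = original_stdin            # always restore the real stream

print(repr(via_readline), repr(via_input))
```

Restoring the original stream in a finally block keeps later input() calls working even if reading fails.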
<code>import sys
s_in = sys.stdin
while True:
    content = s_in.readline().rstrip('\n')
    if content == '':  # a blank line or end of input stops the loop
        break
    print(content)
</code>Redirecting standard output to a file:
<code>import sys
m = open('stdout.txt', 'w', encoding='utf8')
sys.stdout = m
print('hello')
print('yes')
print('good')
sys.stdout = sys.__stdout__  # restore the console before closing the file
m.close()
</code>Redirecting standard error produces a traceback file:
<code>import sys
x = open('stderr.txt', 'w', encoding='utf8')
sys.stderr = x
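print(1/0)  # ZeroDivisionError: the traceback is written to stderr.txt
x.close()   # not reached, because the uncaught exception ends the script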
</code>7. Serialization and Deserialization
To store or transmit complex Python objects, you must first serialize them. Two common formats are JSON (text) and pickle (binary).
JSON Module
Convert Python objects to JSON strings with json.dumps() or directly write with json.dump() :
<code>import json
names = ['zhangsan', 'lisi', 'wangwu', 'jerry', 'henry', 'merry', 'chris']
result = json.dumps(names)
file = open('names.txt', 'w')
file.write(result)
file.close()
# or using dump
file = open('names.txt', 'w')
json.dump(names, file)
file.close()
</code>Load JSON strings back to Python objects with json.loads() or json.load() :
<code>import json
result = json.loads('["zhangsan", "lisi", "wangwu"]')
print(type(result)) # <class 'list'>
file = open('names.txt', 'r')
result = json.load(file)
print(result)
file.close()
</code>Custom objects require a custom encoder:
<code>import json
class MyEncode(json.JSONEncoder):
    def default(self, o):
        return o.__dict__
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
p1 = Person('zhangsan', 18)
result = json.dumps(p1, cls=MyEncode)
print(result) # {"name": "zhangsan", "age": 18}
</code>pickle Module
pickle serializes Python objects to binary data.
<code>import pickle
names = ['张三', '李四', '杰克', '亨利']
# dumps returns binary
b_names = pickle.dumps(names)
file = open('names.txt', 'wb')
file.write(b_names)
file.close()
# dump writes directly to a file
file2 = open('names.txt', 'wb')
pickle.dump(names, file2)
file2.close()
# loads restores from binary
file1 = open('names.txt', 'rb')
x = file1.read()
y = pickle.loads(x)
print(y)
file1.close()
# load reads and restores directly
file3 = open('names.txt', 'rb')
z = pickle.load(file3)
print(z)
file3.close()
</code>JSON vs. pickle
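JSON produces human‑readable text and is suitable for cross‑platform, cross‑language data exchange; it natively supports only dict, list, str, int, float, bool, and None (tuples are written as arrays and come back as lists).
pickle produces binary data that preserves the exact Python object structure, but it is Python‑specific, and unpickling can execute arbitrary code, so never load pickle data from untrusted sources.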
Both modules provide dump/dumps for serialization and load/loads for deserialization.
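The trade-offs above show up in a short round trip through both modules. This is a minimal sketch using in-memory dumps/loads only; the sample record is hypothetical:

```python
import json
import pickle

# Hypothetical sample data containing a tuple and a set
record = {'name': 'zhangsan', 'scores': (98, 99), 'tags': {'a', 'b'}}

# pickle restores the tuple and the set unchanged
restored = pickle.loads(pickle.dumps(record))

# json cannot encode a set without a custom encoder
try:
    json.dumps(record)
    json_error = None
except TypeError as exc:
    json_error = exc

# json accepts tuples, but they come back as lists
json_safe = {'name': 'zhangsan', 'scores': (98, 99)}
from_json = json.loads(json.dumps(json_safe))

print(restored == record, type(json_error).__name__, from_json['scores'])
```

For data that must be read by other languages or people, the loss of Python-specific types is usually an acceptable price for JSON's portability.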
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.