
Object Persistence in Python Using Pickle and Related Techniques

This article explains Python object persistence, covering the concepts of serialization with pickle and cPickle, various storage mechanisms, handling of complex objects, reference cycles, class instance pickling, versioning strategies, and advanced techniques such as custom state methods and Pickler/Unpickler usage.

Python Programming Learning Circle

What is Persistence?

Persistence means keeping objects alive across multiple executions of a program, typically by storing them on disk for later retrieval. Various methods exist, each with pros and cons, such as text files (CSV), relational databases (MySQL, PostgreSQL), and object‑oriented stores.

Object Persistence

Python provides the pickle module to serialize almost any object to a byte string, file, or file-like object and to reconstruct it later. Python 2 also ships cPickle, a faster C implementation with the same interface; in Python 3 the C implementation is part of pickle and is used automatically. Pickle can be used directly or through higher-level object databases such as ZODB or PyPerSyst.

Some Pickled Python Objects

The pickle and cPickle modules expose the functions dumps(), loads(), dump(), and load(). In Python 2 they produce a printable ASCII representation (protocol 0) by default; passing a protocol number, or the legacy True flag, which is equivalent to protocol 1, selects a more compact binary format. Python 3 uses a binary protocol by default. The loading functions detect the format automatically.

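<code>>>> import pickle                # Python 2 only: "import cPickle as pickle" for the C version
>>> t1 = ('this is a string', 42, [1, 2, 3], None)
>>> p1 = pickle.dumps(t1)        # default protocol
>>> t2 = pickle.loads(p1)
>>> p2 = pickle.dumps(t1, 1)     # protocol 1 (binary); the legacy flag True means the same
>>> t3 = pickle.loads(p2)</code>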
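As a rough illustration of the round trip above, the sketch below pickles the same tuple with the text-based protocol 0 and with the newest protocol the interpreter supports, then checks that both load back to an equal object. It assumes Python 3, where the plain pickle module already uses the C implementation; the variable names are illustrative only.

```python
import pickle

record = ('this is a string', 42, [1, 2, 3], None)

# Protocol 0 is the original printable format; HIGHEST_PROTOCOL is binary.
as_text = pickle.dumps(record, protocol=0)
as_binary = pickle.dumps(record, protocol=pickle.HIGHEST_PROTOCOL)

# loads() takes no protocol argument: the format is detected from the data.
from_text = pickle.loads(as_text)
from_binary = pickle.loads(as_binary)

print(from_text == record, from_binary == record)
print(from_text is record)  # False: a round trip builds a new object
```

The copies compare equal to the original but are distinct objects, which is what makes pickling usable for storage rather than aliasing.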

Using dump() and load() allows multiple objects to be stored sequentially in a single file.

<code>>>> a1 = 'apple'
>>> b1 = {1: 'One', 2: 'Two', 3: 'Three'}
>>> c1 = ['fee', 'fie', 'foe', 'fum']
>>> f1 = open('temp.pkl', 'wb')
>>> pickle.dump(a1, f1, 1)       # 1 selects binary protocol 1, like the legacy True flag
>>> pickle.dump(b1, f1, 1)
>>> pickle.dump(c1, f1, 1)
>>> f1.close()
>>> f2 = open('temp.pkl', 'rb')
>>> a2 = pickle.load(f2)         # objects come back in the order they were written
>>> b2 = pickle.load(f2)
>>> c2 = pickle.load(f2)
>>> f2.close()</code>
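When the number of objects in the file is not known in advance, one possible approach is to keep calling load() until it raises EOFError. The sketch below assumes Python 3; the helper name load_all and the temporary directory are illustrative choices, not part of the pickle API.

```python
import os
import pickle
import tempfile

def load_all(path):
    """Yield every object pickled into the file at `path`, in order."""
    with open(path, 'rb') as handle:
        while True:
            try:
                yield pickle.load(handle)
            except EOFError:
                # load() raises EOFError once no pickled data remains.
                return

items = ['apple', {1: 'One', 2: 'Two', 3: 'Three'}, ['fee', 'fie', 'foe', 'fum']]

# A throwaway file in a temporary directory stands in for temp.pkl.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'temp.pkl')
    with open(path, 'wb') as handle:
        for item in items:
            pickle.dump(item, handle)
    restored = list(load_all(path))

print(restored == items)
```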

Pickle Power

Pickle handles complex objects, reference cycles, and recursive structures, preserving object identity within a single pickled graph.

<code>>>> l = [1, 2, 3]
>>> l.append(l)
>>> p = pickle.dumps(l)
>>> l2 = pickle.loads(p)
>>> l2
[1, 2, 3, [...]]</code>
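The identity claims above can be checked directly. This is a minimal sketch, assuming Python 3, with illustrative variable names; it pickles one self-referencing list and one list that holds two references to the same object.

```python
import pickle

cyclic = [1, 2, 3]
cyclic.append(cyclic)          # the list now contains itself

shared = ['shared']
pair = [shared, shared]        # two references to one object

cyclic_copy = pickle.loads(pickle.dumps(cyclic))
pair_copy = pickle.loads(pickle.dumps(pair))

print(cyclic_copy[3] is cyclic_copy)   # the cycle points back at the copy
print(pair_copy[0] is pair_copy[1])    # still one shared object, not two
print(pair_copy[0] is shared)          # the copy is independent of the original
```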

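Pickling related objects in separate calls breaks the references they share, because each call starts with no record of what has already been written. Using a single Pickler for all the dump() calls, and a single Unpickler for the matching load() calls, keeps track of objects already written and preserves those references.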

<code>>>> a = [1, 2]
>>> b = [3, 4]
>>> a.append(b)                  # a[2] and b are the same list
>>> f = open('temp.pkl', 'wb')
>>> pickler = pickle.Pickler(f)
>>> pickler.dump(a)
>>> pickler.dump(b)
>>> f.close()
>>> f = open('temp.pkl', 'rb')
>>> unpickler = pickle.Unpickler(f)
>>> c = unpickler.load()
>>> d = unpickler.load()
>>> c[2] is d
True</code>
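To make the contrast concrete, the sketch below compares the two approaches side by side: separate dumps() calls versus one Pickler and one Unpickler sharing a stream. It assumes Python 3 and uses an in-memory io.BytesIO buffer in place of a file; the variable names are illustrative.

```python
import io
import pickle

b = [3, 4]
a = [1, 2, b]                  # a[2] and b are the same list

# Pickled separately, the link between the two objects is lost.
c = pickle.loads(pickle.dumps(a))
d = pickle.loads(pickle.dumps(b))
separate_shared = c[2] is d    # equal contents, distinct objects

# One Pickler remembers what it has written; one Unpickler mirrors that.
buffer = io.BytesIO()
pickler = pickle.Pickler(buffer)
pickler.dump(a)
pickler.dump(b)

buffer.seek(0)
unpickler = pickle.Unpickler(buffer)
e = unpickler.load()
f = unpickler.load()
tracked_shared = e[2] is f

print(separate_shared, tracked_shared)
```

The load() calls have to be made on the same Unpickler and in the same order as the dump() calls, since the later pickles refer back to objects in the earlier ones.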

Unpicklable Objects

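File objects and similar operating-system resources cannot be pickled directly; attempting to do so raises a TypeError. The exact wording of the message depends on the Python version.

<code>>>> f = open('temp.pkl', 'w')
>>> pickle.dumps(f)
Traceback (most recent call last):
  ...
TypeError: cannot pickle '_io.TextIOWrapper' object</code>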
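A program can catch the error and save a description of the resource instead of the resource itself. The sketch below assumes Python 3 and a throwaway temporary file; storing the file name and position is one possible choice of description, not the only one.

```python
import os
import pickle
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'temp.pkl')
    with open(path, 'w') as handle:
        try:
            pickle.dumps(handle)
            outcome = 'pickled'
        except TypeError:
            # The message wording differs between Python versions.
            outcome = 'TypeError'
        # What can be saved instead: plain data describing the resource.
        description = pickle.dumps((handle.name, handle.tell()))

name, position = pickle.loads(description)
print(outcome, position)
```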

Class Instances

When pickling class instances, only the instance data and the fully qualified class name are stored; the class code itself is not. On unpickling, Python imports the module that defines the class, so the class must still be importable under that name. Custom __getstate__() and __setstate__() methods control what gets serialized, which is useful for unpicklable attributes such as open files.

<code>class Foo(object):
    def __init__(self, value, filename):
        self.value = value
        self.logfile = open(filename, 'w')
    def __getstate__(self):
        # Store the file's name and position instead of the file object.
        f = self.logfile
        return (self.value, f.name, f.tell())
    def __setstate__(self, state):
        self.value, name, position = state
        # 'r+' reopens the log without truncating it ('w' would erase it).
        f = open(name, 'r+')
        f.seek(position)
        self.logfile = f</code>
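The following sketch exercises the same pattern end to end under Python 3. The class name Logger, the file name app.log, and the temporary directory are hypothetical stand-ins chosen for the demo; the class mirrors the state methods shown above rather than any library API.

```python
import os
import pickle
import tempfile

class Logger:
    """Hypothetical class holding a value and an open log file."""

    def __init__(self, value, filename):
        self.value = value
        self.logfile = open(filename, 'w')

    def __getstate__(self):
        # Save the file's name and position, not the file object.
        return (self.value, self.logfile.name, self.logfile.tell())

    def __setstate__(self, state):
        self.value, name, position = state
        # 'r+' reopens for writing without truncating what was logged.
        self.logfile = open(name, 'r+')
        self.logfile.seek(position)

with tempfile.TemporaryDirectory() as folder:
    original = Logger(42, os.path.join(folder, 'app.log'))
    original.logfile.write('started\n')
    original.logfile.flush()

    data = pickle.dumps(original)
    original.logfile.close()

    restored = pickle.loads(data)
    restored.logfile.write('resumed\n')
    restored.logfile.close()

    with open(restored.logfile.name) as handle:
        lines = handle.read().splitlines()

print(restored.value, lines)
print(b'Logger' in data)   # the class is stored by name, not as code
```

The restored object continues writing where the original stopped, and the pickle contains the class name but none of its methods.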

Schema Evolution

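When class definitions evolve (renamed classes, added or removed attributes, relocated modules), custom __setstate__() logic can migrate old pickles to the new structure and so preserve compatibility. The example below handles a class whose firstname and lastname attributes were merged into a single fullname attribute.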

<code>def __setstate__(self, state):
    if 'fullname' not in state:
        first = state.get('firstname', '')
        last = state.get('lastname', '')
        self.fullname = " ".join([first, last]).strip()
        state.pop('firstname', None)
        state.pop('lastname', None)
    self.__dict__.update(state)</code>
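A migration of this kind can be tried out without keeping an old copy of the class around. The sketch below assumes Python 3 and a hypothetical Person class; it simulates an old-layout instance by bypassing __init__ and setting the old attributes by hand, which is a shortcut for the demo rather than how a real legacy pickle would be produced.

```python
import pickle

class Person:
    """Current class: stores a single fullname attribute."""

    def __init__(self, fullname):
        self.fullname = fullname

    def __setstate__(self, state):
        # Older pickles carry firstname/lastname instead of fullname.
        if 'fullname' not in state:
            first = state.pop('firstname', '')
            last = state.pop('lastname', '')
            state['fullname'] = ' '.join([first, last]).strip()
        self.__dict__.update(state)

# Simulate an instance saved by the old class layout (an assumption made
# for this demo): skip __init__ and set the old attributes directly.
legacy = Person.__new__(Person)
legacy.__dict__.update(firstname='Ada', lastname='Lovelace')
old_pickle = pickle.dumps(legacy)

migrated = pickle.loads(old_pickle)
print(migrated.fullname)
print(hasattr(migrated, 'firstname'))
```

Once migrated, the object pickles in the new layout, so the conversion branch only runs for data written before the change.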

Conclusion

Object persistence in Python relies on the language’s serialization capabilities, with pickle providing a robust foundation for storing and retrieving Python objects across program executions.
