Garbage Collection Python

Introduction

Welcome back, everyone! In today’s post, we look at how garbage collection works in Python. Garbage collection is the part of the runtime that reclaims memory from objects a program no longer uses, and it affects how efficiently code runs. We will cover its basic principles, how Python implements it, and how tuning it can improve performance.

Garbage Collection Python

The core of garbage collection in Python lies in managing memory usage by discarding objects that are no longer needed. The primary mechanism in CPython is reference counting: Python associates a reference count with each object, indicating how many references point to it. When this count drops to zero, the object is deallocated immediately and its memory is freed. Reference counting alone cannot reclaim objects that refer to each other in a cycle, so CPython also runs a generational cyclic garbage collector, exposed through the gc module, to find and free such groups.
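As a minimal sketch of the reference counting described above, the snippet below watches an object through a weak reference, which does not keep it alive. The class name Payload and the variable names are hypothetical, and the immediate deallocation shown is CPython behavior; other Python implementations may free the object later.

```python
import weakref


class Payload:
    """Hypothetical placeholder; a plain object() cannot be weakly referenced."""


obj = Payload()
probe = weakref.ref(obj)   # weak reference: does not raise the reference count
alias = obj                # a second strong reference to the same object

del obj                    # one strong reference (alias) remains
alive_with_alias = probe() is not None

del alias                  # last strong reference gone: count reaches zero
alive_after_last_del = probe() is not None

print(alive_with_alias, alive_after_last_del)
```

Calling a weak reference returns the object while it exists and None once it has been freed, which makes it a convenient probe that does not disturb the count being observed.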

Practical Implementation

Let’s experiment with these concepts. The sys and gc (garbage collection) modules let you observe and adjust garbage collection. For instance, sys.getrefcount() returns the reference count of an object, and gc.get_referrers() lists the objects that refer to a given object. Its counterpart, gc.get_referents(), goes the other way and lists the objects that a given object refers to.
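The two similarly named gc functions are easy to mix up, so here is a small sketch of the difference. The names payload and holder are made up for illustration.

```python
import gc

payload = ["Hello World"]      # a list, which the collector tracks
holder = {"data": payload}     # a dict that refers to the list

# get_referrers: which tracked objects point AT payload?
referrers = gc.get_referrers(payload)
holder_is_referrer = any(r is holder for r in referrers)

# get_referents: which objects does holder point TO?
referents = gc.get_referents(holder)
payload_is_referent = any(r is payload for r in referents)

print(holder_is_referrer, payload_is_referent)
```

Note that gc.get_referrers() only reports containers the collector tracks, so it is a debugging aid rather than a complete picture of every reference.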

Running in Terminal

Python 3.11.5 (tags/v3.11.5:cce6ba9, Aug 24 2023, 14:38:34) [MSC v.1936 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> import gc
>>> 
>>> a = 'Hello World'
>>> sys.getrefcount(a)      
2
>>> my_list = []
>>> my_list.append(a) 
>>> sys.getrefcount(my_list)
2
>>> sys.getrefcount(a)       
3
>>> gc.get_referrers(a)
[['Hello World'], {'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <class '_frozen_importlib.BuiltinImporter'>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>, 'sys': <module 'sys' (built-in)>, 'gc': <module 'gc' (built-in)>, 'a': 'Hello World', 'my_list': ['Hello World']}]
>>> gc.get_threshold()  
(700, 10, 10)
>>> gc.set_threshold(1000, 20, 30)
>>> gc.get_threshold()
(1000, 20, 30)
>>> gc.get_count()
(629, 5, 1)
>>> gc.set_debug(True)

Code Description

The interactive session above runs on Python 3.11.5 on 64-bit Windows (the banner reports a 64-bit AMD64 build; win32 is simply Python’s platform name for Windows) and focuses on memory management and garbage collection. It begins by importing two modules: sys for interpreter-level functions and gc for the garbage collector interface.

The session starts by creating a string a with the value ‘Hello World’ and examining its reference count using sys.getrefcount(a). Initially, the count is 2: one reference from the name a, and one temporary reference held by the argument passed to sys.getrefcount() itself. When a is appended to a newly created list my_list, its reference count increases to 3, as my_list now holds a reference to the same string. The exact numbers can differ between Python versions, so the change in the count is more informative than its absolute value.
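Because absolute counts vary between interpreter versions, a hedged way to reproduce this observation is to compare counts before and after an operation. The sketch below uses a fresh object() and hypothetical variable names rather than the string from the session.

```python
import sys

item = object()
before = sys.getrefcount(item)        # includes the temporary argument reference

container = []
container.append(item)                # the list now holds one more reference
after_append = sys.getrefcount(item)

container.clear()                     # the list drops its reference again
after_clear = sys.getrefcount(item)

print(after_append - before, after_clear - before)
```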

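To understand memory retention, the session calls gc.get_referrers(a) to list the objects referencing a. The result contains two entries: the list my_list, and the module’s global namespace dictionary, which holds the name a. This shows which objects are keeping a given object alive.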

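Attention then shifts to garbage collection thresholds. gc.get_threshold() returns the current values, (700, 10, 10), one per generation. In CPython 3.11, the version used here, the first value is the amount by which allocations of tracked objects must exceed deallocations before a generation 0 collection runs; the second and third control how many collections of the younger generation must happen before generation 1 and generation 2 are examined as well. The session then changes them with gc.set_threshold(1000, 20, 30), which makes collections less frequent, and gc.get_count() shows the current counters, here (629, 5, 1), that are compared against those thresholds.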
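When experimenting with thresholds, it helps to save the original values and restore them afterwards. The following is a small sketch of that pattern; the variable names are illustrative, and the tuple returned after the change may differ in its later elements on newer Python versions that handle generations differently.

```python
import gc

original = gc.get_threshold()      # remember the interpreter's current settings

gc.set_threshold(1000, 20, 30)     # collect generation 0 less often
changed = gc.get_threshold()
counts = gc.get_count()            # current counters, one per generation

gc.set_threshold(*original)        # put the original settings back
restored = gc.get_threshold()

print(original, changed, counts, restored)
```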

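Finally, the session calls gc.set_debug(True). Because True equals 1 and gc.DEBUG_STATS is the flag with the value 1, this is equivalent to gc.set_debug(gc.DEBUG_STATS): the collector prints statistics to stderr each time it runs. Other debugging flags, such as gc.DEBUG_COLLECTABLE or gc.DEBUG_LEAK, have to be requested by name.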
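Spelling out the flag by name makes the intent clearer than passing True. A brief sketch, with illustrative variable names, that also restores the previous debug setting:

```python
import gc

previous = gc.get_debug()             # remember the current debug flags

gc.set_debug(gc.DEBUG_STATS)          # print statistics on each collection
stats_enabled = bool(gc.get_debug() & gc.DEBUG_STATS)

# True is the integer 1, which is why gc.set_debug(True) sets this same flag.
true_matches_stats_flag = int(True) == gc.DEBUG_STATS

gc.set_debug(previous)                # switch back to the earlier setting

print(stats_enabled, true_matches_stats_flag)
```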

Thresholds are one lever for performance. Raising generation 0’s threshold makes collections less frequent, which can speed up code that allocates many long-lived objects. The trade-off is that cyclic garbage lingers longer before it is reclaimed, so memory usage can grow. Measure before and after a change rather than assuming a gain.
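One way to check whether a threshold change helps is to time the same allocation workload under different settings. The sketch below is only a rough benchmark under assumed values: the Item class, the time_allocations helper, the object count, and the two thresholds are all hypothetical choices, and results will vary by machine and Python version.

```python
import gc
import time


class Item:
    def __init__(self, value):
        self.value = value


def time_allocations(threshold0, count=200_000):
    """Time creating `count` long-lived objects with a given generation 0 threshold."""
    original = gc.get_threshold()
    gc.collect()                                   # start from a clean state
    gc.set_threshold(threshold0, original[1], original[2])
    try:
        start = time.perf_counter()
        kept = [Item(i) for i in range(count)]     # objects stay alive in the list
        elapsed = time.perf_counter() - start
    finally:
        gc.set_threshold(*original)                # always restore the settings
    return elapsed, len(kept)


default_time, default_count = time_allocations(700)
relaxed_time, relaxed_count = time_allocations(100_000)

print(f"threshold0=700:    {default_time:.4f}s")
print(f"threshold0=100000: {relaxed_time:.4f}s")
```

Running each configuration several times and comparing the fastest runs gives a steadier picture than a single measurement.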

Example Code

import gc

class MyClass:
    def __init__(self, name):
        self.name = name

    def __del__(self):
        print(f"Deleting {self.name}")

# Create some instances of MyClass
obj1 = MyClass("Object 1")
obj2 = MyClass("Object 2")

# Remove references to the objects
obj1 = None
obj2 = None

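# Force a full collection. In CPython the two objects above were already freed
# when their reference counts reached zero, so this call has nothing left to do.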
gc.collect()

In this Python script, the gc module is imported to interact with the garbage collector. A class named MyClass is defined with an initializer __init__ that assigns a name to each instance, and a finalizer __del__ that prints a message when an instance is destroyed. Two instances of MyClass are created, named “Object 1” and “Object 2”. The references to these objects are then removed by setting obj1 and obj2 to None. In CPython this drops each object’s reference count to zero, so the object is destroyed at once and __del__ prints “Deleting Object 1” and then “Deleting Object 2” before gc.collect() is reached. The explicit gc.collect() call runs a full collection, but it has nothing left to free here; it matters only when objects are kept alive by reference cycles.
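To see a case where gc.collect() does the work, the objects need to form a reference cycle. The following sketch uses a hypothetical Node class whose two instances point at each other, and it records finalizer calls in a list instead of printing. Automatic collection is paused during the demonstration so the timing is predictable.

```python
import gc

deleted = []                      # names recorded by __del__


class Node:
    def __init__(self, name):
        self.name = name
        self.partner = None

    def __del__(self):
        deleted.append(self.name)


was_enabled = gc.isenabled()
gc.disable()                      # keep automatic collection out of the way

left = Node("left")
right = Node("right")
left.partner = right              # left -> right
right.partner = left              # right -> left: a reference cycle

left = None
right = None                      # unreachable now, but counts are still nonzero
deleted_before = list(deleted)

gc.collect()                      # the cyclic collector finds and frees the pair
deleted_after = sorted(deleted)

if was_enabled:
    gc.enable()

print(deleted_before, deleted_after)
```

Dropping the names is not enough here because each object still holds a reference to the other; only the cyclic collector can tell that the pair as a whole is unreachable.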

Example Code 02

import gc
import sys
import time

# Same flag value as gc.set_debug(True); prints collection statistics to stderr.
gc.set_debug(gc.DEBUG_STATS)

class MyLink:
    
    def __init__(self, next_link, value):
        self.next_link = next_link
        self.value = value
        
    def __repr__(self):
        return self.value
    
l = MyLink(next_link=None, value='Main Link')

my_list = []

start = time.perf_counter()
for i in range(5000000):
    l_temp = MyLink(l, value='L')
    my_list.append(l_temp)
end = time.perf_counter()

print(end-start)

In this Python script, the gc (garbage collection), sys, and time modules are imported, although sys is not actually used. Debugging output is enabled with the gc.DEBUG_STATS flag (gc.set_debug(True) sets the same flag, since True equals 1), so the collector prints statistics to stderr every time it runs. A class MyLink is defined with an initializer that stores a reference to another link plus a value, and a __repr__ that returns the value. The script creates one MyLink object with the value ‘Main Link’, then loops five million times, each time creating a new MyLink that points to the main link and appending it to the list my_list. Because every object stays reachable through my_list, collections are triggered again and again as allocations pass the generation 0 threshold, yet they never find anything to free. The elapsed time measured with time.perf_counter() therefore includes that collection overhead, which makes the script a convenient benchmark: rerun it after changing gc.set_threshold() and compare the timings.
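Instead of reading the debug output, collections can also be counted programmatically through gc.callbacks, a list of functions the collector calls with a phase (“start” or “stop”) and an info dictionary. The sketch below is a scaled-down variant of the script above; the Link class, the stats dictionary, and the object count of 100,000 are assumptions chosen to keep the run short.

```python
import gc


class Link:
    def __init__(self, next_link, value):
        self.next_link = next_link
        self.value = value


stats = {"started": 0}            # mutable counter updated by the callback


def count_collections(phase, info):
    # The collector calls this with phase "start" before and "stop" after a run.
    if phase == "start":
        stats["started"] += 1


was_enabled = gc.isenabled()
gc.enable()                       # make sure automatic collection is on
gc.callbacks.append(count_collections)
try:
    main = Link(None, "Main Link")
    kept = [Link(main, "L") for _ in range(100_000)]   # all remain reachable
finally:
    gc.callbacks.remove(count_collections)
    if not was_enabled:
        gc.disable()

print(f"collections started while allocating: {stats['started']}")
```

Comparing this count under different thresholds shows directly how often the collector ran for the same workload.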

Conclusion

Understanding garbage collection in Python can significantly benefit the optimization of code performance. By effectively managing memory and tweaking garbage collection behavior, you can enhance the efficiency of Python applications. The key involves balancing memory usage with performance needs.

Did you enjoy this deep dive into Python’s garbage collection? If so, hit the like button and leave a comment with your thoughts.


Discover more from teguhteja.id

Subscribe to get the latest posts sent to your email.
