Introduction
Welcome back, everyone! Today's post digs into garbage collection in Python, the part of the runtime that reclaims memory your program no longer needs and that can affect how efficiently your code runs. We'll cover its basic principles, how Python implements it, and how tuning it can influence performance.
Garbage Collection in Python
Garbage collection in Python is about managing memory by discarding objects that are no longer needed. In CPython, the primary mechanism is reference counting: every object carries a count of how many references point to it, and when that count drops to zero the object is deallocated immediately. Reference counting alone cannot reclaim objects that refer to each other in a cycle, so CPython also runs a generational cyclic garbage collector, exposed through the gc module, that periodically finds and frees such unreachable groups.
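To make the reference-counting half of this concrete, here is a minimal sketch. It assumes CPython (other implementations may free objects later), and the Resource class is a hypothetical placeholder, not something from the gc or sys modules. A weak reference is used as a probe because it does not keep the object alive.

```python
import weakref


class Resource:
    """Hypothetical placeholder class; any ordinary object would do."""


first_name = Resource()
second_name = first_name             # two strong references to one object
probe = weakref.ref(first_name)      # a weak reference does not add to the count

del first_name
# One strong reference remains, so the object is still alive.
alive_with_one_reference = probe() is not None

del second_name
# The last strong reference is gone; CPython frees the object right away.
freed_at_zero = probe() is None

print(alive_with_one_reference, freed_at_zero)
```

On CPython this prints True True: the object survives while one name still refers to it and disappears the moment the last one is removed, with no call to the gc module involved.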
Practical Implementation
Let's experiment with these concepts. The sys and gc (garbage collection) modules let you observe and adjust garbage collection. For instance, sys.getrefcount() reports an object's reference count, gc.get_referrers() lists the objects that refer to a given object, and gc.get_referents() lists the objects that a given object refers to.
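The two similarly named gc functions are easy to mix up, so here is a small sketch of the difference. The names payload and holder are made up for this illustration.

```python
import gc

payload = ["x", "y"]           # a list, which the collector tracks
holder = {"data": payload}     # the dict holds a reference to the list

# get_referents: which objects does `holder` point at? (outgoing references)
referents = gc.get_referents(holder)

# get_referrers: which objects point at `payload`? (incoming references)
referrers = gc.get_referrers(payload)

print(any(obj is payload for obj in referents))
print(any(obj is holder for obj in referrers))
```

Both lines print True. When you are hunting for whatever keeps an object alive, gc.get_referrers() is usually the one you want.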
Running in Terminal
Python 3.11.5 (tags/v3.11.5:cce6ba9, Aug 24 2023, 14:38:34) [MSC v.1936 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> import gc
>>>
>>> a = 'Hello World'
>>> sys.getrefcount(a)
2
>>> my_list = []
>>> my_list.append(a)
>>> sys.getrefcount(my_list)
2
>>> sys.getrefcount(a)
3
>>> gc.get_referrers(a)
[['Hello World'], {'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <class '_frozen_importlib.BuiltinImporter'>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>, 'sys': <module 'sys' (built-in)>, 'gc': <module 'gc' (built-in)>, 'a': 'Hello World', 'my_list': ['Hello World']}]
>>> gc.get_threshold()
(700, 10, 10)
>>> gc.set_threshold(1000, 20, 30)
>>> gc.get_threshold()
(1000, 20, 30)
>>> gc.get_count()
(629, 5, 1)
>>> gc.set_debug(True)
Code Description
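The session above runs on Python 3.11.5, a 64-bit (AMD64) build on Windows. The win32 in the banner is Python's platform name for Windows, not a sign of a 32-bit build. The session focuses on memory management and garbage collection, and it begins by importing two modules: sys, which gives access to interpreter internals such as reference counts, and gc, the interface to the garbage collector.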
The session first creates a string a with the value 'Hello World' and checks its reference count with sys.getrefcount(a). The result is 2: one reference is the name a, and the other is the temporary reference created by passing a as an argument to getrefcount() itself. After a is appended to a newly created list my_list, the count rises to 3, because my_list now holds a reference to the same string.
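Here is a small sketch of the same bookkeeping that compares counts relative to a baseline instead of relying on absolute numbers. It uses a fresh object() rather than a string literal, on the assumption that absolute counts for literals and the size of the temporary-argument offset can differ between CPython versions.

```python
import sys

item = object()                       # a fresh object that nothing else refers to
baseline = sys.getrefcount(item)      # includes any temporary reference from the call

container = [item]                    # the list now holds one more reference
after_append = sys.getrefcount(item)

container.clear()                     # the list drops its reference again
after_clear = sys.getrefcount(item)

print(after_append - baseline, after_clear - baseline)
```

This prints 1 0: storing the object in the list adds exactly one reference, and clearing the list removes it again.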
To see what keeps an object in memory, the session calls gc.get_referrers(a), which lists the objects that reference a. Here that is the list my_list and the module's global namespace dictionary, which maps the name a to the string.
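Attention then shifts to garbage collection thresholds. gc.get_threshold() returns the current values (700, 10, 10), one for each of the collector's three generations. The first value is the number of allocations minus deallocations of tracked objects that triggers a collection of generation 0. The second is the number of generation 0 collections that must happen before generation 1 is collected as well, and the third plays the same role for generation 2. Calling gc.set_threshold(1000, 20, 30) raises all three, so each generation is collected less often.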
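Because gc.set_threshold() changes a process-wide setting, it can be handy to apply new values only for a limited stretch of code. The sketch below does that with a context manager. The name temporary_gc_threshold is a hypothetical helper written for this post, not part of the gc module.

```python
import gc
from contextlib import contextmanager


@contextmanager
def temporary_gc_threshold(*thresholds):
    """Hypothetical helper: apply thresholds, then restore the old ones."""
    original = gc.get_threshold()
    gc.set_threshold(*thresholds)
    try:
        yield
    finally:
        # Restore the previous settings even if the block raises.
        gc.set_threshold(*original)


before = gc.get_threshold()
with temporary_gc_threshold(1000, 20, 30):
    inside = gc.get_threshold()
after = gc.get_threshold()

print(before, inside, after)
```

On the Python 3.11 interpreter used in this post, inside should be (1000, 20, 30). Newer CPython releases may treat the third value differently, so the sketch avoids depending on it.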
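Finally, the session enables garbage collection debugging with gc.set_debug(True). This works because True equals 1, which is the value of the gc.DEBUG_STATS flag; writing gc.set_debug(gc.DEBUG_STATS) says the same thing more clearly. With this flag set, the collector writes statistics to stderr each time it runs, which helps you see when and how often collections happen.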
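A minimal sketch of the flag-based form follows. It saves the current debug flags, turns on statistics, forces one collection so there is something to report, and then restores the previous setting.

```python
import gc

previous_flags = gc.get_debug()      # remember the current debug setting

gc.set_debug(gc.DEBUG_STATS)         # same effect as gc.set_debug(True)
flags_while_on = gc.get_debug()

gc.collect()                         # statistics for this run go to stderr

gc.set_debug(previous_flags)         # switch the extra output off again

print(flags_while_on == gc.DEBUG_STATS, gc.DEBUG_STATS == int(True))
```

Flags can be combined with the | operator, for example gc.DEBUG_STATS | gc.DEBUG_COLLECTABLE, if you also want the collectable objects that were found to be listed.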
These thresholds matter because they can influence performance. For instance, raising generation 0's threshold makes collections less frequent, which can speed up code that allocates many objects. The trade-off is that cyclic garbage may sit in memory longer before it is reclaimed, so the values need to be balanced against memory usage.
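One way to see the effect on collection frequency without relying on timing is to count collector runs through gc.callbacks. The sketch below is an illustration under assumptions: count_collections is a hypothetical helper, and the object count and threshold values are arbitrary choices for the demonstration.

```python
import gc


def count_collections(threshold0, n_objects=100_000):
    """Hypothetical helper: count collector runs while allocating lists."""
    runs = 0

    def on_gc(phase, info):
        # The collector calls this with phase "start" and then "stop".
        nonlocal runs
        if phase == "start":
            runs += 1

    original = gc.get_threshold()
    was_enabled = gc.isenabled()
    gc.collect()                     # start from a clean state
    gc.enable()
    gc.set_threshold(threshold0, *original[1:])
    gc.callbacks.append(on_gc)
    try:
        # Keep every list alive so the allocation count keeps growing.
        keep = [[] for _ in range(n_objects)]
    finally:
        gc.callbacks.remove(on_gc)
        gc.set_threshold(*original)
        if not was_enabled:
            gc.disable()
    return runs


low_threshold_runs = count_collections(700)
high_threshold_runs = count_collections(70_000)
print(low_threshold_runs, high_threshold_runs)
```

The exact numbers depend on the interpreter version, but the run with the lower threshold should report many more collections than the run with the higher one.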
Example Code
import gc


class MyClass:
    def __init__(self, name):
        self.name = name

    def __del__(self):
        print(f"Deleting {self.name}")


# Create some instances of MyClass
obj1 = MyClass("Object 1")
obj2 = MyClass("Object 2")

# Remove references to the objects
obj1 = None
obj2 = None

# Force garbage collection
gc.collect()
This script imports the gc module and defines a class named MyClass with an initializer __init__ that assigns a name to each instance and a finalizer __del__ that prints a message when an instance is destroyed. Two instances are created, named "Object 1" and "Object 2". The references to them are then removed by setting obj1 and obj2 to None. In CPython, that is the moment the objects are destroyed: their reference counts drop to zero, so __del__ runs and the deletion messages are printed right away. The explicit gc.collect() call at the end runs the cyclic collector, but in this example it has nothing left to free. It would only be needed if the objects were part of a reference cycle.
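To see a case where gc.collect() actually does the work, the objects need to refer to each other. The sketch below assumes CPython 3.4 or later, where objects with __del__ inside cycles can be collected. The Tracked class and the deleted list are made up for this illustration, and automatic collection is paused so that only the explicit call can free the cycle.

```python
import gc

deleted = []  # names are recorded here as each finalizer runs


class Tracked:
    """Hypothetical class that records its own destruction."""

    def __init__(self, name):
        self.name = name
        self.other = None

    def __del__(self):
        deleted.append(self.name)


gc.disable()  # make sure only the explicit gc.collect() below can run

solo = Tracked("solo")
solo = None                        # no cycle: freed by reference counting
after_rebind = list(deleted)

a = Tracked("a")
b = Tracked("b")
a.other = b                        # a and b now refer to each other
b.other = a
a = None
b = None
before_collect = list(deleted)     # the cycle keeps both objects alive

gc.collect()                       # the cyclic collector frees the pair
after_collect = sorted(deleted)

gc.enable()
print(after_rebind, before_collect, after_collect)
```

On CPython this prints ['solo'] ['solo'] ['a', 'b', 'solo']. The result is sorted because the order in which the two objects in the cycle are finalized is not something to rely on.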
Example Code 02
import gc
import sys
import time

gc.set_debug(True)


class MyLink:
    def __init__(self, next_link, value):
        self.next_link = next_link
        self.value = value

    def __repr__(self):
        return self.value


l = MyLink(next_link=None, value='Main Link')
my_list = []

start = time.perf_counter()
for i in range(5000000):
    l_temp = MyLink(l, value='L')
    my_list.append(l_temp)
end = time.perf_counter()

print(end-start)
This script imports the gc (garbage collection), sys, and time modules to measure how long it takes to create a large number of objects while the collector is active. It turns on garbage collection debugging with gc.set_debug(True), so statistics are written to stderr every time the collector runs. A class MyLink is defined with an initializer that stores a link to another object and a value, plus a custom string representation. The script creates an initial MyLink object with the value 'Main Link', then runs a loop five million times. Each iteration creates a new MyLink object that points to 'Main Link' and appends it to the list my_list, so every object stays alive. The collector is still triggered again and again as the number of tracked objects grows, even though it finds nothing to free. The elapsed time, measured with time.perf_counter() and printed at the end, therefore includes the cost of those collections and of the debug output.
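To get a feel for how much of that time the collector accounts for, you can time the same kind of loop with automatic collection switched on and off. This is a rough sketch, not a rigorous benchmark: time_build is a hypothetical helper, the object count is reduced to keep the run short, and results will vary by machine and Python version.

```python
import gc
import time


class MyLink:
    def __init__(self, next_link, value):
        self.next_link = next_link
        self.value = value


def time_build(n, collector_enabled):
    """Hypothetical helper: time creating n links with the collector on or off."""
    was_enabled = gc.isenabled()
    gc.collect()                      # start each run from a clean state
    if collector_enabled:
        gc.enable()
    else:
        gc.disable()
    try:
        head = MyLink(None, "Main Link")
        start = time.perf_counter()
        links = [MyLink(head, "L") for _ in range(n)]
        elapsed = time.perf_counter() - start
    finally:
        # Put the collector back the way we found it.
        if was_enabled:
            gc.enable()
        else:
            gc.disable()
    return elapsed, len(links)


with_gc, count_with = time_build(200_000, collector_enabled=True)
without_gc, count_without = time_build(200_000, collector_enabled=False)
print(f"collector on:  {with_gc:.4f}s for {count_with} objects")
print(f"collector off: {without_gc:.4f}s for {count_without} objects")
```

Disabling the collector is only reasonable when you know the code does not create reference cycles, or when you call gc.collect() yourself at a convenient point.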
Conclusion
Understanding garbage collection in Python can significantly benefit the optimization of code performance. By effectively managing memory and tweaking garbage collection behavior, you can enhance the efficiency of Python applications. The key involves balancing memory usage with performance needs.
Did you enjoy this deep dive into Python’s garbage collection? If so, hit the like button and leave a comment with your thoughts.