Let me introduce: __slots__

The featherweight version of a Python class
03 September 2017 6465 words

Hey there!

Today I'd like to talk about __slots__.

The inspiration for this article comes from a blog post about Python data structures published by Dan Bader, and from the short exchange we then had on this gist to compare their performance.

For all the examples you are going to see I am using Python 3.6.2.

So what are slots? __slots__ is a different way to define how a Python class stores its instance attributes.

If this is not clear, bear with me.

In [1]:
# use getsizeof to get the size of our objects
from sys import getsizeof
from sys import version as python_version
print(python_version)
3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 13:14:59) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]
In [2]:
class PythonClass():
    """This is a simple Python class"""
    
    def __init__(self, message):
        """Init method, nothing special here"""
        self.message = message
        self.capital_message = self.make_it_bigger()
    
    def make_it_bigger(self):
        """Do something with your attributes"""
        return self.message.upper()
    
    def scream_message(self):
        """Print the capital_message attribute of the instance"""
        print(self.capital_message) 
        
my_instance = PythonClass("my message")

So we have a class, PythonClass, and one instance of this class, my_instance.

Where are message and capital_message stored?

Python uses a special attribute called __dict__ to store the instance's attributes:

In [3]:
[element for element in dir(my_instance) if element == '__dict__']
Out[3]:
['__dict__']
In [4]:
my_instance.__dict__
Out[4]:
{'capital_message': 'MY MESSAGE', 'message': 'my message'}
In [5]:
my_instance.new_message = "This is a new message"
In [6]:
my_instance.__dict__
Out[6]:
{'capital_message': 'MY MESSAGE',
 'message': 'my message',
 'new_message': 'This is a new message'}
In [7]:
my_instance.__dict__['another_new_message'] = "Yet a new message"
In [8]:
my_instance.__dict__
Out[8]:
{'another_new_message': 'Yet a new message',
 'capital_message': 'MY MESSAGE',
 'message': 'my message',
 'new_message': 'This is a new message'}

As you can see, I can add new attributes to my_instance using either the my_instance.name_of_the_attribute notation, or the my_instance.__dict__['name_of_the_attribute'] notation.
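
The two notations above can be sketched on a small, hypothetical Note class (it is not part of the original notebook). The built-in setattr() is the function form of the dot notation, and vars() returns the very same __dict__ object:

```python
class Note:
    """Hypothetical class used only for this illustration."""

    def __init__(self, text):
        self.text = text


note = Note("hello")

# Three ways to add an attribute to a regular instance
note.author = "me"                  # dot notation
note.__dict__["topic"] = "slots"    # writing to __dict__ directly
setattr(note, "year", 2017)         # built-in function form of the dot notation

# vars() returns the same dictionary object, not a copy
same_object = vars(note) is note.__dict__
print(same_object)
print(sorted(note.__dict__))
```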

We can therefore say that for normal Python classes, a dict is used to store the instance's attributes.

Is this bad or good?

Well, it is neither bad nor good. Dicts are awesome but not perfect, because there is always a trade-off.

With a dict you get a consistent lookup time: access is O(1) on average, so it doesn't depend on the size of the dictionary. But because a dict is a mutable object that can grow, it is also a lot heavier, since it has to allocate extra space for that growth.

Let's look at __slots__ now.
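
To get a feeling for the "it can grow" part, here is a minimal sketch that records the size of a plain dict while keys are added. The variable names are mine, and the exact byte counts depend on the Python version and platform, so treat the printed numbers as indicative only:

```python
from sys import getsizeof

# Record the size of a plain dict each time a key is added
growing = {}
sizes = [getsizeof(growing)]
for i in range(20):
    growing[f"key_{i}"] = i
    sizes.append(getsizeof(growing))

# The size increases in steps: the dict over-allocates room for
# future keys and only resizes when that room is used up
distinct_sizes = sorted(set(sizes))
print(distinct_sizes)
```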

In [9]:
class PythonClassWithSlots():
    """This is a simple Python class"""
    
    __slots__ = ["message", "capital_message"]
    
    def __init__(self, message):
        """Init method, nothing special here"""
        self.message = message
        self.capital_message = self.make_it_bigger() 
        
    def make_it_bigger(self):
        """Do something with your attributes"""
        return self.message.upper()
    
    def scream_message(self):
        """Print the capital_message attribute of the instance"""
        print(self.capital_message)
        
my_instance = PythonClassWithSlots("my message")
In [10]:
[element for element in dir(my_instance) if element == '__dict__']
Out[10]:
[]

So we don't have an attribute called __dict__ inside our instance.

But we have a new attribute called __slots__.

In [11]:
[element for element in dir(my_instance) if element == '__slots__']
Out[11]:
['__slots__']
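
So where do the values live if there is no __dict__? A minimal sketch, using a hypothetical Point class and assuming CPython: every name listed in __slots__ becomes a descriptor on the class, and that descriptor reads and writes a fixed position inside the instance.

```python
class Point:
    """Hypothetical class used only for this illustration."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

# Each name in __slots__ becomes a descriptor stored on the class
# (the type name below is a CPython implementation detail)
descriptor_type = type(Point.x).__name__
print(descriptor_type)

# The instance has no per-instance dictionary
has_dict = hasattr(p, "__dict__")
print(has_dict)

# __slots__ itself lives on the class and is reachable from the instance
print(p.__slots__)
```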

Can we access our attributes as we do with normal classes?

Indeed, we can.

In [12]:
my_instance.message
Out[12]:
'my message'
In [13]:
my_instance.capital_message
Out[13]:
'MY MESSAGE'

Can we add new attributes?

In [14]:
my_instance.new_message = "This is a new message"
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-14-4c493294e963> in <module>()
----> 1 my_instance.new_message = "This is a new message"

AttributeError: 'PythonClassWithSlots' object has no attribute 'new_message'

No, we can't.

But we can use the attributes that we defined during the class declaration inside __slots__.

In [15]:
my_instance.message = "Just putting something new here"
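
One practical side effect of this restriction is that a typo in an attribute name fails loudly instead of silently creating a new attribute. A minimal sketch with a hypothetical Config class (the names are mine, not from the notebook):

```python
class Config:
    """Hypothetical class used only for this illustration."""

    __slots__ = ("host", "port")

    def __init__(self, host, port):
        self.host = host
        self.port = port


cfg = Config("localhost", 8080)
cfg.port = 9090  # fine: "port" is declared in __slots__

try:
    cfg.prot = 9090  # typo: "prot" is not declared in __slots__
except AttributeError as error:
    caught = True
    print(f"Rejected: {error}")
else:
    caught = False
```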

But why would you need to use slots when you have a dict?

Well, the answer is that classes using __slots__ are a lot lighter and slightly faster.

Are slots-based classes lighter than normal classes?

The answer should be yes, but getting the size of an object is not that easy.

In [16]:
my_instance_without_slots = PythonClass("my message")
my_instance_with_slots = PythonClassWithSlots("my message")
In [17]:
getsizeof(my_instance_without_slots)
Out[17]:
56
In [18]:
getsizeof(my_instance_with_slots)
Out[18]:
56

Hmm... but normal classes should be heavier, shouldn't they?

getsizeof returns the size in bytes of the object itself, but not of the other objects it references. So in our case we also need to look at the size of __dict__:

In [19]:
getsizeof(my_instance_without_slots.__dict__), getsizeof(my_instance_without_slots)
Out[19]:
(112, 56)

Now it makes a lot more sense.

In [20]:
my_instance_without_slots.new_attribute_1 = "This is a new attribute"
getsizeof(my_instance_without_slots.__dict__), getsizeof(my_instance_without_slots)
Out[20]:
(240, 56)

As you can see, the size of __dict__ changes when we add new elements.

In [21]:
len(my_instance_without_slots.__dict__)
Out[21]:
3
In [22]:
getsizeof({k:v for k,v in enumerate(range(3))})
Out[22]:
240

A normal dict, with the same number of elements, will be the same size.

What if we add 10 new elements?

In [23]:
for i in range(10): my_instance_without_slots.__dict__[i] = str(i) 
getsizeof(my_instance_without_slots.__dict__), getsizeof(my_instance_without_slots)
Out[23]:
(648, 56)

Let's go further with our analysis of __slots__, and compare them with a normal class in a little experiment.

In this example we load a JSON object (think of an API call) using both a normal class and a class with __slots__.

In [24]:
import json

my_json = '''{
    "username": "use@python3.org",
    "country": "Poland",
    "website": "www.chrisbarra.xzy",
    "date": "2017/08/15",
    "uid": 1,
    "gender": "Male"
}'''
In [25]:
class MyUserWithSlots():
    """A kind of user object with slots"""
    
    __slots__ = ('username', 'country', 'website', 'date')
    
    def __init__(self, username, country, website, date, **kwargs):
        # **kwargs absorbs the JSON keys we don't store (uid, gender)
        self.username = username
        self.country = country
        self.website = website
        self.date = date

class MyUserWithoutSlots():
    """A kind of user object without slots"""
    
    def __init__(self, username, country, website, date, **kwargs):
        # **kwargs absorbs the JSON keys we don't store (uid, gender)
        self.username = username
        self.country = country
        self.website = website
        self.date = date
        
def get_size(instance):
    """
    If instance has __dict__ 
    we add the size of __dict__ 
    to the size of instance.
    
    In this way we correctly consider both
    size of the instance and of __dict__
    """
    size_dict = 0
    
    try:
        size_dict = getsizeof(instance.__dict__)
    except AttributeError:
        pass
    
    return size_dict + getsizeof(instance)
In [26]:
# number of instances to create (1,000,000)
NUM_INSTANCES = 1000000
In [27]:
# create a list with the size of each instance with slots
with_slots = [get_size(MyUserWithSlots(**json.loads(my_json))) for _ in range(NUM_INSTANCES)]

# sum the values in the list and convert bytes to MB
size_with_slots = sum(with_slots)/1000000

print(f"The total size is {size_with_slots} MB")
The total size is 72.0 MB
In [28]:
# create a list with the size of each instance without slots
without_slots = [get_size(MyUserWithoutSlots(**json.loads(my_json))) for _ in range(NUM_INSTANCES)]

# sum the values in the list and convert bytes to MB
size_without_slots = sum(without_slots)/1000000

print(f"The total size is {size_without_slots} MB")
The total size is 168.0 MB
In [29]:
size_reduction = ( size_with_slots - size_without_slots ) / size_without_slots * 100
print(f"Memory footprint reduction: {size_reduction:.2f}% ")
Memory footprint reduction: -57.14% 

Wow!

~57% less memory usage thanks to just one line of code.
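
The figures above come from getsizeof, which is an estimate built from individual object sizes. As a cross-check, here is a minimal sketch that uses the standard library tracemalloc module to measure what the interpreter actually allocates. The two classes and the allocated_bytes helper are hypothetical stand-ins for the ones above, and the absolute numbers will differ between Python versions, so only the direction of the difference should be read from it:

```python
import tracemalloc


class UserWithSlots:
    """Hypothetical stand-in for the slots-based user class."""

    __slots__ = ("username", "country", "website", "date")

    def __init__(self, username, country, website, date):
        self.username = username
        self.country = country
        self.website = website
        self.date = date


class UserWithoutSlots:
    """Hypothetical stand-in for the regular user class."""

    def __init__(self, username, country, website, date):
        self.username = username
        self.country = country
        self.website = website
        self.date = date


def allocated_bytes(cls, count=50_000):
    """Return the bytes still allocated after building `count` instances."""
    args = ("use@python3.org", "Poland", "www.chrisbarra.xzy", "2017/08/15")
    tracemalloc.start()
    instances = [cls(*args) for _ in range(count)]  # keep them alive
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del instances
    return current


slots_bytes = allocated_bytes(UserWithSlots)
dict_bytes = allocated_bytes(UserWithoutSlots)
print(f"with slots:    {slots_bytes / 1_000_000:.1f} MB")
print(f"without slots: {dict_bytes / 1_000_000:.1f} MB")
```

Both measurements include the list that holds the instances, which is the same size in the two cases, so it does not affect the comparison.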

What about access time?

In [30]:
instance_with_slots = MyUserWithSlots(**json.loads(my_json))
In [31]:
%%timeit
z = instance_with_slots.username
58.2 ns ± 2.58 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [32]:
instance_without_slots = MyUserWithoutSlots(**json.loads(my_json))
In [33]:
%%timeit
z = instance_without_slots.username
72.4 ns ± 2.67 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

__slots__ are also slightly faster 👍
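
If you are not working in a notebook, %%timeit is not available. A minimal sketch of the same comparison with the standard library timeit module follows. The classes and the best_access_time helper are hypothetical, and the timings depend heavily on the machine and the Python version, so the gap you see may be smaller than the one above:

```python
import timeit


class WithSlots:
    """Hypothetical class used only for this illustration."""

    __slots__ = ("username",)

    def __init__(self, username):
        self.username = username


class WithoutSlots:
    """Hypothetical class used only for this illustration."""

    def __init__(self, username):
        self.username = username


def best_access_time(instance, number=200_000, repeat=5):
    """Return the best time in seconds for `number` attribute reads."""
    timer = timeit.Timer("instance.username", globals={"instance": instance})
    return min(timer.repeat(repeat=repeat, number=number))


slots_time = best_access_time(WithSlots("use@python3.org"))
dict_time = best_access_time(WithoutSlots("use@python3.org"))
print(f"with slots:    {slots_time:.4f} s per 200,000 reads")
print(f"without slots: {dict_time:.4f} s per 200,000 reads")
```

Taking the minimum of several repeats is the usual way to reduce noise from other processes running on the machine.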

Want to know more about __slots__? Check the official documentation

Questions for you

  • What do you think about __slots__?
  • Is there a use case where you have found __slots__ extremely useful?

This blog post is a notebook; you can download it from here.

Credits

  • the picture is taken from here