When to use classes? When your functions take the same arguments

Are you having trouble figuring out when to use classes or how to organize them?

Have you repeatedly searched for “when to use classes in Python”,
read all the articles and watched all the talks,
and still don’t know whether you should be using classes in any given situation?

Have you read discussions that, for all you know, may be right,
but are so academic you can’t parse the jargon?

Have you read articles that all treat the “obvious” cases,
leaving you with no clear answer when you try to apply them to your own code?


My experience is that, unfortunately,
the best way to learn this is to look at lots of examples.

Most guidelines tend to be either too vague if you don’t already know enough about the subject,
or so specific that they only say things you already know.

This is one of those things that seems obvious and intuitive once you get it,
but it isn’t, and it is quite difficult to explain properly.


So, instead of prescribing a general approach,
let’s look at:

  • one specific case where you may want to use classes
  • examples from real-world code
  • some considerations you should keep in mind

The heuristic

If you have functions that take the same set of arguments, consider using a class.

That’s it.

In its most basic form,
a class groups data with the functions that operate on that data;
it doesn’t have to represent a real (“business”) object,
it can be an abstract object that exists only
to make things easier to use and understand.

Note

As Wikipedia puts it,
“A heuristic is a practical way to solve a problem.
It is better than chance, but does not always work.
A person develops a heuristic by using
intelligence, experience, and common sense.”

So, this is not the correct thing to do all the time,
or even most of the time.

Instead, I hope that this and other heuristics
can help build the right intuition
for people on their way from
“I know the class syntax, now what?” to
“proper” object-oriented design.

Example: HighlightedString

My feed reader library supports searching articles.
The results include article snippets,
and which parts of the snippet actually matched.

To highlight the matches (say, on a web page),
we write a function that takes a string and a list of slices,
and adds before/after markers to the parts inside the slices:

>>> value = 'water on mars'
>>> highlights = [slice(9, 13)]
>>> apply_highlights(value, highlights, '<b>', '</b>')
'water on <b>mars</b>'

While writing it,
we pull part of the logic into a helper
that splits the string such that highlights always have odd indices.
We don’t have to, but it’s easier to reason about problems one at a time.

>>> list(split_highlights(value, highlights))
['water on ', 'mars', '']
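
A minimal sketch of how these two functions could look; this is not the library's actual code. It assumes the highlights are sorted, non-overlapping, and have explicit start/stop, and the '<b>' markers in the usage line are just an example.

```python
from typing import Iterator, Sequence


def split_highlights(value: str, highlights: Sequence[slice]) -> Iterator[str]:
    """Yield parts of value so that highlighted parts have odd indices.

    Assumes highlights are sorted, don't overlap, and have explicit
    start/stop; nothing is validated here.
    """
    position = 0
    for highlight in highlights:
        yield value[position:highlight.start]  # plain text, even index
        yield value[highlight]                 # highlighted text, odd index
        position = highlight.stop
    yield value[position:]                     # trailing plain text


def apply_highlights(
    value: str, highlights: Sequence[slice], before: str, after: str
) -> str:
    """Wrap each highlighted part in the before/after markers."""
    parts = []
    for index, part in enumerate(split_highlights(value, highlights)):
        if index % 2 == 1:
            part = before + part + after
        parts.append(part)
    return ''.join(parts)


print(apply_highlights('water on mars', [slice(9, 13)], '<b>', '</b>'))
```

Because the helper always yields plain text first, even when it is empty, apply_highlights() only has to check whether an index is odd.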

To make things easier,
we only allow non-overlapping slices
with non-negative start/stop and no step.
We pull this logic into another function
that raises an exception for bad slices.

>>> validate_highlights(value, highlights)  # no exception
>>> validate_highlights(value, [slice(6, 10), slice(9, 13)])
Traceback (most recent call last):
  ...
ValueError: highlights must not overlap: slice(6, 10, None), slice(9, 13, None)
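
Here is one way the validation could be written, as a sketch under a few assumptions: the highlights arrive sorted by start, and every error message except the overlap one shown above is made up for illustration.

```python
def validate_highlights(value, highlights):
    """Raise ValueError for slices the other helpers can't handle.

    Assumes highlights are sorted by start; wording of the messages
    (other than the overlap one) is hypothetical.
    """
    previous = None
    for highlight in highlights:
        start, stop, step = highlight.start, highlight.stop, highlight.step
        if step is not None:
            raise ValueError(f"invalid highlight: step must be None: {highlight}")
        if start is None or stop is None:
            raise ValueError(f"invalid highlight: start and stop are required: {highlight}")
        if start < 0 or stop < 0:
            raise ValueError(f"invalid highlight: start and stop must not be negative: {highlight}")
        if start > stop:
            raise ValueError(f"invalid highlight: start is greater than stop: {highlight}")
        if stop > len(value):
            raise ValueError(f"invalid highlight: stop is past the end of the string: {highlight}")
        # With sorted input, an overlap means starting before the previous one ended.
        if previous is not None and start < previous.stop:
            raise ValueError(f"highlights must not overlap: {previous}, {highlight}")
        previous = highlight
```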

Quiz: Which function should call validate_highlights()? Both? The user?


Instead of separate functions, we can write a HighlightedString class with:

  • value and highlights as attributes
  • apply() and split() as methods
  • the validation happening in __init__

>>> string = HighlightedString('water on mars', [slice(9, 13)])
>>> string.value
'water on mars'
>>> string.highlights
(slice(9, 13, None),)
>>>
>>> string.apply('<b>', '</b>')
'water on <b>mars</b>'
>>> list(string.split())
['water on ', 'mars', '']
>>>
>>> HighlightedString('water on mars', [slice(13, 9)])
Traceback (most recent call last):
  ...
ValueError: invalid highlight: start must be not be greater than stop: slice(13, 9, None)

This essentially bundles data and behavior.
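
To make that concrete, here is a rough sketch of such a class written as a plain Python class. It is not the real implementation; the validation is condensed, the error messages are invented, and it assumes the highlights are given sorted by start.

```python
class HighlightedString:
    """A string together with the slices of it that are highlighted."""

    def __init__(self, value, highlights=()):
        self.value = value
        self.highlights = tuple(highlights)
        # Validate once, here, so the methods can trust the data.
        previous_stop = 0
        for highlight in self.highlights:
            start, stop, step = highlight.start, highlight.stop, highlight.step
            if start is None or stop is None or step is not None:
                raise ValueError(f'invalid highlight: {highlight}')
            if not 0 <= start <= stop <= len(value):
                raise ValueError(f'invalid highlight: {highlight}')
            if start < previous_stop:
                raise ValueError(f'highlights must not overlap: {highlight}')
            previous_stop = stop

    def split(self):
        """Yield parts of the string; highlighted parts have odd indices."""
        position = 0
        for highlight in self.highlights:
            yield self.value[position:highlight.start]
            yield self.value[highlight]
            position = highlight.stop
        yield self.value[position:]

    def apply(self, before, after):
        """Return the string with markers around the highlighted parts."""
        return ''.join(
            before + part + after if index % 2 else part
            for index, part in enumerate(self.split())
        )
```

Notice that value and highlights no longer show up in any method signature; the arguments the functions had in common became attributes.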

You may ask:
I can do any number of things with a string and some slices,
why this behavior specifically?
Because, in this context,
this behavior is generally useful.

Besides being shorter to use, a class:

  • shows intent:
    this isn’t just a string and some slices,
    it’s a highlighted string
  • makes it easier to discover what actions are possible
    (help(), code completion)
  • makes code cleaner;
    __init__ validation ensures invalid objects cannot exist;
    thus, the methods don’t have to validate anything themselves

Caveat: attribute changes are confusing

Let’s say we pass a highlighted string to a function
that writes the results in a text file,
and after that we do some other stuff with it.

What would you think if this happened?

>>> string.apply('<b>', '</b>')
'water on <b>mars</b>'
>>> render_results_page('output.txt', titles=[string])
>>> string.apply('<b>', '</b>')
'<b>water</b> on mars'

You may think it’s quite unexpected; I know I would.
Either intentionally or by mistake,
render_results_page() seems to have changed our highlights,
when it was supposed to just render the results.

That’s OK, mistakes happen.
But how can we prevent it from happening in the future?

Solution: make the class immutable

Well, in the real implementation, this mistake can’t happen.

HighlightedString is a frozen dataclass,
so its attributes are read-only;
also, highlights is stored as a tuple,
which is immutable as well:

>>> string.highlights = [slice(0, 5)]
Traceback (most recent call last):
  ...
dataclasses.FrozenInstanceError: cannot assign to field 'highlights'
>>> string.highlights[:] = [slice(0, 5)]
Traceback (most recent call last):
  ...
TypeError: 'tuple' object does not support item assignment
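
As a sketch of just the immutability part (the methods and the validation are left out, and the field defaults are my assumption), a frozen dataclass can copy whatever it receives into a tuple in __post_init__:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class HighlightedString:
    value: str = ''
    highlights: tuple = ()

    def __post_init__(self):
        # frozen=True blocks normal assignment, even inside the class,
        # so object.__setattr__ is used to store an immutable copy.
        object.__setattr__(self, 'highlights', tuple(self.highlights))
        # (validation of the highlights would also go here)


string = HighlightedString('water on mars', [slice(9, 13)])

try:
    string.highlights = [slice(0, 5)]
except FrozenInstanceError as e:
    print('assignment refused:', e)
```

The tuple copy also means that changing the list passed to the constructor later has no effect on the object.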

You can find this pattern in werkzeug.datastructures,
which contains HTTP-flavored subclasses of common Python objects.
For example, Accept is an immutable list:

>>> from werkzeug.datastructures import Accept
>>> accept = Accept([('image/png', 1)])
>>> accept[0]
('image/png', 1)
>>> accept.append(('image/gif', 1))
Traceback (most recent call last):
  ...
TypeError: 'Accept' objects are immutable

Try it out

If you’re doing something and you think you need a class,
do it and see how it looks.
If you think it’s better, keep it;
otherwise, revert the change.
You can always switch in either direction later.

If you got it right the first time, great!
If not, by having to fix it you’ll learn something,
and next time you’ll know better.

Also, don’t beat yourself up.

Sure, there are nice libraries out there
that use classes in just the right way,
whose authors spent lots of time finding the right abstraction.
But abstraction is difficult and time-consuming,
and in everyday code good enough is just that: good enough.
You don’t need to go to the extreme.

Note

Update:
I wrote an article about exceptions to this heuristic
(that is, when functions with the same arguments
don’t necessarily make a class).


That’s it for now.
