While teaching one of my Python classes yesterday I noticed a conditional expression
which can be written in several ways. All of these are equivalent in their behavior:
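A minimal sketch of the five spellings, assuming the value being tested is the boolean returned by os.path.isdir() and using a hypothetical path chosen only for illustration:

```python
import os.path

# Hypothetical path, used only for illustration.
path = "/no/such/directory"

# The five spellings of the same check, in the order discussed in the text.
checks = [
    os.path.isdir(path) is False,
    os.path.isdir(path) is not True,
    os.path.isdir(path) == False,  # noqa: E712 (deliberate, for comparison)
    os.path.isdir(path) != True,   # noqa: E712
    not os.path.isdir(path),
]

# Every spelling yields the same boolean for a bool operand.
print(checks)
```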
My preferred style of writing is the last one (not os.path.isdir()) because it
looks the most Pythonic of all. However, the 5 expressions are slightly different
behind the scenes, so they should also differ in speed of execution:
is - identity operator, i.e. both arguments are the same object as determined by the id() function. In CPython that means both arguments point to the same address in memory
is not - yields the inverse truth value of is, i.e. both arguments are not the same object (address) in memory
== - equality operator, i.e. both arguments have the same value
!= - non-equality operator, i.e. both arguments have different values
not - boolean operator, i.e. it yields the inverse of the argument's truth value
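The difference between identity and equality is easiest to see with two separate objects that hold the same value. A small sketch with made-up variable names:

```python
first = [1, 2, 3]
second = [1, 2, 3]  # same value, but a separate object
alias = first       # a second name for the very same object

print(first == second)         # True: the values are equal
print(first != second)         # False
print(first is second)         # False: two distinct objects
print(first is not second)     # True
print(first is alias)          # True: one object, two names
print(id(first) == id(alias))  # True: is compares what id() reports
print(not first)               # False: a non-empty list is truthy
```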
In my initial tweet I mentioned
that I think is False should be the fastest. Kiwi TCMS team
member Zahari countered that not would be the fastest
but didn't provide any reasoning!
My initial reasoning was as follows:
The is operator essentially compares addresses in memory, so it should be as fast as it gets
== and != should be roughly the same, but they do need to "read" values from memory, which takes additional time before the actual comparison of these values
not is a boolean operator, but honestly I have no idea how it is implemented, so I don't have any opinion as to its performance
Using the following performance test script we get the average of 100 repetitions
from executing the conditional statement 1 million times:
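A minimal sketch of such a script, assuming the standard timeit module and a variable named result that is set to True so that no branch body ever runs. The helper names (measure, report) and the statement strings are assumptions made for illustration, not the original script:

```python
import timeit

# Label -> statement under test. "False" is the baseline without any operator.
STATEMENTS = {
    "False": "if False: pass",
    "not result": "if not result: pass",
    "is False": "if result is False: pass",
    "is not True": "if result is not True: pass",
    "!= True": "if result != True: pass",
    "== False": "if result == False: pass",
}


def measure(statement, repeat=100, number=1_000_000):
    """Average time in seconds of `repeat` runs, each executing `number` times."""
    timings = timeit.repeat(
        statement, setup="result = True", repeat=repeat, number=number
    )
    return sum(timings) / len(timings)


def report(repeat=100, number=1_000_000):
    """Print all variants ordered by speed, relative to the baseline."""
    results = {
        label: measure(stmt, repeat, number) for label, stmt in STATEMENTS.items()
    }
    baseline = results["False"]
    for label, seconds in sorted(results.items(), key=lambda item: item[1]):
        # Guard against a zero baseline on very coarse timers.
        delta = (seconds / baseline - 1) * 100 if baseline else float("nan")
        print(f"{label:<12} {seconds:.6f}s  {delta:+.1f}%")
    return results


# Reduced counts so the demo finishes quickly.
report(repeat=5, number=100_000)
```

Calling report() with its defaults corresponds to the 100 repetitions of 1 million executions described above. Absolute numbers depend on the machine and the interpreter version.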
Note: in none of these variants is the body of the if statement executed, so
the results should be pretty close to how long it takes to evaluate the
conditional expression itself!
Results (ordered by speed of execution):
False _______ 0.009309015863109380 - baseline
not result __ 0.011714859132189304 - +25.84%
is False ____ 0.018575656899483876 - +99.54%
is not True _ 0.018815848254598680 - +102.1%
!= True _____ 0.024881873669801280 - +167.2%
== False ____ 0.026119318689452484 - +180.5%
Now these results weren't exactly what I was expecting. I thought not would come in
last, but instead it came in first! Although is False came in second, it is almost
twice as slow as the baseline. Why is that?
After digging around in
CPython I found the following definition
for comparison operators:
Python/ceval.c
where PyObject_RichCompare is defined as follows (definition order reversed
in actual sources):
Objects/object.c
The not operator is defined in Objects/object.c as follows (definition order
reversed in actual sources):
Objects/object.c
So a rough overview of calculating the above expressions is:
not - call 1 function which compares the argument with Py_True/Py_False, then compare its result with 0
is/is not - do a switch/case/break, compare the result to Py_True/Py_False, call 1 function (Py_INCREF)
==/!= - switch/default (that is, evaluate all case conditions before that), call 1 function (PyObject_RichCompare), which performs a couple of checks and calls another function (do_richcompare), which does a few more checks before executing switch/case/compare to Py_True/Py_False, then call Py_INCREF and return the result.
Of the three, not clearly has the least code to execute.
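One way to observe these different code paths from Python itself is an object that records which special methods the interpreter calls on it. This is only a sketch for Python 3: the Probe class and the hooks_called helper are hypothetical, and real bool operands are handled in C without going through any Python-level hooks:

```python
class Probe:
    """Hypothetical helper that records which special methods get called."""

    def __init__(self):
        self.calls = []

    def __eq__(self, other):
        self.calls.append("__eq__")
        return False

    def __ne__(self, other):
        self.calls.append("__ne__")
        return True

    def __bool__(self):
        self.calls.append("__bool__")
        return True


def hooks_called(check):
    """Run check(probe) on a fresh Probe and return the recorded calls."""
    probe = Probe()
    check(probe)
    return probe.calls


print(hooks_called(lambda p: p is False))     # []: identity needs no hook
print(hooks_called(lambda p: p is not True))  # []
print(hooks_called(lambda p: p == False))     # ['__eq__']: rich comparison
print(hooks_called(lambda p: p != True))      # ['__ne__']: rich comparison
print(hooks_called(lambda p: not p))          # ['__bool__']: truth test only
```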
We can also invoke the dis module, aka the disassembler of Python byte code into mnemonics,
like so (it needs a function to disassemble):
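A minimal sketch, assuming Python 3.4 or newer for dis.get_instructions. The function names are hypothetical and the exact opcode names vary between CPython versions:

```python
import dis


def with_is(result):
    if result is False:
        pass


def with_eq(result):
    if result == False:  # noqa: E712 (deliberate, for comparison)
        pass


def with_not(result):
    if not result:
        pass


def opnames(func):
    """Return the opcode mnemonics of func, in order."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


# Full human-readable listing for one variant.
dis.dis(with_not)

# Compact side-by-side view of the mnemonics for every variant.
for func in (with_is, with_eq, with_not):
    print(func.__name__, opnames(func))
```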
From the results below you can see that all expression variants are very similar:
The last 3 instructions are the same (that is the implicit return None of the function).
LOAD_GLOBAL is to "read" the True or False boolean constants (on Python 3 these are
keywords and are loaded with LOAD_CONST instead) and
LOAD_FAST is to "read" the function parameter in this example.
All of them _JUMP_ outside the if statement and the only difference is
which comparison operator is executed (if any in the case of not).
UPDATE 1:
As I was publishing this blog post I read the following comments from
Ammar Askar, who also gave me a few pointers on IRC:
Note that this code path also has a direct inlined check for booleans, which should help too: https://t.co/YJ0az3q3qu
— Ammar Askar (@ammar2) December 6, 2019
So go ahead and take a look at
case TARGET(POP_JUMP_IF_TRUE).
UPDATE 2:
After the above comments from Ammar Askar on Twitter and from
Kevin Kofler below I decided to try and change one of the expressions a bit:
that is, calculate the not operation, assign to variable and then evaluate the
conditional statement in an attempt to bypass the built-in compiler optimization.
The disassembled code looks like this:
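A sketch of this variant next to the direct one, with hypothetical function and variable names, assuming CPython 3.4 or newer. Comparing the opcode names shows whether a separate UNARY_NOT instruction is emitted:

```python
import dis


def direct(result):
    # The negation is part of the condition itself.
    if not result:
        pass


def assigned(result):
    # The negation is computed and stored first, then tested.
    negated = not result
    if negated:
        pass


def opnames(func):
    """Return the opcode mnemonics of func, in order."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


for func in (direct, assigned):
    print(func.__name__, opnames(func))
    print("  has UNARY_NOT:", "UNARY_NOT" in opnames(func))
```

On CPython the direct form typically folds the negation into the conditional jump, while the assigned form keeps an explicit UNARY_NOT followed by a store and a load.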
The execution time was around 0.022, which is between is and ==. However, the
not result operation itself (without assignment) appears to take 0.017,
which still makes the not operator faster than the is operator, but only just!
As already pointed out, this is a fairly complex topic and it is evident
that not everything can be compared directly in the same context (expression).
P.S.
When I teach Python I try to explain what is going on under the hood. Sometimes
I draw squares on the whiteboard to represent various cells in memory and visualize
things. One of my students asked me how I know all of this. The essentials
(for any programming language) are always documented in its official documentation.
The rest is hacking around in its source code and learning how it works. This is
also what I expect people working with/for me to be doing!
See you soon and happy learning!