This is a collection of eccentricities and inconsistencies I have run into in Python, mostly in the course of writing Hypothesis.
Most of them aren't terrible and are merely weird or annoying, and most languages have warts like this. Python feels like it has an atypically high rate of them but maybe I've just looked a lot more closely.
All of these examples are Python 2, with notes where they no longer apply in Python 3. Python 3 has a few new eccentricities of its own but is mostly a bit better in this regard.
These are in no particular order except the one I remembered them in.
```
>>> x = float('nan')
>>> x == x
False
```
So far so "good". This is the normal IEEE behaviour for floats.
```
>>> float('nan') in [float('nan')]
False
```
Expected behaviour still. There is no value in the list equal to float('nan')!
```
>>> x in [x]
True
```
The defined behaviour of contains for lists and tuples is that x in y holds if any(x is s or x == s for s in y). This is documented in Python 3, and is true (though contrary to the documentation) in Python 2.
Note that you will get slightly different behaviour in PyPy, because float('nan') is float('nan') is always True there. Also, some versions have a bug where x in [x] will return False but x in [x, "hi"] will return True.
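The membership rule is easy to check by hand. Here's a minimal sketch (list_contains is a made-up name, not a real API) confirming that the documented equivalence reproduces the NaN result:

```python
# NaN is not equal to itself, but it is identical to itself, which is
# exactly the case where the identity check in the membership rule fires.
nan = float('nan')

def list_contains(y, x):
    # The documented equivalence for `x in y` on lists and tuples.
    return any(x is s or x == s for s in y)

assert nan != nan               # equality fails, per IEEE semantics
assert list_contains([nan], nan)  # ...but identity makes membership succeed
assert nan in [nan]             # matching the builtin behaviour
```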
```
>>> x = [1]
>>> y = x
>>> x += [2]
>>> x
[1, 2]
>>> y
[1, 2]

>>> x = (1,)
>>> y = x
>>> x += (2,)
>>> x
(1, 2)
>>> y
(1,)
```
+= is supported for everything that has + defined, but a type can also override __iadd__, in which case x += y is equivalent to x = x.__iadd__(y). For lists, __iadd__ mutates x in place and returns it, so both names still refer to the same (now longer) list. Tuples don't define __iadd__, so x += y falls back to x = x + y, rebinding x to a new tuple while y keeps the old one.
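To see both paths side by side, here's a minimal sketch with a hypothetical Box class (not from any library) that implements __iadd__ the way list does, by mutating in place and returning self:

```python
class Box(object):
    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        # Out-of-place: build and return a new Box.
        return Box(self.items + other.items)

    def __iadd__(self, other):
        # In-place: mutate self and return it, like list.__iadd__.
        self.items.extend(other.items)
        return self

x = Box([1])
y = x
x += Box([2])            # calls __iadd__, so x is rebound to the same object
assert x is y            # both names still point at one Box
assert y.items == [1, 2]  # and both saw the mutation
```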
This is particularly fun when you put mutable types inside immutable ones:
```
>>> x = ([],)
>>> x[0] += [1]
Traceback (most recent call last):
  ...
TypeError: 'tuple' object does not support item assignment
```
As you'd hope, we got an error on item assignment to a tuple! But...
```
>>> x[0]
[1]
```
Because x[0] += [1] first mutates the list in place via __iadd__ and only then attempts the store back into the tuple, the mutation had already happened by the time the assignment failed, so it "worked" anyway.
```
>>> import sys
>>> sys.maxint
9223372036854775807
>>> type(sys.maxint)
<type 'int'>
>>> (sys.maxint + 1) - 1
9223372036854775807L
>>> type((sys.maxint + 1) - 1)
<type 'long'>
```

(Values shown are from a 64-bit build.)
This one is of course gone in Python 3 because the int/long distinction is gone.
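On Python 3 the closest analogue is sys.maxsize, and a quick sketch shows the boundary no longer matters, since int is arbitrary precision:

```python
import sys

# sys.maxsize is the closest Python 3 analogue of the old sys.maxint.
big = sys.maxsize
assert type(big + 1) is int        # crossing the word boundary stays int
assert type((big + 1) - 1) is int  # and coming back down does too
assert (big + 1) - 1 == big
```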
```
>>> 0 == 1 is False
False
>>> (0 == 1) is False
True
>>> 0 == (1 is False)
True
```
The first of these is comparison operator chaining: it's interpreted as (0 == 1) and (1 is False). (The last one is True because the parenthesised 1 is False evaluates to False, and 0 == False holds.)
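Chaining applies to every comparison operator, and the middle operand is evaluated only once. A quick sketch (trace is a made-up helper, not a real API):

```python
# a < b < c means (a < b) and (b < c), with b evaluated a single time.
assert (1 < 2 < 3) == ((1 < 2) and (2 < 3))

calls = []

def trace(v):
    # Record each evaluation so we can count them.
    calls.append(v)
    return v

assert 1 < trace(2) < 3
assert calls == [2]  # the middle operand was evaluated exactly once
```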
```
>>> object.mro()
[<type 'object'>]
>>> str.mro()
[<type 'str'>, <type 'basestring'>, <type 'object'>]
>>> type.mro()
Traceback (most recent call last):
  ...
TypeError: descriptor 'mro' of 'type' object needs an argument
```
Oh right. type is itself an instance of type, so here mro resolves to the unbound method and needs an explicit argument. So clearly what we should have done is...
```
>>> type.mro(type)
[<type 'type'>, <type 'object'>]
>>> type.mro(object)
[<type 'object'>]
>>> type.mro(str)
[<type 'str'>, <type 'basestring'>, <type 'object'>]
```
Looks good, right?
```
>>> class Foo(object):
...     def foo(self):
...         return "Hi"
...
>>> class AddsFoo(type):
...     def mro(self):
...         return super(AddsFoo, self).mro() + [Foo]
...
>>> class Bar(object):
...     __metaclass__ = AddsFoo
...
>>> Bar().foo()
'Hi'
>>> Bar.mro()
[<class '__main__.Bar'>, <type 'object'>, <class '__main__.Foo'>]
>>> type.mro(Bar)
[<class '__main__.Bar'>, <type 'object'>]
>>> type(Bar).mro(Bar)
[<class '__main__.Bar'>, <type 'object'>, <class '__main__.Foo'>]
```
type(Bar).mro(Bar)
It is incorrect to call type.mro(x), because x might have a custom metaclass. The only correct way to look up the mro of an arbitrary Python type is type(x).mro(x).
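Here's the same pitfall as a self-contained sketch. It constructs the class by calling the metaclass directly (the three-argument form of type), which has the side benefit of working unchanged on Python 3, where the __metaclass__ attribute is ignored:

```python
class Foo(object):
    def foo(self):
        return "Hi"

class AddsFoo(type):
    def mro(self):
        # Append Foo to the normally-computed method resolution order.
        return super(AddsFoo, self).mro() + [Foo]

# Equivalent to a class statement with AddsFoo as the metaclass.
Bar = AddsFoo('Bar', (object,), {})

assert Bar().foo() == "Hi"         # attribute lookup uses the custom mro
assert Foo in type(Bar).mro(Bar)   # respects the custom metaclass
assert Foo not in type.mro(Bar)    # silently ignores it
```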
Here's a problem I had recently. In 2.7, __repr__ must return only ASCII strings. It's OK to return a unicode string, but it will be encoded as ASCII, which blows up if it contains any non-ASCII characters.
```
>>> class CustomRepr(object):
...     def __init__(self, rep):
...         self.rep = rep
...     def __repr__(self):
...         return self.rep
...
>>> CustomRepr("foo")
foo
>>> CustomRepr(u"foo")
foo
>>> CustomRepr(u"☃")
Traceback (most recent call last):
  ...
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2603' in position 0: ordinal not in range(128)
```
It is annoyingly common to get this wrong. Suppose I want to display an object that gets it wrong? My terminal is perfectly able to display the unicode, but repr() won't let me get at it.
```
>>> print(CustomRepr(u"☃").__repr__())
☃
>>> print(object().__repr__())
<object object at 0x...>
>>> print(object.__repr__())
Traceback (most recent call last):
  ...
TypeError: descriptor '__repr__' of 'object' object needs an argument
```
This has the same resolution as the mro problem.
```python
def safe_repr(x):
    return type(x).__repr__(x)
```
"But David!" you say. "What if __repr__ is assigned on the instance rather than the class?
```
>>> class NoCustomRepr(object):
...     pass
...
>>> x = NoCustomRepr()
>>> x.__repr__ = "Hi"
>>> x
<__main__.NoCustomRepr object at 0x...>
```
So far so good.
This workaround is no longer needed in Python 3, but if it were, the behaviour would be the same.
```
>>> class HasAProperty(object):
...     @property
...     def stuff(self):
...         return self.oops_does_not_exist
...
>>> hasattr(HasAProperty, 'stuff')
True
>>> hasattr(HasAProperty(), 'stuff')
False
```
hasattr works by performing the attribute lookup, which calls the property, and seeing if an AttributeError is raised. It doesn't care where the AttributeError comes from: one raised inside the property body is indistinguishable from the attribute not existing at all. (In Python 2, hasattr in fact swallows any exception, not just AttributeError.)
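A small sketch of the trap, and of how to distinguish the two cases by catching the error yourself:

```python
class HasAProperty(object):
    @property
    def stuff(self):
        # This raises AttributeError from *inside* the property body.
        return self.oops_does_not_exist

inst = HasAProperty()
assert hasattr(HasAProperty, 'stuff')  # the property object itself exists
assert not hasattr(inst, 'stuff')      # the body's AttributeError is swallowed

# If the distinction matters, do the lookup yourself and inspect the error:
try:
    inst.stuff
except AttributeError as e:
    assert 'oops_does_not_exist' in str(e)
```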
```
>>> class BonusList(list):
...     def __iter__(self):
...         for x in super(BonusList, self).__iter__():
...             yield x
...         yield 1
...
>>> BonusList()
[]
>>> list(BonusList())
[1]
>>> tuple(BonusList())
()
>>> tuple(iter(BonusList()))
(1,)
```
This is the sort of inconsistency you can get from C extensions using the concrete methods rather than the abstract protocol. Because a lot of Python builtins are implemented in C, you can run into this sort of thing.
Basically the only safe thing to do is never override methods of builtins.
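One way to follow that advice is composition: wrap the list rather than subclassing it, so C code never sees a real list and everything has to go through the iteration protocol. A sketch (BonusSequence is a made-up name):

```python
class BonusSequence(object):
    """Wraps a list instead of subclassing it, so there is no concrete
    list layout for C consumers to bypass __iter__ with."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        for x in self._items:
            yield x
        yield 1  # the bonus element

b = BonusSequence([])
assert list(b) == [1]
assert tuple(b) == (1,)        # now consistent with list()
assert tuple(iter(b)) == (1,)
```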
This behaviour is the same on Python 3, but correct on PyPy, because PyPy has an actually sensible implementation.
Class bodies act a lot like normal scopes. For example:
```
>>> class Hello(object):
...     x = "world"
...     print("Hello %s" % (x,))
...
Hello world
```
```
>>> class A(object):
...     a = 10
...     b = range(a)
...     c = [x for x in b]
...     d = list(x for x in c)
...     e = [d[i] for i in range(a)]
...     f = list(e[i] for i in range(a))
...
Traceback (most recent call last):
  ...
NameError: name 'e' is not defined
```
For some reason e was not in scope there, even though all the previous lines could see the earlier names. This would have worked fine in a function body rather than a class body:
```
>>> def A():
...     a = 10
...     b = range(a)
...     c = [x for x in b]
...     d = list(x for x in c)
...     e = [d[i] for i in range(a)]
...     f = list(e[i] for i in range(a))
...
>>> A()
```
The reason for this is that Python distinguishes between scopes and execution frames. Closures capture the execution frames of enclosing functions, but the frame of a class body is discarded as soon as the body finishes executing, which is why class-level variables are not visible inside methods. What is in scope in a generator expression is complicated: its outermost iterable is evaluated eagerly in the enclosing scope (so range(a) was fine), but the rest of its body runs in its own frame, which cannot see names from an enclosing class body (which is why e was not). List comprehensions in Python 2 run directly in the enclosing frame, which is why the earlier lines all worked.
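The rule can be demonstrated in isolation. This sketch contrasts a name used as the outermost iterable (read eagerly, fine) with the same name used inside the generator body (looked up in the generator's own frame, NameError):

```python
class C(object):
    vals = [1, 2, 3]

    # vals is the outermost iterable: evaluated eagerly in the class body.
    ok = list(v for v in vals)

    try:
        # vals appears in the body instead: the generator's frame cannot
        # see the class body, so this raises NameError when consumed.
        bad = list(vals[i] for i in range(3))
    except NameError:
        bad = None

assert C.ok == [1, 2, 3]
assert C.bad is None
```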
Python is very lax about equality between different numeric-ish types:
```
>>> {0: set([1])} == {False: frozenset([True])} == {0.0: set([1.0])}
True
```
Note in particular that a frozenset is equal to a set with the same (or equivalent) contents. This is different from the tuple/list behaviour:
```
>>> [] == ()
False
```
```python
import gc
from weakref import WeakKeyDictionary


class Key(object):
    pass


def run(cache, template):
    cache[template] = 1
    # Note: Not raising the ValueError causes this to work. Any exception will
    # cause the same problem though.
    raise ValueError()


def test_gc():
    cache = WeakKeyDictionary()
    # Extracting this whole try/except into its own function and passing in
    # cache as an argument causes this to work.
    try:
        # Note: Inlining run here causes this to work
        run(
            cache, Key()
        )
    except ValueError:
        pass
    gc.collect()
    # The Key() argument went out of scope immediately. When we ran the GC
    # it should definitely have had the weakref to it cleared.
    assert not cache

test_gc()
```
The reason this fails is that the exception's traceback carries frame objects that reference local variables (including run's template argument), so until it is cleared the keys remain strongly reachable. You can fix this by calling sys.exc_clear() before the gc.collect().
Note that this problem is not present in Python 3: exc_info is automatically cleared when you exit the exception handler.
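To check that claim, here is the same test pared down so it returns the cache. On Python 3 the assertion passes, because the exception (and with it the frames holding template) is dropped automatically when the handler exits:

```python
import gc
from weakref import WeakKeyDictionary

class Key(object):
    pass

def run(cache, template):
    cache[template] = 1
    raise ValueError()

def test_gc():
    cache = WeakKeyDictionary()
    try:
        run(cache, Key())
    except ValueError:
        pass
    gc.collect()
    return cache

# On Python 3 the exception state is cleared when the except block exits,
# so the Key dies on time and the weak entry is removed.
assert not test_gc()
```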
```
>>> {1} <= {2}
False
>>> {1} >= {2}
False
```
The ordering relation on sets is defined to be subset inclusion, which is only a partial order rather than a total order. In particular this means that (x < y) is not the same as not (y <= x):
```
>>> {1} < {2}
False
```
Note that this means that sorting lists of sets will not work correctly:
```
>>> sorted([{-1, 1}, {0}, {1}])
[set([1, -1]), set([0]), set([1])]
>>> {1} < {-1, 1}
True
```
Because sets do not obey the contract of ordering, they do not sort correctly, and you get an allegedly sorted list whose final element is strictly less than its first.
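If you do need to sort sets deterministically, the fix is to supply a key that really is totally ordered. For example (assuming the elements themselves are comparable), the sorted element list works, since lists compare lexicographically:

```python
# Sort sets by an explicit total order rather than the subset partial order.
sets = [{1}, {-1, 1}, {0}]
result = sorted(sets, key=lambda s: sorted(s))

# Keys are [-1, 1] < [0] < [1] lexicographically.
assert result == [{-1, 1}, {0}, {1}]
```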