Effective Python
by Kwanghee Choi
Overview & Disclaimer
This is a summary of the book Effective Python, 1st ed. Python 3.5+ contents are missing, as it is written at 2015, but still has great insights for Python programmers. 2nd ed are out, so this summary will hopefully be updated soon.
1. Pythonic Thinking
- Know which python you’re using
- Difference between 2 and 3 / Runtimes (CPython, PyPy, … )
- Prefer Python 3 over Python 2
- Follow PEP8
- Difference between
bytes
,str
andunicode
- Python 2:
str
(raw),unicode
(Interchangeable for ASCII) - Python 3:
bytes
(raw),str
(unicode) - → (decode), ← (encode)
- Python 2:
- Prefer helper functions over complex expressions
- Prefer blocks than one-liners
- Sequence slicing
- Avoid being verbose (ex. No
x[0:]
orx[:len(x)]
) - Slicing always returns new object
- Avoid using start, end and stride, or use islice
- Avoid being verbose (ex. No
- Use list comprehensions instead of map and filter
- dict, set, list all supports comprehensions
- Avoid using nested comprehensions
- Consider generators for large comprehensions
- Fast execution and low memory, good for chaining
- Make for loops have many information as possible
- Prefer
enumerate()
overrange()
- Use
zip()
(Lazy generator in 3, full result in 2) zip_longest()
can solve truncations
- Prefer
- Avoid using
else
afterfor
/while
- Use all
try
/except
/else
/finally
finally
: cleanupelse
/except
: visually distinguish success / fail case
2. Functions
- Prefer Exceptions over
None
- Returning
None
is error prone because other values also evaluates toFalse
- Returning
- Know variable scope of closures
- Current function → Any enclosing scopes → Global scope → Built-in scope
- Note that assignment and declaration have the same grammar
- Use
nonlocal
for Python 3 and mutable variable for Python 2
- Use
- Be cautious on iterating generators
- Be defensive when iterating over arguments
- Define your own iterable container type with
__iter__
- star args(
*args
) makes generator into list
- Variable positional arguments reduce visual noise
- Keyword arguments can provide optional behavior
- Keyword-only arguments can enforce clarity
- Use keyword arguments’ pop method to simulate in Python 2
- Use
None
and Docstrings to specify dynamic default arguments
3. Classes and Inheritance
- Code has to be dependent on the complexity of the structure
dict
/tuple
→namedtuple
→ Helper classes- Know the limitations of
namedtuple
- Cannot specify default argument values
- Attribute values are accessible using numerical indexes and iteration
- Functions are more simple than classes to put as a parameter (as an interface)
- Use functions if it is stateless
- Use classes with
__call__
(callable class) if it is not stateless
- Consider classes as parameters for generic classes (class method polymorphism)
- Use
@classmethod
to define alternative constructors than__init__
- Use
- Initialize parent classes with super
- Standard method resolution order (mro) fixes initialization order
super()
in Python 3 is same withsuper(__class__, self)
super(__class__)
: Unbound objectsuper(__class__, type)
: Bound object, Bound classmethod / method of objectsuper(__class__, object)
: Bound object, Bound all method
- Use multiple inheritance only with mix-in utility classes
- Mix-in: Classes with no state, only methods
- Override mix-in methods with super() in child classes for pluggable behavior
- Plan to allow subclasses to do more with your internal APIs (in other words, avoid private)
__private
: Avoid naming conflicts with subclasses that are out of your control_protected
: Use documentation to guide subclasses instead of forcing with private
- Inherit directly from Python’s container types for custom container types
- Preserve semantics and functionalities familiar
- Inherit list or dict for simple cases
- Inherit collections.abc to ensure that custom classes match various requirements
4. Metaclasses and Attributes
- Use
@property
,@x.setter
,@x.getter
to give simple public attributes new functionalities- Do not use explicit getter and setter methods
- Modify related object state only in @x.setter methods
@property
methods has to be fast, so if it is too complex, refactor to stop using it
- Use descriptor classes (
__get__
,__set__
) to reuse the behavior and validation of@property
- You have to keep track of instances’ each attributes using WeakKeyDictionary
- If it has the last remaining reference, it automatically empties it
- You have to keep track of instances’ each attributes using WeakKeyDictionary
- Use
__getattr__
,__getattribute__
,__setattr__
for lazy evaluation of attributes__getattr__
only runs once to load an attribute__getattribute__
runs every time when an attribute is accessed__setattr__
runs every time when attributes are assigned on an instance- Avoid infinite recursion in
__getattribute__
and__setattr__
by usingsuper()
- Use metaclass to ensure that subclasses are well formed before objects are constructed
- Python 3: metaclass parameter / Python 2:
__metaclass__
class attribute - Make a metaclass by inheriting from type
- Metaclass receives the contents of associated class in
__new__
- Python 3: metaclass parameter / Python 2:
- Use metaclass to register class existence
- Factory-like methods for modularization needs classes to be registered
- Use metaclass to modify or annotate class attributes
- After class is defined, before class is actually used
5. Concurrency and Parallelism
- Use subprocess to run child processes and manage
stdin
/stdout
/stderr
- Use Thread only for blocking I/O
- Python threads cannot run bytecode in parallel because of GIL
- Use Lock to avoid corruption
- Use Queue for producer-consumer pattern
- Use coroutines to run thousands of functions seemingly at the same time
- yield expression can send values to the exterior code
- Python 3+: can receive values, Python 3.3+:
yield from
, Python 3.6+:async
/await
- Access multiprocessing by concurrent.futures and ProcessPoolExecutor for parallelism
6. Built-in Modules
- Use
functool.wraps
to copy all the important metadata about the decorated function - Use
contextlib
,with
andyield
for reusabletry
/finally
behavior- Lock,
open(x)
, setting logging security level can be simplified withwith
- Lock,
- Use
copyreg
to makepickle
reliable- Adds missing attribute values, allows class versioning, provides stable import path
- Use
datetime
andpytz
instead oftime
for translating different time zonestime
fails for multiple local times
- Use built-in algorithms and data structures
- Data structures:
deque
,OrderedDict
,defaultdict
,heapq
,bisect
itertools
can link iterators together, filter items, return combinations of items
- Data structures:
- Precision:
float
<decimal
<fractions
- Be familiar to Python Package Index (PyPI) and pip
- Python 3.4+ supports pip default installation
7. Collaboration
- Write docstrings for every function, class and module
- Modules: contents and any important classes or functions
- Classes: behavior, important attributes, subclass behavior
- Functions and methods: behavior, every argument, returned value, raised exception
- Use packages to organize modules and provide stable APIs
- Packages provide non-conflicting namespaces
- Control publicly visible names with
__all__
attribute and leading underscore names
- Define a root exception when defining a module’s API
- Know how to break circular dependencies
- Refactor mutual dependencies into a separate module
- Clear division isn’t always possible
- Reordering imports
- Goes against PEP8
- Very unstable when edited, everything can break down very easily
- Provide a configure function
- Enables dependency injection pattern
- Can be difficult to read and write code as heavy restructuring is needed
- Dynamic import
- Simplest solution, as it minimizes refactoring and complexity
- Cost of import is especially bad in tight loops and sets up surprising failures
- Refactor mutual dependencies into a separate module
- Use virtual environments for isolated and reproducible dependencies
- Python 3.4+ supports pyvenv default installation
8. Production
- Consider module-scoped code to configure deployment environments
- Module-scoped: not inside any function or class
- Module contents can be the product of any external condition including
sys
andos
- Consider configuration files used with configparser
- Use
repr
for debuggingrepr
is an unambiguous representation, sox == eval(repr(x))
str
is a human-readable representation
- Test everything with TestCase in unittest
- Unit tests for isolated functionality (dedicated method for each behavior expectations)
- Integration tests for modules that interact (
setUp
,tearDown
methods)
- Use
pdb
andset_trace()
for interactive debugging - Profile and read stats before optimizing
- Prefer
cProfile
(C-extension) overprofile
(Pure Python)
- Prefer
- Use
gc
andtracemalloc
to track memory usage and leaks- Python 3.4+ supports
tracemalloc
- Python 3.4+ supports
Subscribe via RSS