Python packages are the fundamental unit of shareable code in Python. Packages make it easy to reuse and maintain your code, and are how you share your code with your colleagues and the wider Python community. Python Packages is an open source textbook that describes modern and efficient workflows for creating Python packages. The focus of this text is overwhelmingly practical; we will demonstrate methods and tools you can use to develop and maintain packages quickly, reproducibly, and with as much automation as possible - so you can focus on writing and sharing code!
Python packages are a core element of the Python programming language and are how you write reusable and shareable code in Python. Despite their importance, packages can be difficult to understand and cumbersome to create for beginners and seasoned developers alike. This book aims to describe the packaging process at an accessible level for data scientists, developers, and hobbyist programmers. Along the way, we'll walk through the development of a real Python package and explore all the key elements of Python packaging, including package structure, when and why to write tests and documentation, and how to maintain and update your package with the help of automated CI/CD pipelines.
By reading this book, you will:
Chapter 1: {ref}01:Introduction first gives a brief introduction to packages in Python and why you should know how to make them. Chapter 2: {ref}02:System-setup describes how to set up your development environment to develop packages and follow along with the remainder of the book. In Chapter 3: {ref}03:How-to-package-a-Python, we will develop a small toy package from end to end to get a feel for the steps involved in the packaging process and to understand the final product we are aiming for. The remaining chapters then unpack this workflow and go into more detail about each step in the packaging process, organised roughly in their order in the workflow:
04:Package-structure-and-state
05:Testing
06:Documentation
07:Releasing-and-versioning
08:Continuous-integration-and-deployment
A1:Packages-with-a-command-line-interface
A2:Python-packaging-cheat-sheet
Throughout this book we use foo() to refer to functions, and bar to refer to variables and function parameters.
Larger code snippets in the text appear as below, and we will use type hints and NumPy-style docstrings:
def palindrome(word: str) -> str:
    """Turns a word into a palindrome.

    Parameters
    ----------
    word : str
        Word to turn into a palindrome.

    Returns
    -------
    str
        Palindrome of word.

    Examples
    --------
    >>> palindrome("Tomas")
    'TomassamoT'
    >>> palindrome("Python")
    'PythonnohtyP'
    """
    return word + word[::-1]
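As a small aside, the Examples section of a NumPy-style docstring can also serve as a lightweight check that the function behaves as documented. The sketch below is only an illustration and not necessarily the workflow used later in the book. It repeats the palindrome function with a shortened docstring and runs the docstring examples with the standard library's doctest module. The helper name run_docstring_examples is hypothetical and was introduced for this example.

```python
import doctest


def palindrome(word: str) -> str:
    """Turns a word into a palindrome.

    Examples
    --------
    >>> palindrome("Tomas")
    'TomassamoT'
    >>> palindrome("Python")
    'PythonnohtyP'
    """
    return word + word[::-1]


def run_docstring_examples() -> doctest.TestResults:
    """Run the examples in palindrome's docstring (hypothetical helper)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Collect the doctests attached to palindrome and execute each one.
    # The globs mapping makes the name `palindrome` available to the examples.
    for test in finder.find(palindrome, "palindrome", globs={"palindrome": palindrome}):
        runner.run(test)
    # Returns a named tuple with `failed` and `attempted` counts.
    return runner.summarize(verbose=False)


if __name__ == "__main__":
    results = run_docstring_examples()
    print(f"attempted={results.attempted}, failed={results.failed}")
```

If the docstring examples match the function's real behaviour, the failed count is zero. A mismatch between the documentation and the code shows up as a failure.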
Code executed in the interactive Python interpreter to produce an output will appear as below:
a = 1
b = 2
a + b
If you have an electronic version of the book, e.g., https://py-pkgs.org, all code is rendered in such a way that you can easily copy and paste directly from your browser to your Python interpreter or editor.
The Python software ecosystem is constantly evolving. While the packaging workflows and concepts discussed in this book are effectively tool-agnostic, the tools we do use in the book may have been updated by the time you read it. If the maintainers of these tools are doing the right thing by documenting, versioning, and properly deprecating their code (we'll explore these concepts ourselves in Chapter 7: {ref}07:Releasing-and-versioning), then it should be straightforward to adapt any outdated code in the book.
This book was written in JupyterLab and compiled using Jupyter Book. It is hosted on GitHub, and the online version is deployed with Netlify. The complete source is available from GitHub and is automatically updated after edits by GitHub Actions.
We'd like to thank everyone who has contributed to the development of Python Packages. This is an open source textbook that began as supplementary material for the University of British Columbia's Master of Data Science program and was subsequently developed openly on GitHub, where it has been read, revised, and supported by many students, educators, practitioners, and hobbyists. Without you all, this book wouldn’t be nearly as good as it is, and we are deeply grateful. A special thanks to those who directly contributed to the text via GitHub (in alphabetical order): @Carreau, @dcslagel.
We would also like to acknowledge the software used to develop this book: JupyterLab, Jupyter Book, GitHub, and Netlify.