Code Quality

1. Document your code

"Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do." - Donald Knuth, Literate Programming

There are multiple ways you can document your code. Below are three examples:

Docstrings

A docstring is a string literal placed as the first statement in a module, function, class or method. It lets programmers understand what the code does without having to read the details of the implementation.

Because docstrings are string literals, they must be enclosed in quotation marks: a single pair of quotes is enough for a one-line docstring, and triple quotes are needed for a multi-line docstring. (PEP 257 recommends triple double quotes in all cases.) See the example below for a function-level docstring.

def calc_bulk_density(mass, volume):
    "Return dry bulk density = powder mass / powder volume."
    return mass / volume

Docstrings are preferred over in-line comments (see below) because they can be read at any time using Python's built-in help() function. It is also possible to generate online documentation automatically from docstrings, for example with a tool such as Sphinx.
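As a minimal sketch of how a docstring can be retrieved at run time, the example below repeats the function from above and reads its docstring with inspect.getdoc from the standard library. The mass and volume values are made up for illustration.

```python
import inspect


def calc_bulk_density(mass, volume):
    """Return dry bulk density = powder mass / powder volume."""
    return mass / volume


# help(calc_bulk_density) prints the signature and docstring interactively.
# inspect.getdoc returns the cleaned docstring as a plain string.
docstring = inspect.getdoc(calc_bulk_density)
print(docstring)

# Illustrative values only (not real measurements).
density = calc_bulk_density(mass=10.0, volume=4.0)
print(density)
```

An in-line comment inside the function body would not be visible this way, which is one reason to prefer a docstring for describing what a function does.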

In-line comments

# bulk density is the powder mass / powder volume
density = mass / volume 

Markdown in a Jupyter Notebook

For more extensive discussion you can combine code and text in a single document. See this tutorial for more information about using Markdown in a Jupyter Notebook.

2. Focus on readability

Your code should be easily readable by others. This is a big topic! The PEP 8 Style Guide for Python Code has further guidance, although it is a daunting document. The most important thing is that you are consistent within your own code.

Consistency is key

Code formatting (for example, brackets) and use of whitespace should be consistent. For example, do not mix-and-match whitespace as in the code below:

spam(ham[1], {eggs: 2})   
spam( ham[ 1 ], { eggs: 2} )

You should also avoid mixing data types where possible. For example, using a 2-dimensional NumPy array and a 1-dimensional NumPy array within a simulation would usually be better than using a 2-dimensional NumPy array and a 1-dimensional Python list.

Variable and function names

Use clear, meaningful variable and function names. Don't just use x or p and expect the reader to know what they mean! For example, angular_velocity is a better variable name than omega.
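The short comparison below is only a sketch of the idea. The quantities, names and values are hypothetical and chosen just to show how descriptive names remove the guesswork.

```python
# Hard to follow: the reader has to guess what m, v and p stand for.
m = 2.0
v = 3.0
p = m * v

# Easier to follow: each name states the quantity and its unit.
mass_kg = 2.0
velocity_m_per_s = 3.0
momentum_kg_m_per_s = mass_kg * velocity_m_per_s
```

Both versions compute the same number, but only the second can be understood without extra explanation.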

Clear code structure

Import all of the libraries you use at the top of your code. Also define any constants that will not change during your simulation (for example, the radius of the Earth) at the top of your code, directly after the imports.
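One possible layout is sketched below: imports first, then constants, then functions. The constant name EARTH_RADIUS_M and the function earth_circumference_m are hypothetical names used for illustration. Writing constants in upper case follows a PEP 8 convention.

```python
# Imports first.
import math

# Constants next, written in upper case.
EARTH_RADIUS_M = 6.371e6  # approximate mean radius of the Earth in metres


def earth_circumference_m():
    """Return the circumference of a sphere with the Earth's mean radius."""
    return 2 * math.pi * EARTH_RADIUS_M


print(f"{earth_circumference_m():.3e} m")
```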

Use Markdown to write section headings in a Jupyter Notebook. You can also use blank lines to split code into logical blocks. Long lines of code can be split using a \ at the end of the line(s), although PEP 8 prefers wrapping a long expression in parentheses where that is possible. For example:

print("this is a really really long line of code \
that I'd like split over two lines")
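As an alternative to the backslash, the sketch below uses implied line continuation inside parentheses, which is the approach PEP 8 favours. The variable names message and total are hypothetical.

```python
# Adjacent string literals inside parentheses are joined into one string,
# so no backslash is needed.
message = (
    "this is a really really long line of code "
    "that I'd like split over two lines"
)
print(message)

# The same idea works for long expressions.
total = (
    1
    + 2
    + 3
)
```

Note the trailing space inside the first string literal: the two pieces are joined exactly as written, with nothing inserted between them.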

3. Avoid duplication

Duplication of code should be avoided where possible. There are several ways this can be achieved.

Write functions

If you will re-use a block of code multiple times, consider encapsulating it in a function. See this tutorial for information about writing functions.
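As a small illustration, the sketch below writes the bulk density calculation once as a function and then re-uses it for several samples. The sample values are hypothetical.

```python
def calc_bulk_density(mass, volume):
    """Return dry bulk density = powder mass / powder volume."""
    return mass / volume


# Hypothetical (mass, volume) measurements for three samples.
samples = [(10.0, 4.0), (12.0, 6.0), (9.0, 3.0)]

# The formula lives in one place, so a fix or change is made only once.
densities = [calc_bulk_density(mass, volume) for mass, volume in samples]
print(densities)
```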

Use external libraries

Use appropriate functions and data types, including those from external libraries. For example, if you need to perform mathematical operations on an array of values, use NumPy arrays instead of Python lists.
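The sketch below compares the two approaches for an element-wise division. It assumes NumPy is installed, and the data values are hypothetical.

```python
import numpy as np

# Hypothetical sample data stored in plain Python lists.
mass_list = [10.0, 12.0, 9.0]
volume_list = [4.0, 6.0, 3.0]

# With lists, the division has to be written out element by element.
densities_from_lists = [m / v for m, v in zip(mass_list, volume_list)]

# With NumPy arrays, the operation applies to every element at once.
masses = np.array(mass_list)
volumes = np.array(volume_list)
densities = masses / volumes
print(densities)
```

The NumPy version is shorter, reads more like the underlying formula, and is typically faster for large arrays.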

Use control structures when appropriate

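Use control structures appropriately. Only use if statements and while or for loops when they are necessary; a built-in function or a library routine can often do the same job with less code.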
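As one illustration of this point, the sketch below adds up a list of values with an explicit loop and then with Python's built-in sum function. The readings are hypothetical values.

```python
readings = [3.2, 4.8, 1.5, 6.1]  # hypothetical data

# Explicit loop: correct, but more code than the task needs.
total = 0.0
for value in readings:
    total += value

# Built-in functions express the same intent in one line each.
total_builtin = sum(readings)
largest = max(readings)
```

A loop is still the right tool when each step depends on the result of the previous one, as in many time-stepping simulations.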

4. Think about reproducibility

Writing reproducible code is difficult. In fact, there are many interesting initiatives designed to improve reproducibility in the computational sciences, such as Reprohacks.

One straightforward thing you can do is print the version number for each package you import using print(packagename.__version__).
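Not every package defines __version__, so the sketch below shows an alternative based on importlib.metadata from the standard library (Python 3.8 or later). The PACKAGES list and the get_versions helper are hypothetical. Note that importlib.metadata expects the name a package was installed under, which can differ from the name you import.

```python
import sys
from importlib import metadata

# Hypothetical list: replace with the packages your own code imports.
PACKAGES = ["numpy", "matplotlib"]


def get_versions(package_names):
    """Return a dict mapping each package name to its installed version."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages instead of stopping with an error.
            versions[name] = "not installed"
    return versions


print("python", sys.version.split()[0])
for name, version in get_versions(PACKAGES).items():
    print(name, version)
```

Running this at the top of a notebook records the environment alongside the results it produced.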