Making Packages
The tstools
package
We have started the process of turning the initial script into a package in the
analysis1
directory. The structure of this directory is the following:analysis1/ tstools/ moments.py vis.py data/ brownian.csv tests.py
The directory
tstools
is our Python package, it contains two files
moments.py
and vis.py
that we will use to implement the functions we need
for our analysis. The test data is in the data
directory and the tests.py
file contains the tests we will use to check that our functions are working as
expected.It is possible to import functions from the specific modules containing them:
import numpy as np import tstools.moments from tstools.vis import plot_histogram timeseries = np.genfromtxt("./data/brownian.csv", delimiter=",") mean, var = tstools.moments.get_mean_and_var(timeseries) fig, ax = tstools.vis.plot_histogram(timeseries)
Let's try to import the package as a whole:
# compute-mean.py import numpy as np import tstools timeseries = np.genfromtxt("./data/brownian.csv", delimiter=",") mean = tstools.moments.get_mean_and_var(timeseries)
$ python compute_mean.py Traceback (most recent call last): File "<stdin>", line 4, in <module> AttributeError: module 'tstools' has no attribute 'moments'
What went wrong?
In the following section we add a couple of
import
statements in
the __init__.py
so that all our functions (in both modules) are
available under the single namespae tstools
.init dot pie
Whenever you import a directory, Python will look for a file
__init__.py
at the root of this
directory, and, if found, will import it.
It is the presence of this initialization file that truly makes the tstools
directory a Python
package.Since Python 3.3, this isn't technically true. Directories without a init.py
file are called namespace packages, see Packaging namespace packages on the
Python Packaging User Guide). However, their discussion is beyond the scope of
this course.
As a first example, let's add the following code to
__init__.py
:# tstools/__init__.py from os.path import basename filename = basename(__file__) print(f"Hello from {filename}")
If we now import the
tstools
package:import tstools print(tstools.filename)
Hello from __init__.py __init__.py
The lesson here is that any object (variable, function, class) defined in the
__init__.py
file is available under the package's namespace.Bringing all functions under a single namespace
Our package isn't very big, and the internal strucure with 2 different modules isn't
very relevant for a user.
Write the
__init__.py
so that all functions defined in
modules vis.py
and moments.py
are accessible directly
under the tstools
namespace, that isimport tstools # instead of mean, var = tstools.moments.get_mean_and_var(...) mean, var = tstools.get_mean_and_var(timeseries) # instead of fig, ax = tstools.vis.plot_histogram(...) fig, ax = tstools.plot_histogram(timeseries, 4*np.sqrt(var))
By default python looks for modules in the current directory
and some other locations (more about that later). When using
import
,
you can refer to modules in the current package using the dot notation:# import something from module that resides # in the current package (next to the __init__.py) from .module import something
Using the package
Our package is now ready to be used in our analysis, and an analysis scripts could look like this:
# analysis1/analysis1.py import numpy as np import matplotlib.pyplot as plt import tstools timeseries = np.genfromtxt("./data/brownian.csv", delimiter=",") mean, var = tstools.get_mean_and_var(timeseries) fig, ax = tstools.plot_histogram(timeseries, nbins=100)
Note that the above does the exact same amount of work job as
initial_scripts/analysis.py
... but is much shorter and easier to read!