1 Abstract
The notebook-based test report system provides a way for LSST to generate and publish data-driven reports with an automated system. This technote describes the technical design behind the notebook-based report system.
For usage information, see the nbreport documentation at https://nbreport.lsst.io.
2 Requirements
The notebook-based report system is driven by these informally-defined design requirements:
- A document handle corresponds to a templated Jupyter Notebook.
- Notebook instances are generated from templates.
- Each notebook instance has a unique serial number so that a notebook instance can be identified universally from a combination of document handle and instance serial number.
- Once a notebook instance is generated, it is runnable. Notebooks may only be runnable in certain environments where data is available, such as the LSST Science Platform.
- Generated notebook instances are published to the web with LSST the Docs.
- The process of creating a notebook instance, running the notebook, and publishing the notebook instance is completely automated. Manual intervention is only required to develop the notebook template for the document and to set up the automations to run the report generation workflow.
- The notebook-based report system should share as much infrastructure as possible with notebook-based technical notes.
3 Report generation sequence
This section traces the generation of a notebook report instance. In doing so, this section identifies the key components of the notebook report system and their designed functionality.
3.1 Compute platform phase
Notebook instances are generated on a compute platform that provides access to software libraries and datasets. The LSST Science Platform is envisioned as the baseline compute platform.
Running on the compute platform, the generation of a report instance is coordinated by the nbreport command-line tool.
nbreport is a command-line tool so that it can be universally scripted.
The all-in-one command for generating and publishing a notebook report is nbreport issue.
This section describes the nbreport issue command.
The nbreport issue command begins by cloning the requested document repository.
A document repository contains a templated Jupyter notebook.
The template language is Jinja, compatible with Cookiecutter.
Cookiecutter and Jinja have already been adopted by LSST DM for the lsst/templates repository.
The report’s notebook is templated so that both code and documentation content can be updated.
For example, the Python code block for a Butler get can be templated so that the report corresponds to a specified dataset.
Before the template is rendered, the report instance is registered with the nbreport API service (POST api.lsst.codes/nbreport/documents/<handle>/instances/).
Doing so provides a unique, monotonically increasing ID for the report instance.
Registering the report instance now allows the report’s instance ID to be used as a template variable.
This registration step could also be used to preallocate a DOI.
Next, nbreport issue renders a Jupyter notebook instance using the Jinja/Cookiecutter API.
Besides the instance ID, nbreport issue also gathers template variables from command-line arguments.
nbreport issue then executes the report instance programmatically using nbconvert’s ExecutePreprocessor class.
Doing so allows nbreport to work as a “headless” service that doesn’t need to open a browser window to execute a Jupyter Notebook.
The executed notebook is saved as an ipynb file.
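The headless execution step can be sketched with the public nbformat and nbconvert APIs. This is a minimal illustration, not the nbreport implementation: the function name execute_notebook, the timeout default, and the python3 kernel name are assumptions made for the example.

```python
from pathlib import Path


def execute_notebook(src: Path, dest: Path, timeout: int = 600) -> None:
    """Run every cell of the notebook at ``src`` and save the result to ``dest``."""
    # Imported inside the function so that this module can still be loaded
    # where nbformat and nbconvert are not installed.
    import nbformat
    from nbconvert.preprocessors import ExecutePreprocessor

    notebook = nbformat.read(str(src), as_version=4)

    # Assumption: a kernel named "python3" exists on the compute platform.
    executor = ExecutePreprocessor(timeout=timeout, kernel_name="python3")

    # Execute with the notebook's own directory as the working directory so
    # that relative paths inside the notebook resolve there.
    executor.preprocess(notebook, {"metadata": {"path": str(src.parent)}})

    # The executed notebook, including cell outputs, is written as ipynb.
    nbformat.write(notebook, str(dest))
```

No browser or notebook server is involved: the preprocessor starts a kernel, runs the cells in order, and stores the outputs in the notebook object.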
Finally, nbreport issue uploads that ipynb file to the nbreport web service (POST api.lsst.codes/nbreport/documents/<handle>/instances/<id>).
All interactions with the web service are authenticated with GitHub-based OAuth, and authorized by GitHub organization membership.
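The two service calls in this phase, registration and upload, can be sketched with the standard library. The endpoint paths come from the text above; the https scheme, the function names, and sending the notebook as the raw request body are assumptions. Authentication is left out because the exact token exchange is not specified here. The requests are only constructed, not sent.

```python
from urllib.request import Request

# Assumption: the service is reached over HTTPS at this base URL.
API_BASE = "https://api.lsst.codes/nbreport"


def registration_request(handle: str) -> Request:
    """Build the POST that reserves a new instance ID for a report."""
    return Request(f"{API_BASE}/documents/{handle}/instances/", method="POST")


def upload_request(handle: str, instance_id: int, ipynb: bytes) -> Request:
    """Build the POST that uploads the executed notebook for one instance."""
    return Request(
        f"{API_BASE}/documents/{handle}/instances/{instance_id}",
        data=ipynb,  # assumption: the notebook is sent as the request body
        method="POST",
    )


request = registration_request("DMQA-001")
print(request.get_method(), request.full_url)
```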
3.2 Web service-based publishing phase
Once the api.lsst.codes/nbreport web service receives the ipynb file for the report instance, it converts that ipynb file into an LSST-branded HTML page using nbconvert.
This report web page is similar in appearance to notebook-based technical notes.
The web service uploads the HTML page for the report, in addition to the source ipynb file, to LSST the Docs.
The notebook-based report system uses LSST the Docs’s editions feature to render each report instance at separate /v/<id> paths.
The api.lsst.codes/nbreport web service also generates an index page of all report instances.
This index page is displayed at the report’s main URL.
For example, if the report’s handle is DMQA-001, the main URL is https://dmqa-001.lsst.io.
The URL for an individual report instance with an ID of 1 is then https://dmqa-001.lsst.io/v/1.
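The URL scheme in this example can be written as two small helper functions. The function names are hypothetical, and deriving the subdomain by lowercasing the handle is an assumption that matches the DMQA-001 example but may not hold for every report.

```python
def report_url(handle: str) -> str:
    """Main URL of a report, where the index page is displayed."""
    # Assumption: the subdomain is the lowercased document handle.
    return f"https://{handle.lower()}.lsst.io"


def instance_url(handle: str, instance_id: int) -> str:
    """URL of one report instance, published as an edition under /v/<id>."""
    return f"{report_url(handle)}/v/{instance_id}"


print(report_url("DMQA-001"))       # https://dmqa-001.lsst.io
print(instance_url("DMQA-001", 1))  # https://dmqa-001.lsst.io/v/1
```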
At its most basic, the index page provides a chronological listing of reports. The index page may also be developed to enable filtering and search of reports based on template variables.
4 The GitHub repository of a report
Each notebook-based report has its own GitHub repository. In the notebook-based report system, the GitHub repository for a report contains the source to generate a report instance. The GitHub repository does not contain the instances themselves; those are published and archived with LSST the Docs.
The GitHub repository for a report has several standardized files.
For illustration, the repository for a report with a handle DMQA-001 is laid out like this:
DMQA-001/
├── cookiecutter.json
├── DMQA-001.ipynb
├── nbreport.yaml
└── README.rst
The following sections describe each file.
4.1 cookiecutter.json
The cookiecutter.json file, adopted from the Cookiecutter project, establishes the template context.
This file defines both the template variables that are expected in the report notebook and their default values.
A basic cookiecutter.json file that defines keys cookiecutter.a and cookiecutter.b with default values of 0 and 1, respectively, looks like this:
{
  "a": 0,
  "b": 1
}
Then a cell in the DMQA-001.ipynb notebook file can use those template variables using standard Jinja syntax:
answer = {{ cookiecutter.a }} + {{ cookiecutter.b }}
That cell would be rendered, if using the default values, as:
answer = 0 + 1
4.2 DMQA-001.ipynb
This file, named after the report’s handle, is a Jupyter notebook. This notebook contains the code and prose that, when executed, becomes a report instance.
This ipynb file must be committed into the GitHub repository in an unexecuted state, without outputs.
The source of each cell in the ipynb file is treated as a Jinja template (see 5 Templating of the report notebook).
4.3 nbreport.yaml
This file provides configuration for the report within the LSST notebook-based report system.
For the DMQA-001 example report, this file looks like:
handle: DMQA-001
ltd_product: dmqa-001
repository: https://github.com/lsst/DMQA-001
published_url: https://dmqa-001.lsst.io
ipynb: DMQA-001.ipynb
4.4 README.rst
The README file describes the report, for users on GitHub.
Note
The report repository should contain a description that can be published on the report’s homepage.
This description could either come from the README or from the nbreport.yaml file.
5 Templating of the report notebook
The Jupyter notebook in a report’s GitHub repository is templated so that it can be customized on-demand when the report is generated.
5.1 Use of Cookiecutter and Jinja
Jinja is the templating format. The notebook-based report system uses Cookiecutter as a convenient wrapper around Jinja. Although the notebook-based report system does not use Cookiecutter for its true purpose of instantiating entire file trees, Cookiecutter has these capabilities that can be adapted into the notebook-based report system:
- The cookiecutter.json file is useful for defining the full set of template variables, along with their types and defaults. cookiecutter.json is also useful for creating secondary variables based on prior variables.
- Cookiecutter has a mechanism for running pre- and postprocessing hooks, if necessary.
- Cookiecutter has a useful way of registering Jinja extensions.
Cookiecutter’s own command-line interface is not used by the notebook-based report system. Instead, the nbreport command-line client invokes Cookiecutter’s Python APIs, which enables cell-wise templating, as described next. LSST already uses Cookiecutter this way in the lsst/templates project.
5.2 Cell source templating
Rather than interpreting the entire notebook file as a Jinja template, the notebook-based report system is designed so that the source of individual cells is processed as a Jinja template.
This distinction is key because it ensures that the notebook file (ipynb format) can always be opened, displayed, and authored in the Jupyter notebook viewer or JupyterLab.
Notebook authors simply mark up the Markdown and Python cells with Jinja formatting.
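Cell-wise templating can be sketched by treating the notebook as parsed JSON and rendering only the source of each cell. This is an illustration under stated assumptions, not the nbreport code: render_cells and jinja_renderer are hypothetical names, and the sketch calls Jinja directly rather than going through Cookiecutter’s Python API.

```python
import copy
from typing import Callable


def render_cells(notebook: dict, render: Callable[[str], str]) -> dict:
    """Return a copy of ``notebook`` with ``render`` applied to each cell source."""
    rendered = copy.deepcopy(notebook)
    for cell in rendered.get("cells", []):
        source = cell.get("source", "")
        # The ipynb format stores source as a string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        cell["source"] = render(source)
    return rendered


def jinja_renderer(context: dict) -> Callable[[str], str]:
    """Build a renderer that treats each cell source as a Jinja template."""
    # Imported lazily; requires the jinja2 package to be installed.
    from jinja2 import Environment

    env = Environment(keep_trailing_newline=True)
    # Variables are exposed under the "cookiecutter" name, as in the
    # {{ cookiecutter.a }} syntax used by report notebooks.
    return lambda source: env.from_string(source).render(cookiecutter=context)
```

With Jinja installed, render_cells(notebook, jinja_renderer({"a": 0, "b": 1})) renders every cell while leaving the notebook metadata and JSON structure untouched, which is why the template itself remains a valid ipynb file.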
6 nbreport command-line interface
nbreport provides a command-line interface that can be used directly, or through automated scripting. nbreport uses the subcommand pattern so that several atomic commands are encapsulated in the same executable. The CLI itself is implemented with Click. This section describes the basic design of this CLI.
6.1 nbreport clone
This command clones a report’s repository from GitHub to the local filesystem.
Example:
nbreport clone https://github.com/lsst/DMQA-001
6.2 nbreport init
This command initializes a report instance.
Example:
nbreport init DMQA-001
Alternative example that also clones the report repository:
nbreport init https://github.com/lsst/DMQA-001
This command does the following:
- Reserves an instance ID for the report. Instance IDs are managed by the api.lsst.codes/nbreport service.
- Creates a directory named after the report instance. For example, if the report is DMQA-001 and the reserved ID is 1, then the report instance is named DMQA-001-1. This report instance directory (DMQA-001-1) is where all notebook computations are carried out. By giving each notebook an isolated directory, the system allows notebooks to create intermediate files in the current working directory without any concern of colliding with other report instances.
- Copies the nbreport.yaml file into the instance’s directory. In addition, a field named instance_id is added to the nbreport.yaml file. This allows the nbreport tool to concretely identify the report and its instance.
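The local part of these steps can be sketched as follows. The instance ID is passed in as an argument because reserving it requires the web service. The function name init_instance is hypothetical, and appending a line to nbreport.yaml is a shortcut that only works because the file is assumed to be a flat key-value mapping.

```python
import tempfile
from pathlib import Path


def init_instance(repo_dir: Path, handle: str, instance_id: int, parent: Path) -> Path:
    """Create the instance directory and copy nbreport.yaml into it."""
    instance_dir = parent / f"{handle}-{instance_id}"
    # A freshly reserved ID should never map to an existing directory.
    instance_dir.mkdir(parents=True, exist_ok=False)

    config_text = (repo_dir / "nbreport.yaml").read_text()
    if not config_text.endswith("\n"):
        config_text += "\n"
    # Record which instance this directory belongs to.
    config_text += f"instance_id: {instance_id}\n"
    (instance_dir / "nbreport.yaml").write_text(config_text)
    return instance_dir


# Demonstration in a temporary directory with a one-line config file.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp) / "DMQA-001"
    repo.mkdir()
    (repo / "nbreport.yaml").write_text("handle: DMQA-001\n")
    instance = init_instance(repo, "DMQA-001", 1, Path(tmp))
    instance_name = instance.name
    instance_config = (instance / "nbreport.yaml").read_text()

print(instance_name)  # DMQA-001-1
```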
6.3 nbreport render
This command renders the report template from the report repository into a Jupyter Notebook in the instance directory.
Example:
nbreport render DMQA-001-1 -c dataRef=xyz -c paramX=2.9
The -c options are context overrides, that is, values that replace the template variable defaults set in cookiecutter.json.
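How context overrides combine with the defaults can be sketched in a few lines. The helper names are hypothetical, and the sketch keeps override values as strings; whether nbreport coerces them to the type of the default is not specified here.

```python
import json


def parse_override(option: str) -> tuple[str, str]:
    """Split one ``KEY=VALUE`` override, as passed with ``-c``."""
    key, separator, value = option.partition("=")
    if not separator or not key:
        raise ValueError(f"expected KEY=VALUE, got {option!r}")
    return key, value


def build_context(defaults_json: str, overrides: list[str]) -> dict:
    """Start from the cookiecutter.json defaults and apply the overrides."""
    context = json.loads(defaults_json)
    context.update(parse_override(option) for option in overrides)
    return context


defaults = '{"dataRef": "abc", "paramX": 1.0}'
context = build_context(defaults, ["dataRef=xyz", "paramX=2.9"])
print(context)  # {'dataRef': 'xyz', 'paramX': '2.9'}
```

Variables that are not overridden keep the defaults from cookiecutter.json, so a render with no -c options uses the defaults throughout.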
6.4 nbreport compute
This command computes the Jupyter notebook in the report instance. It does so in a “headless” manner, without opening a browser window.
nbreport compute DMQA-001-1
6.5 nbreport upload
This command uploads the report instance to the api.lsst.codes/nbreport service, which then publishes the report with LSST the Docs.
nbreport upload DMQA-001-1
Authentication for this command comes from a GitHub token.
6.6 nbreport issue
This all-in-one command renders, computes, and uploads a report instance. This command is useful for automated environments.
Example:
nbreport issue https://github.com/lsst/DMQA-001 -c dataRef=xyz
This command carries out the following steps:
- Clones the report repository (like 6.1 nbreport clone).
- Reserves the report instance number (like 6.2 nbreport init).
- Renders the notebook given the provided context variables (like 6.3 nbreport render).
- Computes the notebook (like 6.4 nbreport compute).
- Uploads the computed notebook (like 6.5 nbreport upload).
6.7 nbreport test
The nbreport test command provides a workflow for testing the executability of notebook templates during development.
This command works on an already-cloned report repository (the most common case in development).
Example:
nbreport test DMQA-001
This command does the following:
- Creates a test instance directory (DMQA-001-test, by default), and clears any pre-existing test instance.
- Renders the notebook instance using default values from cookiecutter.json (by default). It is also possible to pass context overrides on the command line.
- Executes and saves the notebook instance for inspection.
The advantage of this workflow for testing is that it creates a local test instance, rather than registering an instance with api.lsst.codes/nbreport.
6.8 nbreport reproduce
The nbreport reproduce command is used to verify that a report is reproducible.
“Reproducible” in this context means that a published report instance can be re-generated, given the same template context variables, without meaningful variations in the output cells.
Example:
nbreport reproduce https://dmqa-001.lsst.io/v/1
In this example, nbreport reproduce attempts to regenerate the DMQA-001-1 report instance published at dmqa-001.lsst.io/v/1.
Note
This command could be implemented by adapting the nbval package.
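Independently of nbval, the core comparison can be sketched on parsed ipynb files. This sketch compares only the text outputs of code cells and uses exact equality as a stand-in for “no meaningful variations”; a real check would likely need tolerances or sanitization for timestamps, memory addresses, and similar values. The function names are hypothetical.

```python
import copy


def text_outputs(notebook: dict) -> list[list[str]]:
    """Collect the stream and text/plain outputs of each code cell."""
    results = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        texts = []
        for output in cell.get("outputs", []):
            if output.get("output_type") == "stream":
                text = output.get("text", "")
            else:
                text = output.get("data", {}).get("text/plain", "")
            # Multi-line text may be stored as a list of lines.
            texts.append("".join(text) if isinstance(text, list) else text)
        results.append(texts)
    return results


def differing_cells(published: dict, regenerated: dict) -> list[int]:
    """Return the indices (counting code cells only) whose outputs differ."""
    old, new = text_outputs(published), text_outputs(regenerated)
    if len(old) != len(new):
        raise ValueError("notebooks have a different number of code cells")
    return [i for i, (a, b) in enumerate(zip(old, new)) if a != b]


published = {"cells": [
    {"cell_type": "markdown", "source": "Intro"},
    {"cell_type": "code", "outputs": [
        {"output_type": "stream", "name": "stdout", "text": ["1\n"]}]},
    {"cell_type": "code", "outputs": [
        {"output_type": "execute_result", "data": {"text/plain": "2.9"}}]},
]}
regenerated = copy.deepcopy(published)
regenerated["cells"][2]["outputs"][0]["data"]["text/plain"] = "3.0"

print(differing_cells(published, published))    # []
print(differing_cells(published, regenerated))  # [1]
```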
7 Further reading
These are links to related reports, user documentation, or software repositories:
- nbreport documentation
- SQR-026: DMS end-of-night report describes the motivating use-case for nbreport.
- TESTR-001: Characterization Metric Report Demo is an example notebook repository on GitHub. An example report instance generated from it is TESTR-001-1: Characterization Metric Report Demo on LSST the Docs.
- lsst-sqre/nbreport is the GitHub repository for the command-line app.
- lsst-sqre/sqre-uservice-nbreport is the backend microservice for the nbreport system.