Jupyter Publishing Guide: From Embedded to Book


Jupyter Notebooks are the lingua franca of Data Science. There is hardly a data scientist who has never used them. However, in contrast to their popularity and usefulness, the process of sharing a Jupyter Notebook is not straightforward. This article presents a brief survey of the different publishing solutions available for personal and/or commercial projects.

Jupyter Notebooks

Jupyter notebooks are used everywhere in the Data Science ecosystem, from Coursera courses to projects by FAANG companies. It is natural, then, to ask how the results can be conveniently published once the work is done. The answer may not be satisfactory: "There is no well-known, commonly used, standard way to publish Jupyter Notebooks". Instead, several solutions coexist, each targeting a different persona.

In this post, a showcase of solutions for the different use cases will be presented to the reader so that one can find the right tool for the job.

Broadly, the solutions covered here fall into four groups: built-in exports, read-only solutions, executable solutions, and printable solutions.

All solutions here assume the only hardware and software available are the ones used to work with the Jupyter Notebooks, i.e. no server or backend. Moreover, every solution presented is either open-source or free to use, both for personal and commercial use cases.

This post follows a cookbook approach, so feel free to jump ahead to the section of interest. The sections are all self-contained. Each solution lists its advantages and disadvantages and includes a "when to use" section.


(Optional) A bit of Context: The .ipynb format


The format for a Jupyter notebook is JSON, but the extension used is ipynb. The reason behind using a different extension is twofold:

  1. Not every JSON is a valid .ipynb.
  2. The file, although plain text, is not meant to be modified manually.

Notebooks are most commonly edited through Jupyter Notebook and JupyterLab, and it is these tools that write the modifications into the underlying JSON file. That ensures the file is always valid. Modifying the JSON manually could make the whole notebook unreadable, and such errors are difficult both to find and to fix.

The official documentation for this format is provided by the nbformat Python package.
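To make the two reasons above concrete, the sketch below builds a minimal notebook by hand and checks its rough shape using only the standard library. The cell contents and the helper `looks_like_notebook` are made up for illustration, and the check is far weaker than the validation offered by nbformat itself.

```python
import json

# A hand-written, minimal notebook following the nbformat 4 layout.
# The cell contents are invented for this example.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Title\n", "Some *formatted* text."],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 1,
            "source": ["print(1 + 1)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["2\n"]}
            ],
        },
    ],
}


def looks_like_notebook(text):
    """Rough structural check (hypothetical helper, not the nbformat validator)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return False
    return (
        isinstance(data, dict)
        and isinstance(data.get("cells"), list)
        and isinstance(data.get("nbformat"), int)
        and all(isinstance(c, dict) and "cell_type" in c for c in data["cells"])
    )


serialized = json.dumps(notebook, indent=1)
print(looks_like_notebook(serialized))        # a notebook is valid JSON
print(looks_like_notebook('{"answer": 42}'))  # but not every JSON is a notebook
```

Even in this tiny case the nesting of cells, sources, and outputs is easy to break by hand, which is why the editing tools are the safer route.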

Problems with the format

This JSON-based format is also the main cause of trouble when working in teams, in particular for collaborative editing and version control. That is why tools such as Yjs and nbdime were developed to solve those issues, respectively.
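The version control problem can be illustrated with a small, self-contained sketch. It assumes a hypothetical one-cell notebook that is executed twice: the code is identical, yet a plain text diff of the JSON still reports changes because the execution counter and the stored output differ. This is only an illustration of the symptom, not how nbdime works internally.

```python
import copy
import difflib
import json

# Hypothetical notebook with a single code cell and its stored output.
before = {
    "nbformat": 4, "nbformat_minor": 4, "metadata": {},
    "cells": [{
        "cell_type": "code", "metadata": {}, "execution_count": 1,
        "source": ["import random\n", "random.random()"],
        "outputs": [{"output_type": "execute_result", "execution_count": 1,
                     "metadata": {}, "data": {"text/plain": ["0.1374"]}}],
    }],
}

# Re-running the same cell: the code is untouched, but the counter and
# the stored output both change.
after = copy.deepcopy(before)
cell = after["cells"][0]
cell["execution_count"] = 2
cell["outputs"][0]["execution_count"] = 2
cell["outputs"][0]["data"]["text/plain"] = ["0.8429"]


def as_lines(nb):
    """Serialize a notebook dict the way it would be stored on disk."""
    return json.dumps(nb, indent=1, sort_keys=True).splitlines()


diff = list(difflib.unified_diff(as_lines(before), as_lines(after),
                                 "before.ipynb", "after.ipynb", lineterm=""))
# Keep only added/removed lines, skipping the two file header lines.
changed = [line for line in diff
           if line.startswith(("+", "-"))
           and not line.startswith(("+++", "---"))]

same_code = before["cells"][0]["source"] == after["cells"][0]["source"]
print("code unchanged:", same_code)
print("changed lines in the text diff:", len(changed))
print("\n".join(changed))
```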

Developer Exchange

All that said, among developers, .json files are just ordinary everyday files, and exchanging them is nothing new. On the other hand, even if those files are easy to send, a whole setup (Python + Jupyter) is required to properly visualize their content. That is why the first and simplest way to share and publish a notebook is through the built-in Export solutions.

Using Built-Ins: Export Solutions


Jupyter, in particular JupyterLab, offers several formats for exporting a notebook. This section highlights one of them: HTML.

Advantages

Disadvantages

When to use

This solution is the best when the other person, for some reason, will not run the notebook. For example, when sending sample code to a fellow developer, for documentation examples, and also for simple web hosting.

Some examples of this approach can be found in the PyMC3 official documentation, where all their examples are HTML exports of Jupyter Notebooks.

Why not other export formats?

Some may argue that PDF is also a good choice. However, due to pagination, the content and the output are generally split in unpredictable ways, which makes them harder for the reader to follow, as Jupyter Notebooks were not conceived to be read in separate pages.

Asciidoc, ReStructuredText, Markdown and LaTeX require special readers and assume that the other person also knows the format. Moreover, they are not easy to display on mobile devices without additional software.

Executable Scripts (.py) are plain text and hence sacrifice all the effort put into formatting and styling the markdown cells. And if that effort did not matter, why bother having a Jupyter Notebook at all?

Finally, Reveal.js is meant for presentations, and it is certainly a useful option for that purpose, but it often shows only a subset of the whole notebook and is meant for the authors themselves rather than for third parties.
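The same exports are also available from the command line through `jupyter nbconvert`. The sketch below only assembles the command for the formats discussed above and runs it when Jupyter and the file are actually present. The notebook name `analysis.ipynb` and the helper `nbconvert_command` are assumptions made for illustration.

```python
import shlex
import shutil
import subprocess
from pathlib import Path

# Target names accepted by `jupyter nbconvert --to ...` for the formats above.
EXPORT_TARGETS = {"html", "pdf", "latex", "markdown", "rst",
                  "asciidoc", "script", "slides"}


def nbconvert_command(notebook_path, target="html"):
    """Build (but do not run) the nbconvert command line for one notebook."""
    if target not in EXPORT_TARGETS:
        raise ValueError(f"unsupported export target: {target!r}")
    return ["jupyter", "nbconvert", "--to", target, notebook_path]


command = nbconvert_command("analysis.ipynb")
print(shlex.join(command))  # jupyter nbconvert --to html analysis.ipynb

# Only attempt the conversion when Jupyter and the (hypothetical) file exist.
if shutil.which("jupyter") and Path("analysis.ipynb").exists():
    subprocess.run(command, check=True)
```

Here `slides` corresponds to the Reveal.js export and `script` to the executable script export.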

Read-Only Solutions

Even though the last section covered HTML, which is also read-only, this section will focus on approaches that are not built in. All of the solutions presented require some configuration or additional steps, but the results might be worthwhile depending on the particular scenario.

GitHub Publishing


GitHub is the most widespread solution for hosted git version control repositories. This is also the case for Project Jupyter as a whole, whose repositories are hosted on this platform. Over time, GitHub added a feature that shows a Jupyter Notebook as HTML without requiring the user to do the conversion.

The view is the same as the one obtained with the HTML export, although there are some differences to keep in mind.

Advantages

Disadvantages

When to use

Ideal for people and teams already working with GitHub repositories. If not, other options like Google Colab or NBViewer might be better. Another possible scenario is when old versions of particular notebooks need to be revisited, which GitHub makes easy and intuitive. Nonetheless, if cell execution or interactive plots are necessary, this option is not suitable.

Most GitHub repositories with Jupyter Notebooks can be used as examples (there were more than 66,000 at the time of this writing). One popular instance is the repository with Python implementations of the algorithms from Pattern Recognition and Machine Learning by Christopher Bishop.

NBViewer Publishing


NBviewer is a free online tool from Project Jupyter to display notebooks online. It combines two Python packages: NBconvert for notebook transformations and Tornado as the web service.

On this platform, notebooks are presented in HTML, the same way as on GitHub, but the User Interface is much more minimalistic and it is not tightly integrated with any version control system.
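For a notebook that already lives in a public GitHub repository, an NBviewer link can be derived from the repository coordinates. The sketch below assumes the `nbviewer.org/github/...` URL pattern used by the public NBviewer instance. The owner, repository, and path are placeholders, and `nbviewer_url` is a hypothetical helper.

```python
from urllib.parse import quote


def nbviewer_url(owner, repo, branch, path):
    """Build an NBviewer link for a notebook in a public GitHub repository."""
    # quote() keeps "/" intact and escapes characters such as spaces.
    return "https://nbviewer.org/github/" + quote(
        f"{owner}/{repo}/blob/{branch}/{path}"
    )


print(nbviewer_url("some-user", "some-repo", "main", "notebooks/demo.ipynb"))
# https://nbviewer.org/github/some-user/some-repo/blob/main/notebooks/demo.ipynb
```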

Advantages

Disadvantages

When to use

This solution is similar to GitHub's, but it is a better fit for non-developer users (e.g. mathematicians and statisticians may not have a background in version control). Furthermore, it is commonly used for writing static books, one of the most popular being Probabilistic-Programming-and-Bayesian-Methods-for-Hackers. As with other HTML-based approaches, if cell execution or interactive plots are necessary, this option is not suitable.

Static Site Generator Integration


Nowadays, Jamstack tools, i.e. Static Site Generators, are increasingly popular. That is why there are plenty of options available in the Python ecosystem (50 at the time of this writing). That being said, the most popular ones, based on GitHub stars, are Pelican, Lektor, and Nikola.

Of these three options, Nikola is the only one with native support for .ipynb files, whereas Pelican and Lektor require plugins. Once the plugins are installed, their use should be straightforward.

Another possibility is to use FastPages, a static site generator designed specifically for Jupyter Notebooks by the FastAI team. It is not as popular as the other three, but its popularity has been rising over the last couple of years.

Independently of the particular generator, the advantages and disadvantages are identical.

Note: This approach is not embedding Jupyter Notebooks in a static site, but rather using the .ipynb file itself as the source. Embedding notebooks is covered as one of the Executable solutions.

Advantages

Disadvantages

When to use

This approach is more suitable for individuals and/or organizations that want to keep and maintain a website/blog and at the same time showcase Jupyter notebooks. It requires intermediate knowledge of software engineering and may not be the most adequate solution for those coming from other fields.

Furthermore, it assumes a continuous stream of new content rather than simple, seldom-updated publications. None of the discoverability capabilities of the other solutions are included here, so additional work is required.

That being said, this is the only solution where the writer has full control over the implementation, the look and feel, and the UI in general, which makes it possible to use Analytics solutions as well as some interactivity.

Why not Sphinx, MkDocs, Cactus, or X?

The idea behind publishing a Jupyter Notebook is different from software documentation, which is the main focus for Sphinx and MkDocs. Cactus is another popular alternative but its last commit was in 2017, making it obsolete. Other static site generators are less known and at the time of this writing do not present any meaningful integration with Jupyter Notebooks.

A clarification should be made about Sphinx, since it is used as the back-end for another solution called Jupyter-Book, which will be covered below.

Jupyter-Book - Static

Jupyter-Book is a solution that can be used in static, executable, and printable scenarios. However, the advantages and disadvantages of each are slightly different, and the level of maturity is not the same either. Therefore, in this article, it is covered once for each category.

This tool has been migrated to the Python ecosystem (it used Ruby and Jekyll in the past), and although it is not extremely popular at the moment, the project is gaining traction and seems likely to become the new standard in the short term.

As an additional resource for this article, a template repository to get started with Jupyter-Book was prepared. It works for static, executable, and also printable scenarios.

Advantages

Disadvantages

When to use

Despite having "book" in the name, the actual layout is similar to modern software documentation, because the main theme is inspired by the pandas docs. If the content consists of several Jupyter Notebooks that can be ordered into sections, chapters, or a similar hierarchical structure, Jupyter-Book is extremely useful.

As an example, the data visualization library Altair used this tool to build their official tutorial.

Executable Solutions

In the previous sections, all solutions aimed at providing a static representation of the notebook, which is the most convenient way to share it. Nevertheless, in some use cases the content is required to be executable. For example, to reproduce results in a research review, or to show students what happens when the code changes in a pedagogical setting.

Executing the notebook means that the kernel has to run on a server: the input is sent to it and the output is fetched back. There are free servers that allow this, but they come with limitations or require additional configuration steps.

In this category, two backends will be shown: Google Colab and MyBinder. Four different ways to use them are presented, two direct and two indirect.

Important Note: The code always executes in a sandboxed environment, meaning that there is no risk in running insecure code for either the writer or the reader.

Jupyter Lite


Jupyter Lite is a young (first commit on 21 March 2021) yet popular tool (1.6k stars) developed by the Jupyter team. The tool is still unofficial as per its docs, but it has achieved something that had not been done before: Jupyter in the browser without a backend. That means a static server alone is enough to provide a whole Jupyter Notebook/Lab environment.

This works thanks to Pyodide, a project started at Mozilla that ships the scientific Python stack compiled to WebAssembly. A backend is therefore not needed, as everything happens in the front-end.

Advantages

Disadvantages

When to use

For proofs of concept without complex dependencies, and with a good internet connection and modern hardware, this might be a great choice. Use cases might be MOOCs or other events where participants need Jupyter and no dedicated server is provided.

However, it is not recommended when a stable and robust solution is needed.

Embedded Solution


Embedding a notebook means inserting some input/output cells into an existing web page (generally but not necessarily static). The aforementioned interaction with the back-end is taken care of by NBInteract, a Python package that was developed for this very purpose. In this blog, there is a brief tutorial explaining how to do this step by step.

Advantages

Disadvantages

When to use

This approach resembles the "Flash Application" or the "Java Applet" where a small piece of an external tool is embedded into a static website. In this case, that external tool is a Jupyter notebook.

One of the articles of this blog uses this approach extensively to showcase Ordinary Differential Equations with the aid of different ipywidgets.

Although it requires some setup, it is one of the least disruptive approaches for writers and readers alike.

Google Colaboratory


Google Colaboratory or simply Colab is a service provided for free by Google that comes with these main features:

One interesting feature is that any notebook in a public GitHub repository can be opened directly in Colab by changing the URL.
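The URL change amounts to swapping the GitHub prefix for the Colab one. The sketch below assumes the `colab.research.google.com/github/...` pattern. The repository in the example is a placeholder and `colab_url` is a hypothetical helper.

```python
GITHUB_PREFIX = "https://github.com/"
COLAB_PREFIX = "https://colab.research.google.com/github/"


def colab_url(github_notebook_url):
    """Turn the GitHub URL of a public notebook into an 'open in Colab' URL."""
    if not github_notebook_url.startswith(GITHUB_PREFIX):
        raise ValueError("expected a https://github.com/... notebook URL")
    return COLAB_PREFIX + github_notebook_url[len(GITHUB_PREFIX):]


print(colab_url(
    "https://github.com/some-user/some-repo/blob/main/notebooks/demo.ipynb"
))
# https://colab.research.google.com/github/some-user/some-repo/blob/main/notebooks/demo.ipynb
```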

Advantages

Disadvantages

When to use

The fact that Colab offers GPU support is a game-changer: no other service covered here does it for free. Some teams may choose this solution solely because of this, especially in deep learning applications.

My Binder


MyBinder or simply Binder is a free online service that creates sandbox environments to run Jupyter. It has support for both Jupyter Notebook and Jupyter Lab. However, it offers CPU-only processing capabilities.

Advantages

Disadvantages

When to use

MyBinder is a backbone service, and for a notebook writer it may not be that useful on its own, since other solutions like Jupyter-Book or the Embedded approach use it under the hood. However, it is helpful to know some details of its internals to see what is possible.

The typical use case for bare Binder is a repo with a particularly complicated setup of dependencies that cannot be used otherwise. For example, software requiring particular C libraries or certain operating system libraries pre-installed (e.g. FFMPEG, Libpostal, etc.).

A more concrete example is creating animations with matplotlib, which requires FFMPEG for some output formats. FFMPEG is not a Python library but software that must be installed at the operating system level. The article on Times Tables covers this precise problem.
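In practice, Binder reads its dependency list from plain text files in the repository: `apt.txt` for operating system packages and `requirements.txt` for Python packages. The sketch below writes both files for the FFMPEG case and builds the launch link. The repository coordinates are placeholders, and the two helper functions are hypothetical names introduced for this example.

```python
import tempfile
from pathlib import Path


def write_binder_config(repo_dir, apt_packages, pip_packages):
    """Write the dependency files Binder looks for at the root of a repo."""
    repo = Path(repo_dir)
    # One system package per line, installed at the operating system level.
    (repo / "apt.txt").write_text("\n".join(apt_packages) + "\n")
    # One Python package per line, installed with pip.
    (repo / "requirements.txt").write_text("\n".join(pip_packages) + "\n")


def binder_url(owner, repo, ref):
    """Launch link for a public GitHub repository on mybinder.org."""
    return f"https://mybinder.org/v2/gh/{owner}/{repo}/{ref}"


# A temporary directory stands in for a local clone of the repository.
with tempfile.TemporaryDirectory() as repo_dir:
    write_binder_config(repo_dir, ["ffmpeg"], ["matplotlib"])
    print((Path(repo_dir) / "apt.txt").read_text(), end="")

print(binder_url("some-user", "some-repo", "main"))
# https://mybinder.org/v2/gh/some-user/some-repo/main
```

Once both files are committed, anyone opening the launch link should get an environment where matplotlib can find FFMPEG, without installing anything locally.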

Jupyter-Book - Executable

As mentioned in the static section, Jupyter-Book can be used for static and executable content. In the case of executable content, it supports three different services: Google Colab, Binder, and Thebe.

Since Colab and Binder were already discussed, this section will focus on Thebe.

Thebe works like the embedded approach described above, with the difference that it is integrated with Jupyter-Book, since both projects belong to the larger Executable Books Project. It is not limited to Jupyter-Book, though. Thebe uses MyBinder as a back-end and provides an execution environment for static sites.
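In a Jupyter-Book project, these services are switched on in the `_config.yml` file. The sketch below renders a possible excerpt from Python. The key names follow the `launch_buttons` and `repository` settings of the Sphinx-based Jupyter-Book releases as I understand them, so they should be checked against the documentation of the installed version. The repository URL is a placeholder and `render_config` is a hypothetical helper.

```python
# Excerpt of a Jupyter-Book _config.yml enabling the three services.
# Key names are an assumption based on the Sphinx-based Jupyter-Book docs.
CONFIG_TEMPLATE = """\
repository:
  url: {repo_url}
  branch: {branch}

launch_buttons:
  binderhub_url: "https://mybinder.org"
  colab_url: "https://colab.research.google.com"
  thebe: true
"""


def render_config(repo_url, branch="main"):
    """Fill the template with the repository that hosts the notebooks."""
    return CONFIG_TEMPLATE.format(repo_url=repo_url, branch=branch)


print(render_config("https://github.com/some-user/some-book"))
```

The repository settings matter because both Binder and Colab need to know where to fetch the notebooks from.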

Advantages

Disadvantages

When to use

Thebe is a young project (223 GitHub stars at the moment of this writing), so there is no big collection of examples, nor is it a well-known tool. However, when integrated with Jupyter-Book, it suddenly gives the reader the ability to run the code examples, which is a great user experience.

At the moment, it is the ideal solution if already using Jupyter-Book. For a similar result with a more standalone tool, the embedded approach should be preferred.

As an additional resource for this article, a template repository to get started with Jupyter-Book was prepared. It works for static, executable, and also printable scenarios.

Printable Solutions

In some cases, like academic publishing, book writing, or learning material, the preferred format is printed paper. In this case, there are a variety of options, and the choice depends not on technological feasibility but rather on the target platform.

When writing a book, a publisher will likely require a specific format. When self-publishing, a PDF is usually enough.

If AsciiDoc, Markdown, or ReStructuredText is required, Jupyter has built-in options for exporting to these formats, after which the particular publisher's instructions should be followed. Since this may vary significantly in each case, this section will focus on the most used format in academia, LaTeX. LaTeX is a special format that can be easily converted to PDF, therefore the term PDF in this section will always refer to the LaTeX conversion to this format.

Regardless of the approach taken, it is important to note that LaTeX can be used for two purposes: articles and books. Typically articles are generated from a single notebook file (not exclusively though) and books are a collection of notebooks following a hierarchical structure.

For books, specific book templates should be used to give the proper format, including page numbers, headings, and such. That is why tools like Overleaf are recommended to structure the document.

Built-in Export: LaTeX

The native LaTeX export will convert the notebook to the LaTeX format. This is ideal for single file conversions and should be the preferred way to start articles based on Jupyter Notebooks.

Advantages

Disadvantages

When to use

The built-in LaTeX export is suitable when dealing with single notebooks that are not connected. It might be helpful as a starting point for an academic article, although in such media the code often has to be removed.

If the target is not an academic conference/journal or a publisher, other tools like a static site generator or one of the other plain-text exports might be more suitable, depending on the particular use case.

Jupyter-Book - For Printing

If already using Jupyter-Book, one possibility is to use the LaTeX target and create a .tex with all the notebooks converted to LaTeX. This approach will produce a consistent and book-like structured document. Some final formatting might still be needed in third-party tools but it is much less work than the default LaTeX export.
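With the Sphinx-based Jupyter-Book releases, the output target is selected through the `--builder` option of `jupyter-book build`. The sketch below only assembles the commands. The book directory `mybook/` is a placeholder and `jupyter_book_build_command` is a hypothetical helper.

```python
import shlex


def jupyter_book_build_command(book_dir, builder="html"):
    """Assemble the `jupyter-book build` command for a given output target."""
    command = ["jupyter-book", "build", book_dir]
    # HTML is the default builder, so the option is only needed for other targets.
    if builder != "html":
        command += ["--builder", builder]
    return command


# "latex" stops at the .tex sources, "pdflatex" also compiles them to PDF.
print(shlex.join(jupyter_book_build_command("mybook/", builder="latex")))
print(shlex.join(jupyter_book_build_command("mybook/", builder="pdflatex")))
```

Stopping at the `.tex` sources is convenient when the final formatting will be done in a third-party tool.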

Advantages

Disadvantages

When to use

At the moment, LaTeX and PDF are not the main focus of Jupyter-Book, and the design might appear basic. However, it is a good starting point for books. Using the tool for single notebooks might be excessive.

As an additional resource for this article, a template repository to get started with Jupyter-Book was prepared. It works for static, executable, and also printable scenarios.

As an example, the official Jupyter-Book project generates a PDF version of its docs in the form of a book, using the LaTeX export option as an intermediate step.

Honorable Mentions

Some tools could be useful in some contexts but are not as widespread as the ones presented so far.

Choose the best tool

To choose the best tool, look for the use case that best represents your needs and see the available options.

I need GPU Support

The only option is Colab.

I want to keep my readers on my site

Then use either the Embedded approach or Thebe.

I need Analytics

Use Jupyter-Book or Embedded approaches.

I need Cache for fast load time

Use NBviewer.

I want to create a portfolio using Jupyter Notebooks

Use a static site generator. This site was built using Pelican.

I have a very specific setup of dependencies

Use any of the MyBinder-based approaches: plain MyBinder, the Embedded solution, or Jupyter-Book with Thebe.

Conclusion

There are many different tools to publish Jupyter notebooks, and whole tutorials could be written about the features and possibilities of each. In this article, the most compelling ones were summarized to help notebook writers decide which tool is the most suitable for their use case.

If any of the information presented is outdated, or you have suggestions for new advantages or disadvantages of a particular solution please leave a comment below.

Below is a list of useful resources to continue learning and searching for inspiration.

Additional Resources

Awesome Lists:

Gallery of Jupyter-Books

Some Books built with Jupyter Notebooks: