parente.dev

Four Ways to Extend Jupyter Notebook

July 19, 2015

Jupyter Notebook (née IPython Notebook) is a web-based environment for interactive computing in notebook documents. In addition to supporting the execution of user-defined code, Jupyter Notebook has a variety of plug-in points that you can use to extend the capabilities of the authoring environment itself. In this post, I'll touch on four of these extension mechanisms and finish off with a word on packaging and distribution:

  1. Kernels
  2. IPython Kernel Extensions
  3. Notebook Extensions
  4. Notebook Server Extensions

One word of caution before you proceed: the Jupyter extension architecture is still evolving. Not all of the APIs and techniques I mention in this post are well-documented or considered stable yet.

1. Kernels

Kernels are probably the most well-known type of Jupyter extension. Kernels provide the Notebook, and other Jupyter frontends, the ability to execute and introspect user code in a variety of languages.

Installing a kernel amounts to satisfying its dependencies and placing a kernel spec in your Jupyter data directory (e.g., ~/.local/share/jupyter/kernels/<kernel name>). For example, the IRkernel README gives instructions for its installation, including its kernel spec, which also appears below for reference.

{
  "argv": ["R", "-e", "IRkernel::main()", "--args", "{connection_file}"],
  "display_name": "R"
}
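
To make the "placing a kernel spec" step concrete, here is a minimal sketch that writes the spec above to a kernel.json file under a kernels directory. The helper name write_kernel_spec and the kernel directory name "ir" are my own illustrative choices, and a temporary directory stands in for ~/.local/share/jupyter/kernels so the snippet is safe to run anywhere.

```python
import json
import tempfile
from pathlib import Path

def write_kernel_spec(kernels_dir, name, spec):
    """Write spec to <kernels_dir>/<name>/kernel.json and return the file path."""
    spec_dir = Path(kernels_dir) / name
    spec_dir.mkdir(parents=True, exist_ok=True)
    spec_path = spec_dir / "kernel.json"
    spec_path.write_text(json.dumps(spec, indent=2))
    return spec_path

# The IRkernel spec from the text; {connection_file} is left as a literal
# placeholder for the frontend to fill in when it launches the kernel.
ir_spec = {
    "argv": ["R", "-e", "IRkernel::main()", "--args", "{connection_file}"],
    "display_name": "R",
}

# Assumption: a temp dir stands in for the real Jupyter kernels directory.
demo_kernels_dir = tempfile.mkdtemp()
spec_path = write_kernel_spec(demo_kernels_dir, "ir", ir_spec)
loaded = json.loads(spec_path.read_text())
print(spec_path)
```

Pointing the helper at your real kernels directory instead of the temporary one should make the kernel show up in the Notebook's kernel list, provided R and IRkernel themselves are installed.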

Creating new kernels can be as simple as implementing a Python ipykernel.kernelbase.Kernel subclass or as complex as implementing a kernel program in your language of choice that talks the Jupyter protocol over ZeroMQ. The jupyter_client documentation, experienced kernel authors, and the source of existing community contributed kernels all serve as great references in this task.

2. IPython Kernel Extensions

IPython Kernel extensions are Python modules that can modify the interactive shell environment within an IPython kernel. Such extensions can register magics, define variables, and generally modify the user namespace to provide new features for use within code cells. Kernel extensions are not exclusive to the Jupyter Notebook frontend, but many community contributed extensions do target its browser-powered display system in particular.

The IPython kernel ships with four magics that allow for the management of extensions from within Jupyter Notebook (or other Jupyter interfaces).

  1. %install_ext <URL|path> installs a Python module as an extension (deprecated)
  2. %load_ext <name> imports the extension module and invokes its load function
  3. %reload_ext <name> invokes the extension module unload function, re-imports the extension module itself, and then invokes its load function
  4. %unload_ext <name> invokes the extension module unload function and drops the module reference

The load, reload, and unload magics act solely on the kernel associated with the notebook in which they appear. Every time that kernel restarts, you must run the load magic again to re-enable the extension.

The IPython configuration system exposes the InteractiveShellApp.extensions list trait to automate the loading of kernel extensions. For example, you can add the following lines to your IPython configuration file (e.g., ~/.ipython/profile_default/ipython_config.py) to automatically load module my_package.my_kernel_extension any time an IPython kernel starts or restarts.

c.InteractiveShellApp.extensions = [
    'my_package.my_kernel_extension'
]

Writing a Python module that can serve as a kernel extension requires implementing a load_ipython_extension function and optionally implementing an unload_ipython_extension function. Both functions receive an InteractiveShell instance as their one and only parameter. The loading function typically uses methods on the instance to add features while the unload function cleans them up. For instance, a minimal extension that defines %%skip, a cell magic that turns the current cell into a no-op, appears below.

def skip(line, cell=None):
    '''Skips execution of the current line/cell.'''
    pass

def load_ipython_extension(shell):
    '''Registers the skip magic when the extension loads.'''
    shell.register_magic_function(skip, 'line_cell')

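def unload_ipython_extension(shell):
    '''Unregisters the skip magic when the extension unloads.'''
    # 'line_cell' registers the magic as both a line and a cell magic,
    # so remove it from both tables
    del shell.magics_manager.magics['line']['skip']
    del shell.magics_manager.magics['cell']['skip']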
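
If you want to check the load/unload pairing without starting a kernel, a stub can play the role of the shell. The sketch below is only an approximation: StubShell and StubMagicsManager are hypothetical stand-ins that mimic the two pieces of the InteractiveShell interface this extension touches, including the fact that the 'line_cell' kind registers a function under both the line and cell tables.

```python
class StubMagicsManager:
    """Stand-in for the shell's magics manager: one table per magic kind."""
    def __init__(self):
        self.magics = {"line": {}, "cell": {}}

class StubShell:
    """Hypothetical stand-in for InteractiveShell, just enough for this demo."""
    def __init__(self):
        self.magics_manager = StubMagicsManager()

    def register_magic_function(self, func, magic_kind="line", magic_name=None):
        name = magic_name or func.__name__
        # 'line_cell' means: register under both tables
        kinds = ("line", "cell") if magic_kind == "line_cell" else (magic_kind,)
        for kind in kinds:
            self.magics_manager.magics[kind][name] = func

def skip(line, cell=None):
    """Skips execution of the current line/cell."""
    pass

def load_ipython_extension(shell):
    shell.register_magic_function(skip, "line_cell")

def unload_ipython_extension(shell):
    # Clean up both tables so no stale magic is left behind
    for kind in ("line", "cell"):
        shell.magics_manager.magics[kind].pop("skip", None)

shell = StubShell()
load_ipython_extension(shell)
registered = sorted(k for k, t in shell.magics_manager.magics.items() if "skip" in t)
unload_ipython_extension(shell)
remaining = [k for k, t in shell.magics_manager.magics.items() if "skip" in t]
print(registered, remaining)
```

Using pop with a default rather than del keeps the unload function safe to call even if the magic was never registered.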

3. Notebook Extensions

Jupyter Notebook extensions (nbextensions) are JavaScript modules that can load on most major views constituting the Notebook frontend. Once loaded, they have access to the complete page DOM and frontend Jupyter JavaScript API with which to modify the user experience. As their name suggests, these extensions are exclusive to the Notebook frontend for Jupyter and typically add features to the notebook authoring user interface.

Today, the recommended way of installing, loading, and unloading extensions requires running a few snippets of code either within a notebook associated with an IPython kernel or in an external Python script. To install, for example, minrk's gist nbextension, you can execute the following code in a notebook.

import notebook.nbextensions
notebook.nbextensions.install_nbextension('https://rawgithub.com/minrk/ipython_extensions/master/nbextensions/gist.js', user=True)

The code above downloads the JavaScript file and copies it into your Jupyter data directory (e.g., ~/.local/share/jupyter/nbextensions). Once installed, you can load the gist extension by executing the following JavaScript in a code cell.

%%javascript
Jupyter.utils.load_extensions('gist')

The cell above emits a <script> element into the output area of the notebook cell. That script loads the gist.js JavaScript module from the Notebook server backend. On load, the gist extension adds a button to the toolbar which you can click to post the current notebook document as a gist.

After you save the notebook, the <script> element will persist in it. This script block will execute whenever you (but not necessarily others) open this particular notebook document. To stop this behavior, you can delete the cell and refresh the page.

If you want to load the extension automatically whenever you open any notebook document, you must add the extension to the notebook configuration section of your Jupyter profile. You can do so with the following code.

from notebook.services.config import ConfigManager
cm = ConfigManager()
cm.update('notebook', {"load_extensions": {"gist": True}})

You can stop the extension from loading automatically across notebooks using similar code. Pay attention to the use of None instead of False as the value that disables the extension.

from notebook.services.config import ConfigManager
cm = ConfigManager()
# update with None, not False, to disable auto loading
cm.update('notebook', {"load_extensions": {"gist": None}})
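
Why None and not False? As far as I understand it, the update is a recursive merge in which a None value deletes its key, whereas False is stored like any other value and leaves the extension's entry in the saved configuration. The sketch below illustrates that merge rule with plain dictionaries. It is my own approximation of the behavior, not the notebook package's code, and the function name recursive_update is used here only for illustration.

```python
import copy

def recursive_update(target, new):
    """Merge new into target in place; a None value deletes its key."""
    for key, value in new.items():
        if isinstance(value, dict):
            child = target.setdefault(key, {})
            recursive_update(child, value)
            if not child:
                del target[key]  # prune sections left empty by deletions
        elif value is None:
            target.pop(key, None)
        else:
            target[key] = value
    return target

enabled = {"load_extensions": {"gist": True}}

# False overwrites the value but the "gist" entry stays in the config
with_false = recursive_update(copy.deepcopy(enabled), {"load_extensions": {"gist": False}})

# None removes the "gist" entry (and the now-empty section) entirely
with_none = recursive_update(copy.deepcopy(enabled), {"load_extensions": {"gist": None}})

print(with_false, with_none)
```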

Take note of the first parameter value in the ConfigManager.update invocation. It associates the extension with one of the primary Jupyter Notebook views, in this case the notebook editor view. While most extensions, like gist, operate on the notebook view, it is possible to write and load extensions against other views such as the text editor (edit) and dashboard (tree), or all views (common).

To write a new extension, you must implement your logic in a JavaScript file conforming to the AMD specification so that Jupyter can load it using RequireJS. You should define and export a load_ipython_extension function in your module so that Jupyter can invoke it after initializing its own components. Within that function, you are free to manipulate the DOM of the page, invoke Jupyter JavaScript APIs, listen for Jupyter events, load other modules, and so on.

As an example, here is an extension that registers a command-mode hotkey to show a dialog of notebook cell counts.

define(["base/js/namespace"], function (Jupyter) {
  var exports = {};

  // Show counts of cell types
  var show_stats = function () {
    // Get counts of each cell type
    var cells = Jupyter.notebook.get_cells();
    var hist = {};
    for (var i = 0; i < cells.length; i++) {
      var ct = cells[i].cell_type;
      if (hist[ct] === undefined) {
        hist[ct] = 1;
      } else {
        hist[ct] += 1;
      }
    }

    // Build paragraphs of cell type and count
    var body = $("<div>");
    for (var ct in hist) {
      $("<p>")
        .text(ct + ": " + hist[ct])
        .appendTo(body);
    }

    // Show a modal dialog with the stats
    Jupyter.dialog.modal({
      title: "Notebook Stats",
      body: body,
      buttons: {
        OK: {},
      },
    });
  };

  // Wait for notification that the app is ready
  exports.load_ipython_extension = function () {
    // Then register command mode hotkey "s" to show the dialog
    Jupyter.keyboard_manager.command_shortcuts.add_shortcut("s", show_stats);
  };

  return exports;
});

4. Notebook Server Extensions

Jupyter Notebook server extensions are Python modules that load when the Notebook web server application starts. The only way to load server extensions at present is by way of the Jupyter configuration system. You must specify the extensions to load prior to starting the Jupyter Notebook server itself. Any changes to the list of extensions or the extensions themselves require a restart of the Notebook process to take effect.

For example, you can add the following lines to the Notebook configuration file in your Jupyter profile (e.g., ~/.jupyter/jupyter_notebook_config.py) to automatically load server extension my_package.my_server_extension when the Notebook app launches.

c.NotebookApp.server_extensions = [
    'my_package.my_server_extension'
]

Creating a Python module that acts as a server extension requires implementing a load_jupyter_server_extension function. The function receives an instance of notebook.notebookapp.NotebookApp as its sole parameter. The function can use functions and attributes of this instance to customize and extend the server behavior. Most notably, the web_app attribute of the NotebookApp instance refers to an instance of a tornado.web.Application subclass. You can register new tornado.web.RequestHandlers via that instance to extend the backend API of Jupyter Notebook.

To demonstrate the concept, an extension that adds a (dumb) "hello world" handler to the Notebook server appears below. (More compelling examples likely involve both Notebook server and frontend extensions working in conjunction.)

from notebook.utils import url_path_join
from notebook.base.handlers import IPythonHandler

class HelloWorldHandler(IPythonHandler):
    def get(self):
        self.finish('Hello, world!')

def load_jupyter_server_extension(nb_app):
    '''
    Register a hello world handler.

    Based on https://github.com/Carreau/jupyter-book/blob/master/extensions/server_ext.py
    '''
    web_app = nb_app.web_app
    host_pattern = '.*$'
    route_pattern = url_path_join(web_app.settings['base_url'], '/hello')
    web_app.add_handlers(host_pattern, [(route_pattern, HelloWorldHandler)])
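
Joining the route onto base_url matters when the Notebook server runs under a URL prefix, since a hard-coded '/hello' would then miss the prefix. The stand-in below approximates the joining behavior with plain string handling so you can see the resulting routes. join_url_path is a hypothetical helper written for this illustration, not the notebook.utils implementation, and the '/user/alice/' prefix is an invented example.

```python
def join_url_path(*pieces):
    """Join URL path pieces with single slashes, preserving outer slashes."""
    leading = pieces[0].startswith("/")
    trailing = pieces[-1].endswith("/")
    parts = [p.strip("/") for p in pieces]
    joined = "/".join(p for p in parts if p)
    if leading:
        joined = "/" + joined
    if trailing and joined != "/":
        joined += "/"
    return joined

# Default deployment: base_url is just "/"
default_route = join_url_path("/", "/hello")

# Hypothetical prefixed deployment
prefixed_route = join_url_path("/user/alice/", "/hello")

print(default_route, prefixed_route)
```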

A Final Word: Packaging and Distribution

Packaging and distributing extensions is a somewhat ad-hoc process at the moment. Planning and discussion are in progress to improve the state of affairs. That said, it is possible to share your work with a bit of effort today. For example, the sample setup.py below shows how a custom setuptools command class can install IPython kernel extensions, Jupyter Notebook JavaScript extensions, and Jupyter server extensions.

import os
from setuptools import setup
from setuptools.command.install import install

from notebook.nbextensions import install_nbextension
from notebook.services.config import ConfigManager
from jupyter_core.paths import jupyter_config_dir

EXT_DIR = os.path.join(os.path.dirname(__file__), 'myext')

class InstallCommand(install):
    def run(self):
        # Install Python package
        install.run(self)

        # Install JavaScript extensions to the user's Jupyter data directory
        # (e.g., ~/.local/share/jupyter/nbextensions)
        install_nbextension(EXT_DIR, overwrite=True, user=True)

        # Activate the JS extensions on the notebook, tree, and edit screens
        js_cm = ConfigManager()
        js_cm.update('notebook', {"load_extensions": {'myext_js/notebook': True}})
        js_cm.update('tree', {"load_extensions": {'myext_js/dashboard': True}})
        js_cm.update('edit', {"load_extensions": {'myext_js/editor': True}})

        # Activate the Python server extension
        server_cm = ConfigManager(config_dir=jupyter_config_dir())
        cfg = server_cm.get('jupyter_notebook_config')
        server_extensions = (cfg.setdefault('NotebookApp', {})
            .setdefault('server_extensions', [])
        )
        if 'myext.my_handler' not in server_extensions:
            cfg['NotebookApp']['server_extensions'] += ['myext.my_handler']
            server_cm.update('jupyter_notebook_config', cfg)

setup(
    name='myext',
    version='0.1',
    packages=['myext'],
    cmdclass={
        'install': InstallCommand
    }
)
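
The server extension step in a setup.py like this one boils down to an idempotent update of a nested dictionary: create the NotebookApp section and its server_extensions list if they are missing, then append the extension name only if it is not already there. Here is that logic in isolation, operating on a plain dict rather than a real ConfigManager. The helper name enable_server_extension is my own.

```python
def enable_server_extension(cfg, name):
    """Add name to cfg['NotebookApp']['server_extensions'] if absent.

    Returns True when the config changed and needs to be written back.
    """
    extensions = (cfg.setdefault("NotebookApp", {})
                     .setdefault("server_extensions", []))
    if name in extensions:
        return False
    extensions.append(name)
    return True

cfg = {}
first = enable_server_extension(cfg, "myext.my_handler")
second = enable_server_extension(cfg, "myext.my_handler")  # no duplicate added
print(cfg, first, second)
```

Returning a flag lets the caller skip rewriting the configuration file when a reinstall finds the extension already enabled.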

If the code seems lengthy, consider jupyter-pip which wraps some of the techniques above (and others) in a nice, reusable package. No matter which approach you choose, a user can open a Python notebook, run pip install against the root of your extension project, and wind up with all of your extension components put in their proper place. But not all is roses: Building packages for PyPI, supporting pip uninstall, and running the install outside of an active Jupyter Notebook context all require extra work.
