parente.dev

Jupyter Tidbit: IPython's ! returns an SList

September 15, 2018

This post originates from a gist that supports comments, forks, and execution in binder.


Summary

IPython shell assignment (the ! operator) evaluates a command using the local shell (e.g., bash) and returns a string list (IPython.utils.text.SList). An SList is a list-like object containing "chunks" of stdout and stderr, properties for accessing those chunks in different forms, and convenience methods for operating on them.

Example

The SList.ipynb notebook below uses SList properties to access the output of a shell command as a list-like of strings, a newline-separated string, a space-separated string, and a list of pathlib.Path objects. The notebook then uses the SList.fields() and SList.grep() methods to extract columns from and search command output.

Why is this useful?

You can take advantage of the properties and methods of an SList to transform shell output into forms more amenable to further operations in Python.

For more information

See the IPython documentation about Shell Assignment for examples of executing shell commands with optional Python values as inputs. See the IPython documentation about String Lists for additional demonstrations of the utility of SLists.

SList.ipynb

Start with a simple ls command.

In [1]:
ls = !ls

Note that the return type is not a simple list.

In [2]:
type(ls)
Out[2]:
IPython.utils.text.SList

It is an SList, a list-like object that contains "chunks" of stdout and stderr, properties for accessing those chunks in different forms, and convenience methods for operating on them.

In [3]:
ls
Out[3]:
['conda-bld', 'README.md', 'requirements.txt', 'SList.ipynb']

An SList offers several equivalent ways to get its output as a list-like of strings.

In [4]:
ls == ls.get_list() == ls.list == ls.l
Out[4]:
True

Some properties also return the output as a newline-delimited string.

In [5]:
print(ls.nlstr)
conda-bld
README.md
requirements.txt
SList.ipynb
In [6]:
ls.get_nlstr() == ls.nlstr == ls.n
Out[6]:
True

Other properties return the output as a space-separated string.

In [7]:
print(ls.spstr)
conda-bld README.md requirements.txt SList.ipynb
In [8]:
ls.get_spstr() == ls.spstr == ls.s
Out[8]:
True
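
As a rough illustration of what the two string forms amount to, here is a plain-Python sketch that joins a list of names by hand. The `tricky` list and its file name are hypothetical, added only to show why the newline-delimited form is usually the safer one to split again later.

```python
lines = ['conda-bld', 'README.md', 'requirements.txt', 'SList.ipynb']

nlstr = "\n".join(lines)  # newline-delimited form, comparable to ls.n
spstr = " ".join(lines)   # space-separated form, comparable to ls.s

# Splitting recovers the original list only when no entry contains whitespace.
from_nl = nlstr.split("\n")
from_sp = spstr.split(" ")

# Hypothetical file name containing a space: the space-separated form
# can no longer be split back into the original entries.
tricky = ['my notes.txt', 'README.md']
tricky_sp = " ".join(tricky).split(" ")    # three pieces, not two
tricky_nl = "\n".join(tricky).split("\n")  # two pieces, as expected
```

The space-separated form is handy for passing names back to another shell command, as long as none of them contain spaces.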

Still other properties return the output as a list of pathlib.Path instances.

In [9]:
ls.paths
Out[9]:
[PosixPath('conda-bld'),
 PosixPath('README.md'),
 PosixPath('requirements.txt'),
 PosixPath('SList.ipynb')]
In [10]:
ls.get_paths() == ls.paths == ls.p
Out[10]:
True
In [11]:
import pathlib
isinstance(ls.paths[0], pathlib.Path)
Out[11]:
True

These are convenient for performing further path operations in Python.

In [12]:
[p.is_dir() for p in ls.paths]
Out[12]:
[True, False, False, False]
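
A few more path operations of the kind this enables, sketched with plain pathlib so it runs outside a notebook. The `paths` list here is built by hand from the file names shown above and stands in for `ls.paths`; only operations that do not touch the filesystem are used.

```python
import pathlib

names = ['conda-bld', 'README.md', 'requirements.txt', 'SList.ipynb']
paths = [pathlib.Path(name) for name in names]  # stand-in for ls.paths

# File extensions and base names, without any string slicing.
suffixes = [p.suffix for p in paths]
stems = [p.stem for p in paths]

# Select only the notebooks.
notebooks = [p for p in paths if p.suffix == '.ipynb']

# Build a sibling path with a different extension.
html_name = notebooks[0].with_suffix('.html').name
```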

SList objects expose a fields() method.

In [13]:
df = !df -h
In [14]:
df
Out[14]:
['Filesystem      Size  Used Avail Use% Mounted on',
 'overlay         981G  577G  404G  59% /',
 'tmpfs            26G     0   26G   0% /dev',
 'tmpfs            26G     0   26G   0% /sys/fs/cgroup',
 '/dev/sda1       981G  577G  404G  59% /etc/hosts',
 'shm              64M     0   64M   0% /dev/shm',
 'tmpfs            26G     0   26G   0% /sys/firmware']

fields splits each line of output into whitespace-delimited columns and returns the columns at the given indices, joined into one space-separated string per line.

In [15]:
df.fields(0,4)
Out[15]:
['Filesystem Use%',
 'overlay 59%',
 'tmpfs 0%',
 'tmpfs 0%',
 '/dev/sda1 59%',
 'shm 0%',
 'tmpfs 0%']
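
To make the column selection concrete, here is a minimal plain-Python sketch of the same idea. `select_fields` is a hypothetical helper written for this illustration, not IPython's implementation, and it assumes that out-of-range indices are silently skipped.

```python
def select_fields(lines, *indices):
    """Return the requested whitespace-delimited columns of each line,
    joined by single spaces."""
    result = []
    for line in lines:
        columns = line.split()
        picked = []
        for index in indices:
            try:
                picked.append(columns[index])
            except IndexError:
                # Assumption: a column missing from this line is skipped.
                pass
        if picked:
            result.append(" ".join(picked))
    return result


df_lines = [
    'Filesystem      Size  Used Avail Use% Mounted on',
    'overlay         981G  577G  404G  59% /',
    'shm              64M     0   64M   0% /dev/shm',
]

selected = select_fields(df_lines, 0, 4)
print(selected)
```

Note that splitting on whitespace means a header such as "Mounted on" counts as two columns.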

SList objects also expose a grep() method. grep evaluates a regular expression or callable against every element of the SList, or against a single whitespace-delimited column of each element.

In [16]:
df.grep('dev', field=5)
Out[16]:
['tmpfs            26G     0   26G   0% /dev',
 'shm              64M     0   64M   0% /dev/shm']
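
The matching logic can be sketched in plain Python as follows. `grep_lines` is a hypothetical stand-in written for this illustration rather than IPython's code, and it assumes that string patterns are searched case-insensitively and that a line with too few columns is treated as an empty string.

```python
import re


def grep_lines(lines, pattern, field=None):
    """Keep lines whose full text, or one whitespace-delimited column,
    satisfies a regular expression or a callable."""

    def target(line):
        if field is None:
            return line
        try:
            return line.split()[field]
        except IndexError:
            return ""  # assumption: a missing column never matches

    if isinstance(pattern, str):
        def matches(text):
            # Assumption: string patterns are case-insensitive.
            return re.search(pattern, text, re.IGNORECASE) is not None
    else:
        matches = pattern  # any callable that takes a string

    return [line for line in lines if matches(target(line))]


df_lines = [
    'Filesystem      Size  Used Avail Use% Mounted on',
    'tmpfs            26G     0   26G   0% /dev',
    '/dev/sda1       981G  577G  404G  59% /etc/hosts',
    'shm              64M     0   64M   0% /dev/shm',
]

whole_line = grep_lines(df_lines, 'dev')            # matches anywhere in the line
mount_only = grep_lines(df_lines, 'dev', field=5)   # matches the mount column only
unused = grep_lines(df_lines, lambda text: text == '0%', field=4)
```

Restricting the match to a column is what keeps the /dev/sda1 line out of the result above: its device name contains "dev", but its mount point does not.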

grep(prune=True) inverts the match: it returns only the elements that do not match the pattern, turning grep into a filter that removes matches.

In [17]:
hosts = !cat /etc/hosts
In [18]:
print(hosts.n)
# Kubernetes-managed hosts file.
127.0.0.1   localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
10.12.9.251 jupyter-3675d82eae802db2c011037033d614a5-2dlwedcif3

The return value of grep (and fields) is another SList supporting all of the features noted above.

In [19]:
print(hosts.grep('ip6', prune=True).n)
# Kubernetes-managed hosts file.
127.0.0.1   localhost
10.12.9.251 jupyter-3675d82eae802db2c011037033d614a5-2dlwedcif3
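
One way to picture the effect of prune is as a flag that flips which side of the match is kept. The sketch below uses a hypothetical `grep_lines` helper in plain Python, not IPython's implementation, and assumes case-insensitive matching.

```python
import re


def grep_lines(lines, pattern, prune=False):
    """Keep matching lines, or non-matching lines when prune is True."""
    kept = []
    for line in lines:
        hit = re.search(pattern, line, re.IGNORECASE) is not None
        # Keep hits normally; keep misses when pruning.
        if hit != prune:
            kept.append(line)
    return kept


hosts_lines = [
    '# Kubernetes-managed hosts file.',
    '127.0.0.1   localhost',
    '::1 localhost ip6-localhost ip6-loopback',
    'fe00::0 ip6-localnet',
]

matched = grep_lines(hosts_lines, 'ip6')
pruned = grep_lines(hosts_lines, 'ip6', prune=True)
```

Together, the matched and pruned results partition the original lines, so every line lands in exactly one of the two.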
