Deprecated in favor of My Podman Container Images.

Most of my Docker images have different variants (corresponding to the tags latest, next, etc.) for different use cases. Each tag may also have historical versions named with the pattern mmddhh (mm, dd and hh stand for the month, day and hour), which serve as fallbacks if a tag is broken. Please refer to the following tag table for more details.

| Tag | Base Image OS | Comment |
|---|---|---|
| latest | Ubuntu LTS (or newer if necessary and well tested) | The most recent stable version of the Docker image. The latest tag is what most users should use. It cares more about user friendliness than Docker image size, load speed and even security. |
| next | Ubuntu LTS (or newer if necessary and well tested) | The most recent testing version of the Docker image. New features/tools will be added into the next tag before entering other tags. |
| mmddhh | Historical versions corresponding to the latest tag. | Fallback tags (for latest) if the latest tag is broken. |
| next_mmddhh | Historical versions corresponding to the next tag. | Fallback tags (for next) if the next tag is broken. |
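As a rough illustration of the naming scheme in the tag table, the helper below assembles a full image reference from an image name, a tag family (latest or next) and an optional mmddhh stamp. The function name `image_ref` and the stamp `031507` are made up for this sketch. Check the image's tag list for the stamps that actually exist before pulling one.

```shell
# Build an image reference following the tag table above.
# Usage: image_ref IMAGE FAMILY [MMDDHH]
# NOTE: image_ref is a hypothetical helper, not something shipped with the images.
image_ref() {
    local image="$1" family="$2" stamp="${3:-}"
    # No stamp: use the plain tag (latest or next).
    if [ -z "$stamp" ]; then
        echo "${image}:${family}"
        return 0
    fi
    # A stamp must be exactly six digits (mmddhh).
    case "$stamp" in
        [0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "invalid stamp: $stamp" >&2; return 1 ;;
    esac
    # Fallbacks for latest are the bare stamp; others are FAMILY_STAMP.
    if [ "$family" = "latest" ]; then
        echo "${image}:${stamp}"
    else
        echo "${image}:${family}_${stamp}"
    fi
}
```

With it, falling back from a broken next tag would look like `docker pull "$(image_ref dclong/jupyterhub-ds next 031507)"`, where the stamp is a placeholder.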

| Docker Image | Comment |
|---|---|
| dclong/vscode-server | Cloud IDE code-server (based on VSCode) |
| dclong/jupyterhub-ds | For data science, machine learning and AI. |
| dclong/python-portable | Build portable Python using python-build-standalone |
| dclong/jupyterhub-sagemath | Math / Calculus |
| dclong/jupyterhub-kotlin, dclong/jupyterhub-ganymede | JVM languages |
| dclong/rustpython | RustPython |
| dclong/jupyterhub-pytorch | Deep Learning |
| dclong/gitpod | Editing other GitHub repos using GitPod |

Usage

Install Docker

Please refer to Install Docker for instructions on how to install and configure Docker.

Pull the Docker Image

Taking dclong/jupyterhub-ds as an example, you can pull it using the command below.

docker pull dclong/jupyterhub-ds

For people in mainland China, please refer to the post Speedup Docker Pulling and Pushing for ways to speed up pushing/pulling of Docker images. If you don’t want to bother with that, just use the command below.

docker pull registry.docker-cn.com/dclong/jupyterhub-ds

Start a Container using ldc

The recommended way to start containers for the Docker images dclong/* is to use the ldc command, which comes with icon.

Start a Container Manually

Below are explanations of some of the environment variables passed by the option -e to the Docker command. Keep the defaults if you don’t know which values are best. DOCKER_PASSWORD is probably the only one you want to, and should, change.

The root directory of JupyterLab/Jupyter notebooks is /workdir in the container. You can mount a directory on the host to it as you wish. Below are illustrations using the Docker image dclong/jupyterhub-ds.

The following command starts a container and mounts the current working directory and the parent directory of your home directory (typically /home) on the host machine to /workdir and /home_host in the container respectively.

docker run -d --init \
    --platform linux/amd64 \
    --hostname jupyterhub-ds \
    --log-opt max-size=50m \
    -p 8000:8000 \
    -p 5006:5006 \
    -e DOCKER_USER=`id -un` \
    -e DOCKER_USER_ID=`id -u` \
    -e DOCKER_PASSWORD=`id -un` \
    -e DOCKER_GROUP_ID=`id -g` \
    -e DOCKER_ADMIN_USER=`id -un` \
    -v `pwd`:/workdir \
    -v `dirname $HOME`:/home_host \
    dclong/jupyterhub-ds /scripts/sys/init.sh

The following command (only works on Linux) does the same as the above one except that it limits the use of CPU and memory.

docker run -d --init \
    --platform linux/amd64 \
    --name jupyterhub-ds \
    --log-opt max-size=50m \
    --memory=$(($(head -n 1 /proc/meminfo | awk '{print $2}') * 4 / 5))k \
    --cpus=$((`nproc` - 1)) \
    -p 8000:8000 \
    -p 5006:5006 \
    -e DOCKER_USER=`id -un` \
    -e DOCKER_USER_ID=`id -u` \
    -e DOCKER_PASSWORD=`id -un` \
    -e DOCKER_GROUP_ID=`id -g` \
    -e DOCKER_ADMIN_USER=`id -un` \
    -v `pwd`:/workdir \
    -v `dirname $HOME`:/home_host \
    dclong/jupyterhub-ds /scripts/sys/init.sh
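The two resource options above boil down to simple arithmetic: 4/5 of the total memory (in kB) and all CPUs but one. Here is a small sketch that factors them into functions. The names `mem_limit_kb` and `cpu_limit` are my own for illustration. Unlike the one-liner above, it looks up MemTotal by name instead of assuming it is the first line of /proc/meminfo, and it never returns fewer than 1 CPU (the plain `nproc - 1` expression evaluates to 0 on a single-core host, which is probably not the limit you intended).

```shell
# 80% of total memory in kB, read from a meminfo-style file.
# Defaults to /proc/meminfo, so this only works on Linux as is.
# NOTE: mem_limit_kb and cpu_limit are hypothetical helper names.
mem_limit_kb() {
    awk '/^MemTotal:/ { printf "%d\n", $2 * 4 / 5; exit }' "${1:-/proc/meminfo}"
}

# All CPUs but one, and never fewer than 1.
# An explicit CPU count can be passed as the first argument.
cpu_limit() {
    local n="${1:-$(nproc)}"
    if [ "$n" -gt 1 ]; then
        echo $((n - 1))
    else
        echo 1
    fi
}
```

The values can then be passed to docker run as `--memory="$(mem_limit_kb)k"` and `--cpus="$(cpu_limit)"`.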

Add a New User Inside a Docker Container

You can of course use the well-known commands useradd, adduser, etc. to achieve this. To make things easier, there are some shell scripts in the directory /scripts/sys/ that create users for you.

You can use the option -h to print the help doc for these scripts.

/scripts/sys/create_user_nogroup.sh -h
Create a new user with the group name "nogroup".
Syntax: create_user_nogroup user user_id [password]
Arguments:
user: user name
user_id: user id
password: Optional password of the user. If not provided, then the user name is used as the password.

Now suppose you want to create a new user dclong with user ID 2000 and group name nogroup. You can use the following command.

sudo /scripts/sys/create_user_nogroup.sh dclong 2000

Since we didn’t specify a password for the user, the default password (same as the user name) is used.

Use the JupyterHub Server

  1. Open your browser and visit your_host_ip:8000, where your_host_ip is the URL/IP address of your server.

  2. Log in to the JupyterHub server using your user name (by default your user name on the host machine) and password (by default your user name on the host machine).

  3. It is strongly suggested (for security reasons) that you change your password (using the command passwd) in the container.

  4. Enjoy JupyterLab notebook!

Get Information of Running Jupyter/Lab Servers

If you are using the Jupyter/Lab server instead of JupyterHub, you will be asked for a token at login. If you have started the Docker container in interactive mode (option -i instead of -d), the token for login is printed to the console. Otherwise, the tokens (and more information about the servers) can be found by running the following command outside the Docker container.

docker exec jupyterlab /scripts/list_jupyter.py

The above command tries to be smart in the sense that it first figures out the user that started the JupyterLab server and then queries the running Jupyter/Lab servers of that user. An equivalent but more specific command (if the Docker container was launched by the current user on the host) is as below.

docker exec -u $(id -un) jupyterlab /scripts/sys/list_jupyter.py

If you are inside the Docker container, then run the following command to get the tokens (and more information about the servers).

/scripts/list_jupyter.py

Or equivalently if the Jupyter/Lab server is launched by the current user,

/scripts/sys/list_jupyter.py

To sum up, most of the time you can rely on /scripts/list_jupyter.py to find the tokens of the running Jupyter/Lab servers, no matter whether you are root or the user that launched the Docker/JupyterLab server, and no matter whether you are inside the Docker container or not. Yet another way to get information about the running JupyterLab server is to check the log. Please refer to the section Debug Docker Containers for more information.
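If you only need the token itself (for example, to paste it into a login form), a small filter can pull it out of the server listing. This is a sketch that assumes the listing contains server URLs of the usual Jupyter form `http://host:port/?token=...`. I have not pinned down the exact output format of list_jupyter.py here, and `extract_tokens` is a made-up helper name.

```shell
# Print the token of every Jupyter URL found on stdin, one per line.
# ASSUMPTION: tokens appear as "token=<alphanumeric string>" in the input.
extract_tokens() {
    grep -oE 'token=[0-9A-Za-z]+' | cut -d= -f2
}
```

For example, `docker exec jupyterlab /scripts/list_jupyter.py | extract_tokens` should print one token per running server if the assumption about the output holds.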

Add a New User for JupyterHub

By default, any user in a Docker container of dclong/jupyterhub-* can visit the JupyterHub server. So if you want to grant access to a new user, just create an account for them in the Docker container. Please refer to Add a New User Inside a Docker Container for how to create a new user inside a Docker container.

Easy Install of Other Kernels

Install and configure PySpark for use with the Python kernel.

icon spark -ic && icon pyspark -ic

Install the evcxr Rust kernel.

icon evcxr -ic

Install the Almond Scala kernel.

icon almond -ic

Install the ITypeScript kernel.

icon its -ic

Many other software packages and tools can be easily installed with icon.

Debug Docker Containers

You can change the option docker run -d ... to docker run -it ... to show the logs of processes in the Docker container, which helps with debugging. If you have already started a Docker container using docker run -d ..., you can use the command docker logs followed by the container name to get the log of the container (which contains the logs of all processes in it).
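Container logs can get long, so it often helps to narrow them down first. The filter below is a minimal sketch that keeps only lines that look like errors. The helper name `filter_errors` and the keyword list are my own choices, so adjust them to whatever the failing process actually prints.

```shell
# Keep only log lines that look like errors (case-insensitive).
# Returns success even when nothing matches, so it is safe under `set -e`.
# NOTE: filter_errors is a hypothetical helper; the keywords are a guess.
filter_errors() {
    grep -iE 'error|exception|traceback' || true
}
```

A typical use is `docker logs --tail 200 jupyterhub-ds 2>&1 | filter_errors`, where jupyterhub-ds is the container name given by --name in the second docker run example above. The `2>&1` is there because docker logs replays the container's stderr on stderr.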

Use Spark in JupyterLab Notebooks

It is suggested that you use the Almond Scala kernel. I will gradually drop support of the BeakerX Scala kernel in my Docker images.

PySpark - pyspark and findspark

To use PySpark in a container of the Docker image dclong/jupyterhub-ds you need to install Spark and the Python package pyspark first, which can be achieved using the following command.

icon spark -ic --loc /opt/
icon pyspark -ic

Follow the steps below to use PySpark after it is installed.

  1. Open a JupyterLab notebook with the Python kernel from the launcher.

  2. Find and initialize PySpark.

     import findspark
     # A symbolic link of the Spark Home is made to /opt/spark for convenience
     findspark.init("/opt/spark")
    
     from pyspark.sql import SparkSession
     spark = SparkSession.builder.appName('PySpark Example').enableHiveSupport().getOrCreate()

  3. Use Spark as usual.

     df1 = spark.table("some_hive_table")
     df2 = spark.sql("select * from some_table")
     ...

Remote Connection to Desktop in the Container

If you are running a Docker container with a desktop environment (dclong/lubuntu* or dclong/xubuntu*), you can connect to the desktop environment in the Docker container using NoMachine.

  1. Download the NoMachine client from https://www.nomachine.com/download.

  2. Install the NoMachine client on your computer.

  3. Create a new connection from your computer to the desktop environment in the Docker container using the NX protocol and port 4000. You will be asked for a user name and password. By default, the user name used to start the Docker container on the host machine is used as both the user name and password in the Docker container.

List of Images and Detailed Information

Build my Docker Images

My Docker images are built automatically using the GitHub Actions workflow build_images.yml.

Tips for Maintaining Docker Images (for My own Reference)

  1. Do NOT update the latest tag until you have fully tested the corresponding next tag.

  2. Do NOT add new features or tools unless you really need them.

  3. It is generally a good idea to restrict non-stable packages to a specific (working) version or to a range of versions that is unlikely to break.

  4. If you REALLY have to update a Bash script in a Docker image, do not update it in the GitHub repository directly and then build the Docker image to test whether it works. Instead, make a copy of the Bash script outside the Docker container, update it, and mount it into the container to test whether it works. If the updated Bash script works as you expected, then go ahead and update it in the GitHub repository.
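The mount-and-test workflow in the last tip can be sketched as follows. The helper builds the value for the -v option that overlays a script inside the container with a local copy. `overlay_mount` is a made-up name, and using /scripts/sys/init.sh as the target below is just an example taken from the docker run commands earlier in this post.

```shell
# Print a bind-mount spec that overlays CONTAINER_PATH with a local script.
# Usage: overlay_mount HOST_SCRIPT CONTAINER_PATH
# NOTE: overlay_mount is a hypothetical helper for illustration.
overlay_mount() {
    local host_script="$1" container_path="$2"
    # Make the host path absolute; a bare relative name could be
    # interpreted by Docker as a named volume instead of a file.
    case "$host_script" in
        /*) ;;
        *) host_script="$PWD/$host_script" ;;
    esac
    # Mount read-only since the container only needs to run the script.
    echo "${host_script}:${container_path}:ro"
}
```

For example, `docker run -it --init -v "$(overlay_mount init.sh /scripts/sys/init.sh)" dclong/jupyterhub-ds:next /scripts/sys/init.sh` runs the next image with the edited local copy of init.sh in place of the built-in one. The file permissions come from the host file, so make sure the local copy is executable.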

On Failure of GitHub Actions Workflow for Building Docker Images

  1. If the Docker image building workflow fails due to network issues, rerunning the failed pipelines in GitHub Actions might not work (the network issue is likely due to problematic network nodes, and retrying failed pipelines sends the jobs to the same nodes). In such situations, it is better to trigger a new run of the workflow.

Known Issues

  1. NeoVim fails to work if a Docker image (with NeoVim installed) is run on Mac with the M1 chip even if you pass the option --platform linux/amd64 to docker run. A possible fix is to manually uninstall NeoVim using the following command

    sudo apt purge neovim

    and then install Vim instead.

    sudo apt install vim

  2. There is an issue with the dclong/xubuntu* Docker images due to Xfce on Ubuntu 18.04. It is suggested that you use the corresponding dclong/lubuntu* Docker images instead (which are based on LXQt) if a desktop environment is needed.