Recommended Docker Images and Tags
Most of my Docker images have different variants (corresponding to tags latest, next, etc.) for different use cases.
Each tag might also have historical versions with the pattern mmddhh (mm, dd and hh stand for the month, day and hour), which serve as fallbacks if a tag is broken.
Please refer to the following tag table for more details.
Tag | Base Image OS | Comment |
---|---|---|
latest | Ubuntu LTS (or newer if necessary and well tested) | The most recent stable version of the Docker image. The latest tag is what most users should use. It prioritizes user friendliness over Docker image size, load speed and even security. |
next | Ubuntu LTS (or newer if necessary and well tested) | The most recent testing version of the Docker image. New features/tools are added to the next tag before entering other tags. |
23.04 | Ubuntu 23.04 | For either of the following situations: 1. a specific Ubuntu/kernel version is required; 2. trying out Ubuntu versions newer than LTS. |
mmddhh | Historical versions corresponding to the latest tag. | Fallback tags (for latest) if the latest tag is broken. |
next_mmddhh | Historical versions corresponding to the next tag. | Fallback tags (for next) if the next tag is broken. |
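The mmddhh naming pattern can be illustrated with a small shell sketch. The function name fallback_tag and its argument order are hypothetical (they are not part of the images or of any tool mentioned here), and the sketch only formats a tag name. Whether a tag for a given month, day and hour actually exists has to be checked on Docker Hub.

```shell
# Hypothetical helper: format month, day and hour as an mmddhh tag.
# Pass plain decimal numbers without leading zeros (e.g. 7, not 07).
# $1: month, $2: day, $3: hour, $4: optional variant prefix (e.g. next)
fallback_tag() {
    stamp=$(printf '%02d%02d%02d' "$1" "$2" "$3")
    if [ -n "${4:-}" ]; then
        printf '%s_%s\n' "$4" "$stamp"
    else
        printf '%s\n' "$stamp"
    fi
}

fallback_tag 7 2 3         # prints 070203
fallback_tag 7 2 3 next    # prints next_070203

# Compose a full image reference for use with docker pull.
echo "dclong/jupyterhub-ds:$(fallback_tag 7 2 3)"
```

The optional fourth argument covers the next_mmddhh row of the table, so one helper serves both the latest and the next fallbacks.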
Docker Image | Comment |
---|---|
dclong/vscode-server | Cloud IDE code-server (based on VSCode) |
dclong/jupyterhub-ds | Traditional ML |
dclong/jupyterhub-pytorch | Deep Learning |
dclong/python-portable | Build portable Python using python-build-standalone |
dclong/conda-build | Build portable Anaconda Python environment |
dclong/jupyterhub-sagemath | Math / Calculus |
dclong/gitpod:blog | For editing and publishing legendu.net/blog using GitPod |
dclong/jupyterhub-ds:blog_070203 | For editing and publishing legendu.net/blog. |
dclong/gitpod | Editing other GitHub repos using GitPod |
dclong/jupyterhub-kotlin, dclong/jupyterhub-ganymede | JVM languages |
dclong/rustpython | RustPython |
Usage
Install Docker
Please refer to Install Docker for instructions on how to install and configure Docker.
Pull the Docker Image
Taking dclong/jupyterhub-ds
as an example,
you can pull it using the command below.
docker pull dclong/jupyterhub-ds
For people in mainland China, please refer to the post Speedup Docker Pulling and Pushing for ways to speed up pushing/pulling of Docker images. If you would rather not bother, just use the command below.
docker pull registry.docker-cn.com/dclong/jupyterhub-ds
Start a Container using ldc
The recommended way to start containers for Docker images dclong/* is to use the ldc command, which comes with icon.
Start a Container Manually
Below is an explanation of some environment variables passed by the option -e to the docker command.
Keep the defaults if you are not sure which values to use.
DOCKER_PASSWORD is probably the only one you want to (and should) change.
- DOCKER_USER: The user to be created (dynamically) in the Docker container. The shell command id -un gets the name of the current user (on the host), so the option -e DOCKER_USER=$(id -un) instructs the script /scripts/sys/init.sh to create a user in the Docker container whose name is the same as the current user on the host. WARNING: the shell script /scripts/sys/init.sh cannot create a user named root as it already exists in the Docker container. If you start a Docker container as root, make sure to pass a different user name to the environment variable DOCKER_USER, e.g., -e DOCKER_USER=dclong. For more discussion, please refer to this issue.
- DOCKER_USER_ID: The ID of the user to be created in the Docker container. The shell command id -u gets the user ID of the current user (on the host), so the option -e DOCKER_USER_ID=$(id -u) instructs the script /scripts/sys/init.sh to create a user in the Docker container whose user ID is the same as the user ID of the current user on the host. This means that the user in the Docker container is essentially the current user on the host, which helps resolve file permission issues between the Docker container and the host. This option is similar to the option --user of the command docker run, and you generally want to keep it unchanged.
- DOCKER_PASSWORD: The password of the user to be created in the Docker container. The shell command id -un gets the name of the current user (on the host), so the option -e DOCKER_PASSWORD=$(id -un) instructs the script /scripts/sys/init.sh to create a user in the Docker container whose password is the name of the current user on the host. WARNING: You should change the default value for security reasons. Of course, users can always change it later using the command passwd inside the Docker container.
- DOCKER_GROUP_ID: The group ID of the user to be created in the Docker container. The shell command id -g gets the group ID of the current user (on the host), so the option -e DOCKER_GROUP_ID=$(id -g) instructs the script /scripts/sys/init.sh to create a user in the Docker container whose group ID is the same as the group ID of the current user on the host. You generally want to keep this option unchanged.
- DOCKER_ADMIN_USER: This environment variable applies to Docker images dclong/jupyterhub* only. It specifies the admin user of the JupyterHub server. It should generally be the same as DOCKER_USER.
- USER_MEM_LIMIT: This environment variable applies to Docker images dclong/jupyterhub* only. It limits the memory that each user can use. Note: this option is not in effect currently.
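The rules above (derive the IDs from the host, never use root as DOCKER_USER, do not rely on the default password) can be sketched as a small shell helper. The function name docker_env_opts is hypothetical and not part of the images. It only prints the -e options, and the choice to require an explicit password is an assumption of this sketch rather than something the images enforce.

```shell
# Hypothetical helper: print the -e options for docker run, one per line.
# $1: user name to create in the container (defaults to the host user)
# $2: password for that user (required here so the default is never used silently)
docker_env_opts() {
    user=${1:-$(id -un)}
    password=${2:-}
    if [ "$user" = "root" ]; then
        echo "DOCKER_USER cannot be root; pass another name, e.g. dclong" >&2
        return 1
    fi
    if [ -z "$password" ]; then
        echo "please pass a password as the second argument" >&2
        return 1
    fi
    # DOCKER_ADMIN_USER only matters for the dclong/jupyterhub* images.
    printf '%s\n' \
        "-e DOCKER_USER=$user" \
        "-e DOCKER_USER_ID=$(id -u)" \
        "-e DOCKER_PASSWORD=$password" \
        "-e DOCKER_GROUP_ID=$(id -g)" \
        "-e DOCKER_ADMIN_USER=$user"
}

docker_env_opts dclong 'change-me'
```

Printing the options instead of running docker keeps the sketch safe to try on any machine; the output can be reviewed and then pasted into a docker run command.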
The root directory of JupyterLab/Jupyter notebooks is /workdir in the container.
You can mount a directory on the host to it as you wish.
Below is an illustration using the Docker image dclong/jupyterhub-ds.
The following command starts a container
and mounts the current working directory and /home on the host machine
to /workdir and /home_host in the container, respectively.
docker run -d --init \
--platform linux/amd64 \
--hostname jupyterhub-ds \
--log-opt max-size=50m \
-p 8000:8000 \
-p 5006:5006 \
-e DOCKER_USER=`id -un` \
-e DOCKER_USER_ID=`id -u` \
-e DOCKER_PASSWORD=`id -un` \
-e DOCKER_GROUP_ID=`id -g` \
-e DOCKER_ADMIN_USER=`id -un` \
-v `pwd`:/workdir \
-v `dirname $HOME`:/home_host \
dclong/jupyterhub-ds /scripts/sys/init.sh
The following command (only works on Linux) does the same as the above one except that it limits the use of CPU and memory.
docker run -d --init \
--platform linux/amd64 \
--name jupyterhub-ds \
--log-opt max-size=50m \
--memory=$(($(head -n 1 /proc/meminfo | awk '{print $2}') * 4 / 5))k \
--cpus=$((`nproc` - 1)) \
-p 8000:8000 \
-p 5006:5006 \
-e DOCKER_USER=`id -un` \
-e DOCKER_USER_ID=`id -u` \
-e DOCKER_PASSWORD=`id -un` \
-e DOCKER_GROUP_ID=`id -g` \
-e DOCKER_ADMIN_USER=`id -un` \
-v `pwd`:/workdir \
-v `dirname $HOME`:/home_host \
dclong/jupyterhub-ds /scripts/sys/init.sh
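The arithmetic behind the --memory and --cpus options above can be spelled out in a short sketch. The function name resource_limits is hypothetical, and the floor of one CPU on single-core machines is an assumption of this sketch (the command above would compute 0 in that case).

```shell
# Hypothetical helper mirroring the arithmetic in the command above.
# $1: total memory in kB (the number on the first line of /proc/meminfo)
# $2: number of CPUs (the output of nproc)
resource_limits() {
    mem_kb=$(( $1 * 4 / 5 ))   # 4/5 of total memory, integer division
    cpus=$(( $2 - 1 ))         # all CPUs but one
    if [ "$cpus" -lt 1 ]; then
        cpus=1                 # assumption of this sketch: never go below one CPU
    fi
    printf '%s\n' "--memory=${mem_kb}k --cpus=${cpus}"
}

resource_limits 16000000 8   # prints --memory=12800000k --cpus=7
```

On a Linux host the two arguments could be supplied with head -n 1 /proc/meminfo | awk '{print $2}' and nproc, as in the docker run command above.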
Add a New User Inside a Docker Container
You can of course use the well-known commands useradd, adduser, etc. to achieve it.
To make things easier for you,
there are some shell scripts in the directory /scripts/sys/
to create users for you.
- /scripts/sys/create_user.sh: Create a new user. It's the base script for creating users.
- /scripts/sys/create_user_group.sh: Create a new user with the given (existing) group.
- /scripts/sys/create_user_nogroup.sh: Create a new user with group name nogroup.
- /scripts/sys/create_user_docker.sh: Create a new user with group name docker.
You can use the option -h
to print help doc for these commands.
/scripts/sys/create_user_nogroup.sh -h
Create a new user with the group name "nogroup".
Syntax: create_user_nogroup user user_id [password]
Arguments:
user: user name
user_id: user id
password: Optional password of the user. If not provided, then the user name is used as the password.
Now suppose you want to create a new user dclong with user ID 2000 and group name nogroup.
You can use the following command.
sudo /scripts/sys/create_user_nogroup.sh dclong 2000
Since we didn't specify a password for the user, the default password (same as the user name) is used.
Use the JupyterHub Server
- Open your browser and visit your_host_ip:8000, where your_host_ip is the URL/IP address of your server.
- Log in to the JupyterHub server using your user name (by default your user name on the host machine) and password (by default your user name on the host machine).
- It is strongly suggested (for security reasons) that you change your password (using the command passwd) in the container.
- Enjoy JupyterLab notebooks!
Get Information of Running Jupyter/Lab Servers
If you are using the Jupyter/Lab server instead of JupyterHub,
you will be asked for a token at login.
If you have started the Docker container in interactive mode (option -i
instead of -d
),
the token for login is printed to the console.
Otherwise,
the tokens (and more information about the servers) can be found
by running the following command outside the Docker container.
docker exec jupyterlab /scripts/list_jupyter.py
The above command tries to be smart in the sense that it first figures out the user that started the JupyterLab server and then queries the running Jupyter/Lab servers of that user. An equivalent but more specific command (if the Docker container was launched by the current user on the host) is as below.
docker exec -u $(id -un) jupyterlab /scripts/sys/list_jupyter.py
If you are inside the Docker container, then run the following command to get the tokens (and more information about the servers).
/scripts/list_jupyter.py
Or equivalently if the Jupyter/Lab server is launched by the current user,
/scripts/sys/list_jupyter.py
To sum up,
most of the time you can rely on /scripts/list_jupyter.py
to find the tokens of the running Jupyter/Lab servers,
no matter whether you are root or the user that launched the Docker/JupyterLab server,
and no matter whether you are inside the Docker container or not.
Yet another way to get information of the running JupyterLab server
is to check the log.
Please refer to the section
Debug Docker Containers
for more information.
Add a New User for JupyterHub
By default,
any user in a Docker container of dclong/jupyterhub-*
can visit the JupyterHub server.
So if you want to grant access to a new user,
just create an account for them in the Docker container.
Please refer to
Add a New User Inside a Docker Container
for how to create a new user inside a Docker container.
Easy Install of Other Kernels
Install and configure PySpark for use with the Python kernel.
icon spark -ic && icon pyspark -ic
Install the evcxr Rust kernel.
icon evcxr -ic
Install the Almond Scala kernel.
icon almond -ic
Install the ITypeScript kernel.
icon its -ic
Many other software packages/tools can be easily installed using icon.
Debug Docker Containers
You can change the option docker run -d ...
to docker run -it ...
to show logs of processes in the Docker container which helps debugging.
If you have already started a Docker container using docker run -d ...
,
you can use the command
docker logs
to get the log of the container
(which contains logs of all processes in it).
Use Spark in JupyterLab Notebooks
It is suggested that you use the Almond Scala kernel. I will gradually drop support of the BeakerX Scala kernel in my Docker images.
PySpark - pyspark and findspark
To use PySpark in a container of the Docker image dclong/jupyterhub-ds,
you need to install Spark and the Python package pyspark first,
which can be achieved using the following commands.
icon spark -ic --loc /opt/
icon pyspark -ic
Follow the steps below to use PySpark after it is installed.
-
Open a JupyterLab notebook with the Python kernel from the launcher.
-
Find and initialize PySpark.
import findspark
# A symbolic link of the Spark Home is made to /opt/spark for convenience
findspark.init("/opt/spark")
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('PySpark Example').enableHiveSupport().getOrCreate()
-
Use Spark as usual.
df1 = spark.table("some_hive_table")
df2 = spark.sql("select * from some_table")
...
Remote Connection to Desktop in the Container
If you are running a Docker container with a desktop environment (dclong/lubuntu*
or dclong/xubuntu*
),
you can connect to the desktop environment in the Docker container using NoMachine.
- Download the NoMachine client from https://www.nomachine.com/download.
- Install the NoMachine client on your computer.
- Create a new connection from your computer to the desktop environment in the Docker container using the NX protocol and port 4000. You will be asked for a user name and password. By default, the name of the user who started the Docker container on the host machine is used as both the user name and the password in the Docker container.
List of Images and Detailed Information
- OS: Ubuntu LTS
- Time Zone: US Pacific Time
- Desktop Environment: None
- Remote Desktop: None
- Python 3.10.x
- JupyterLab: 3.5.x
- JupyterHub: latest stable version
- Julia: stable
- OpenJDK: 11
- Maven: 3.6.x
- Go
- Rust
- JavaScript / TypeScript
- code-server: 4.9.x
- Python packages:
  - loguru pysnooper
  - numpy scipy pandas pyarrow
  - scikit-learn lightgbm
  - graphviz matplotlib bokeh holoviews[recommended] hvplot
  - tabulate
  - JPype1 sqlparse
  - requests[socks] lxml notifiers
  - dsutil
Build my Docker Images
My Docker images are built automatically by a GitHub Actions workflow in the GitHub repository docker_image_builder.
Tips for Maintaining Docker Images (for My own Reference)
- Do NOT update the latest tag until you have fully tested the corresponding next tag.
- Do NOT add new features or tools unless you really need them.
- It is generally a good idea to pin non-stable packages to a specific (working) version or to a range of versions that is unlikely to break.
- If you REALLY have to update a Bash script in a Docker image, do not update it in the GitHub repository directly and then build the Docker image to test whether it works. Instead, make a copy of the Bash script outside the Docker container, update it, and mount it into the container to test whether it works. If the updated Bash script works as expected, then go ahead and update it in the GitHub repository.
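The last tip (mount an edited copy of a script instead of rebuilding the image) could look roughly like the sketch below. The function name mount_script_cmd is hypothetical. It only prints a docker command for review rather than running it, and the -e options described in Start a Container Manually are omitted for brevity.

```shell
# Hypothetical helper: print (not run) a docker command that mounts a local,
# edited copy of a script over the copy baked into the image.
# $1: absolute path of the edited script on the host
# $2: path of the script inside the container
# $3: image to test against
mount_script_cmd() {
    printf 'docker run -it -v %s:%s %s %s\n' "$1" "$2" "$3" "$2"
}

mount_script_cmd "$PWD/init.sh" /scripts/sys/init.sh dclong/jupyterhub-ds:next
```

Because the bind mount shadows the script inside the container, the image itself stays untouched, and nothing needs to be committed to the GitHub repository until the edited script works.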
On Failure of GitHub Actions Workflow for Building Docker Images
- If the Docker image building workflow fails due to network issues, rerunning the failed pipelines in GitHub Actions might not help (as the network issue is likely due to problematic network nodes, and retrying failed pipelines sends the jobs to the same nodes). In such situations, it is better to trigger a new run of the workflow.
Known Issues
- NeoVim fails to work if a Docker image (with NeoVim installed) is run on a Mac with the M1 chip, even if you pass the option --platform linux/amd64 to docker run. A possible fix is to manually uninstall NeoVim using the following command
wajig purge neovim
and then install Vim instead.
wajig install vim
- The command wajig fails to cache the password if a Docker image (with wajig installed) is run on a Mac with the M1 chip, even if you pass the option --platform linux/amd64 to docker run. Fortunately, this is not a big issue.
- There is an issue with the dclong/xubuntu* Docker images due to Xfce on Ubuntu 18.04. It is suggested that you use the corresponding dclong/lubuntu* Docker images instead (which are based on LXQt) if a desktop environment is needed.