Nick Cox authored
© Copyright 2021 European Space Agency, 2021
This file is subject to the terms and conditions defined 
in file 'LICENSE.txt', which is part of this 
[source code/executable] package. No part of the package, 
including this file, may be copied, modified, propagated, 
or distributed except according to the terms contained in 
the file ‘LICENSE.txt’.

JupyterLab Notebook demo datalab

This repository provides an example of how to create an ESA Datalabs JupyterLab datalab from an external git repository containing Jupyter notebooks. The example assumes the repository has no Dockerfile. TBC: expert users can include their own Dockerfile (e.g. to add specific Jupyter extensions); restrictions are still to be defined.

Repository structure

A typical simple repository example is:

  • notebook_01.ipynb --> jupyter notebook: simple GPU notebook.
  • notebook_02.ipynb --> jupyter notebook: simple notebook to explore data products (FITS).
  • fits_info.py --> user provided custom python module (imported in the notebook_02.ipynb)
  • data/ --> folder that could contain some data included in the datalab image (data can also be mounted on /media/data/ at container runtime)
  • requirements.txt --> required python modules to be installed
  • datalab-meta.yml --> pre-filled metadata key:value pairs (used to create the datalab)

The content of the repository is copied into /media/notebooks/datalab_xxxx/ (see below)
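
As a quick sanity check before creating a datalab, the layout described above can be verified with a few lines of Python. This is only a sketch: the file names come from the listing above, but treating each of them as mandatory is an assumption made for illustration, and `check_repository` is a hypothetical helper, not part of ESA Datalabs.

```python
from pathlib import Path


def check_repository(repo_dir):
    """Return a list of human-readable problems found in a notebook repository."""
    repo = Path(repo_dir)
    problems = []

    # At least one notebook is expected at the top level of the repository.
    if not list(repo.glob("*.ipynb")):
        problems.append("no *.ipynb notebook found")

    # Either dependency file is acceptable (assumption: one of them is needed).
    dependency_files = ("requirements.txt", "environment.yml")
    if not any((repo / name).is_file() for name in dependency_files):
        problems.append("neither requirements.txt nor environment.yml found")

    # Metadata file name as used in the listing above.
    if not (repo / "datalab-meta.yml").is_file():
        problems.append("datalab-meta.yml not found")

    return problems


if __name__ == "__main__":
    for problem in check_repository("."):
        print("WARNING:", problem)
```

Run from the root of the repository, the script prints one warning per missing item and nothing if the layout matches.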

Python dependencies

Include either requirements.txt (installed with pip) or environment.yml (installed with conda). See also https://github.com/binder-examples/. Note that with repo2docker, environment.yml takes precedence over requirements.txt.

The environment.yml file should list all Python libraries on which your notebooks depend, specified as though they were created using the following conda commands:

conda activate example-environment
conda env export --from-history -f environment.yml

The requirements.txt file should list all Python libraries that your notebooks depend on, and they will be installed using:

pip install -r requirements.txt
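
To see whether the environment you are testing in already satisfies requirements.txt, the listed names can be compared against the installed distributions. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+). The helper names are hypothetical, and the parsing is deliberately simplified: it ignores version specifiers and does not handle URL or VCS requirements.

```python
import re
from importlib import metadata


def requirement_names(text):
    """Extract distribution names from the text of a requirements.txt file."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options such as "-r other.txt"
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_requirements(text):
    """Return the listed distributions that are not installed in this environment."""
    missing = []
    for name in requirement_names(text):
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    example = "numpy>=1.20  # arrays\nastropy==4.2\n"
    print("listed:", requirement_names(example))
    print("not installed:", missing_requirements(example))
```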

Local testing

Use repo2docker to build and run a Docker image of the notebook repository (it generates the Dockerfile automatically). Both repo2docker and Docker need to be installed.

jupyter-repo2docker ./
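
If you prefer to drive the local build from a script, the same command can be wrapped in Python. This is a minimal sketch: `repo2docker_command` and `build` are hypothetical helpers, and the image name in the usage note is a placeholder. The `--image-name` and `--no-run` options are repo2docker command-line flags; the script does not check that a Docker daemon is running.

```python
import shutil
import subprocess


def repo2docker_command(repo=".", image_name=None, run=True):
    """Build the jupyter-repo2docker command line as a list of arguments."""
    cmd = ["jupyter-repo2docker"]
    if image_name:
        cmd += ["--image-name", image_name]
    if not run:
        cmd.append("--no-run")  # build the image only, do not start a container
    cmd.append(repo)
    return cmd


def build(repo=".", image_name=None, run=True):
    """Run repo2docker, failing early if the executable is not on the PATH."""
    if shutil.which("jupyter-repo2docker") is None:
        raise RuntimeError("jupyter-repo2docker not found on PATH")
    subprocess.run(repo2docker_command(repo, image_name, run), check=True)
```

For example, `build(".", image_name="my-datalab-test", run=False)` would only build the image, which is useful for checking that the dependencies install cleanly.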

Dockerfile

The Docker image created by the "Create Lab" functionality is based on the pre-approved base image for Jupyter labs, sepp/jupyterlab_base:latest.

A minimal Dockerfile would look like this (see the example Dockerfile.example):

(TO BE VERIFIED)

    # use one of the pre-approved base images
    FROM sepp/jupyterlab_base:latest
    # --> is this required?
    ENV DEBIAN_FRONTEND=noninteractive
    COPY ./requirements.txt /tmp/
    # --> jupyterlab==2.2.9: required version?? should this be included in requirements.txt
    RUN apt-get update \
     && apt-get install -y python3-pip python3-dev \
     && pip3 install --upgrade pip \
     && pip3 install jupyterlab==2.2.9 \
     && pip3 install -r /tmp/requirements.txt
    # --> Copy by default the entire repository to this folder?
    COPY ./ /media/notebooks/
    # --> need to remove some files??
    RUN rm /media/notebooks/lab-meta.yml

Creating and publishing your datalab

To create a new Jupyter Notebook based datalab from the repository prepared above, follow these steps:

  1. Sign in to ESA Datalabs
  2. Go to your developer area (Flask icon)
  3. Click the "create datalab" button to start the datalab creation process
  4. In step 1 of the datalab creation, enter the URL of this git repository
  5. In step 2 of the datalab creation, select the "Jupyter Notebook" datalab base image / Dockerfile template
  6. Check that the Python module requirements and metadata have been imported correctly (steps 2 and 3 of the datalab creation). Update them if needed (e.g. for mandatory metadata)
  7. Click the "Create" button to start the datalab build
  8. Once the datalab has built successfully and no security issues are found, go to the "Action" view to test your datalab (run it using the green play button)
  9. Update your repository and repeat steps 4-8 if you want to make changes to your datalab (metadata and required Python modules can be edited directly in the interface)

Validating and publishing your datalab

Once you're happy with your datalab (see previous section), you're ready to have it validated and published in the ESA Datalabs datalab catalogue. Follow these steps:

  1. Submit your datalab for validation in the "validate" tab of the datalab action view (click the 'ask for validation' button). Your datalab is now sent into moderation and the 'validate' tab shows the current status of the validation process
  2. Await the email notification with the result of the validation (while under moderation the datalab status is "in validation")
  3. If your datalab status is 'validated', you can proceed to publish it and share it with other users (go to step 5 below)
  4. If your datalab status is 'draft', review the rejection report and make all necessary corrections to your datalab (once done, you can resubmit your datalab for validation; step 1 above)
  5. From the 'developer area' select your datalab and select the 'share' tab from the menu
  6. Now you can share the datalab with everyone (public) or with selected users/groups
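
The validation steps above can be read as a small state machine. The sketch below is only an illustration of that reading, not how ESA Datalabs implements it: the status strings are the ones quoted in the steps, while the event names (`ask_for_validation`, `accept`, `reject`), the helper functions, and the assumption that a new datalab starts as 'draft' are hypothetical.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    IN_VALIDATION = "in validation"
    VALIDATED = "validated"


# (current status, event) -> next status; event names are illustrative only
TRANSITIONS = {
    (Status.DRAFT, "ask_for_validation"): Status.IN_VALIDATION,
    (Status.IN_VALIDATION, "accept"): Status.VALIDATED,
    (Status.IN_VALIDATION, "reject"): Status.DRAFT,
}


def next_status(status, event):
    """Return the status that follows `event`, or raise if it is not allowed."""
    try:
        return TRANSITIONS[(status, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not allowed in status {status.value!r}"
        ) from None


def can_share(status):
    """Sharing (steps 5 and 6) is only possible once the datalab is validated."""
    return status is Status.VALIDATED
```

A rejected datalab returns to 'draft', so the only path to sharing runs through a successful validation.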