A one-click Jupyter install example for the GLAM Workbench

Hi all,

I’ve been working on providing a one-click Reclaim Cloud installation option for users of the GLAM Workbench. I thought I’d share it here in case it’s of use to others. (With thanks to @psychemedia for his initial experiments with Jupyter, documented here.)

The GLAM Workbench consists of about 40 GitHub repositories, each containing a collection of Jupyter notebooks. The repositories include configuration files that mean they can be spun up using Binder. Binder is a great service, but it has limitations – in particular, the environments it creates are not persistent, so any data you gather or notebooks you modify will not be saved. It’s a great environment for exploration, but what happens when users want to take the next step and do serious and sustained work? It’s a big jump from using Binder in the cloud to setting up Python/Jupyter on your own computer. Enter Reclaim Cloud!

My plan is to provide a one-click installer that spins up a fully operational environment in Reclaim Cloud for Workbench users who want a persistent environment and don’t mind paying for it. I think this will nicely fill the gap for users who want to do serious work, but don’t quite have the confidence/experience to manage a local setup.

I’ve currently got this working in the Trove Newspaper Harvester repository. You’ll see that there’s a ‘Launch Reclaim Cloud’ button, just like the ‘Launch Binder’ one.

There are a few components needed to make this happen.

I’m using the repo2docker GitHub action to generate a Docker image when I push changes to the main branch. Binder uses repo2docker behind the scenes, so I don’t need to supply any extra configuration to make this work: it just reads the Binder config files (requirements.txt and postBuild). This action also uploads the image to Docker Hub.

When you click the ‘Launch Reclaim Cloud’ button, you send the file reclaim-manifest.jps to Reclaim Cloud. This file points to the latest Docker image on Docker Hub, and configures the Reclaim environment. This file uses the Jelastic Cloud Scripting language.

There were a couple of tricky things that took a while to work out. I wanted to ask users to set a password for Jupyter on installation, so I had to add a password field to the installation dialogue, encode the password, and then feed that encoded password to Jupyter. I also wanted to change the entry point command to launch Jupyter Lab, rather than the classic notebook interface. Finally, I wanted to open Lab and display a default ‘index.md’ page. All that is accomplished here:

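onInstall:
    # Hash the password entered in the install dialogue; the output ends up in ${response.out}
    - cmd[cp]: python3 -c "from notebook.auth import passwd; print(passwd('${settings.jupyterPassword}', 'sha1'))"
    - api:
        # Change the container's run command so it starts Jupyter Lab with the hashed password
        - method: environment.control.SetContainerRunCmd
          params:
            nodeId: ${nodes.cp[0].id}
            data: "jupyter lab --ip 0.0.0.0 --NotebookApp.password='${response.out}' --LabApp.default_url='/lab/tree/index.md'"
        # Restart the node so the new run command takes effect
        - method: environment.control.RestartNodes
          params:
            nodeGroup: cp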

It wasn’t at all obvious from the Jelastic documentation how to get the output of the python command (it’s just ${response.out}). Nor was it obvious that I had to use the Jelastic API to change the run command and restart the node, but I got there in the end.
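For anyone wondering what actually gets passed to Jupyter, here’s a minimal Python sketch of the two steps above: hashing the password and building the run command. It assumes the sha1 output of notebook.auth.passwd has the form algorithm:salt:hexdigest, with the salt appended to the passphrase before hashing; that’s my understanding of the format rather than something the manifest guarantees. The function names (hash_password, check_password, build_run_command) are made up for illustration and aren’t part of Jupyter or Jelastic.

```python
import hashlib
import secrets
import shlex


def hash_password(passphrase: str, algorithm: str = "sha1") -> str:
    """Return 'algorithm:salt:hexdigest' (assumed layout of a Jupyter password hash)."""
    salt = secrets.token_hex(6)  # 6 random bytes -> 12 hex characters
    digest = hashlib.new(algorithm, (passphrase + salt).encode("utf-8")).hexdigest()
    return f"{algorithm}:{salt}:{digest}"


def check_password(hashed: str, passphrase: str) -> bool:
    """Recompute the digest from the stored salt and compare it to the stored digest."""
    algorithm, salt, digest = hashed.split(":", 2)
    expected = hashlib.new(algorithm, (passphrase + salt).encode("utf-8")).hexdigest()
    return secrets.compare_digest(expected, digest)


def build_run_command(hashed: str, default_url: str = "/lab/tree/index.md") -> str:
    """Assemble the same 'jupyter lab ...' command line used in the manifest."""
    args = [
        "jupyter", "lab", "--ip", "0.0.0.0",
        f"--NotebookApp.password={hashed}",
        f"--LabApp.default_url={default_url}",
    ]
    # shlex.quote only adds quotes where the shell would need them
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    hashed = hash_password("not-a-real-password")
    print(build_run_command(hashed))
```

The point is just that Jupyter only ever sees the salted hash, never the password the user typed into the install dialogue.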

So now I’ve got this working in one repository, my plan is to move ahead and add it to the other 40! I also need to add some additional documentation to the main GLAM Workbench site. But I’m pretty excited about this and what it adds. One of my main aims with the GLAM Workbench is to encourage researchers with limited digital skills to start playing around with GLAM data, and I think having the Reclaim Cloud option will really help!

[Ugh Discourse won’t let me include more than 2 links in a post, so I’ll try to add them in comments…]


Here are some more links:


And:


I’ve been a fan of the GLAM Workbench idea for ages. Great to see how you’ve worked this through to a pattern that makes a one-click, relatively straightforward set-up available on Reclaim Cloud :-)

–tony


I think this is amazing. I spun up Glam based on this script:
https://github.com/GLAM-Workbench/glam-workbench.github.io/discussions/17

I am realizing your reclaim-manifest.jps file is the final version of that, and this means we can make the GLAM tools 1-click installers in the Reclaim Cloud marketplace, which is something I am happy to do if that’s cool with you (figure the more ways at it the better).

Also, I really appreciate your work here because the documentation you did around GLAM allowed me to start wrapping my head around Jupyter Lab, and I may even play with Voyant here soon to make some sense of the 17,000 articles around VHS stores I now have copies of :)

Your work is pretty awesome, and we are honored you are doing a bit of it through Reclaim Cloud, thank you so much, and now for the long overdue blog post…


Thanks Jim! Happy to have them in the Marketplace, but all up there’ll be about 40 different manifests – one for each repo. Would you want all of them? What’s involved in adding them?

Good question, will need to chat with Tim given that may be overload, but are there one or two gateway drug installers that might get folks up and running with the most popular tool/s?

I could see a scenario where there’s a single app installer that has a dropdown selection of any of the 40 available tools to select from and it spins up the selected one? I’ve never done that sort of thing before but certainly something we could experiment with.

That sounds cool. We could certainly just start with a limited selection – probably a couple of Trove and the web archives ones.