Hacking for Science

Block 3, Session 1: A Glimpse of DevOps

Matt Bannert (@whatsgoodio)

Today’s Goals

  • Catch a Glimpse of Development and Operations
  • Terminology & Context
  • Get an idea of suitable infrastructure for team projects

Resources and What to Look For

Size

Persistency

Availability & Exposure

Data Science Webserver Example

Common Servers

Name            Common ports  Description
Apache          80, 443       Basic Webserver
nginx           80, 443       Webserver, Reverse Proxy
Postgres        5432          Database
RStudio Server  8787          RStudio made available through a web browser
Shiny Server    3838          A webserver for Shiny apps
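
To see which of these services is actually listening on a machine, it can help to probe the default ports. The following is a minimal sketch, assuming bash and the coreutils `timeout` command are available; the function name `port_open` and the target host `localhost` are illustrative choices, not part of the original slides.

```shell
#!/usr/bin/env bash
# port_open HOST PORT: succeed (exit 0) if a TCP connection can be opened.
# Relies on bash's /dev/tcp pseudo-device, so no extra tools are needed.
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Probe the default ports from the table above on the local machine.
for entry in "Apache/nginx:80" "Apache/nginx:443" "Postgres:5432" \
             "RStudio Server:8787" "Shiny Server:3838"; do
  name="${entry%%:*}"   # text before the colon
  port="${entry##*:}"   # text after the colon
  if port_open localhost "$port"; then
    echo "$name: port $port is open"
  else
    echo "$name: port $port is closed"
  fi
done
```

A port reported as closed only means nothing accepted a connection there; the service may still be installed but stopped, or configured to use a non-default port.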

Reproducibility

Hosting Options

Do not forget that hosting requires a development strategy.

On Premise (in house)

SaaS (software as a service) / Serverless

Advantages

  • Hassle-free hosting
  • Easier onboarding of non-hackers
  • Transparent pricing models

Disadvantages

  • Black box
  • Vendor lock-in, depending on pricing model and software
  • Relatively expensive per unit

The Cloud

Common Cloud Products

Basic VMs

e.g., Google Compute Engine, Microsoft Azure Virtual Machines

Single Purpose Environments

Docker hosts, e.g., Google Kubernetes Engine, Azure Kubernetes Service (AKS)

Ready Made Services

e.g., AI & machine learning products such as Google Cloud AutoML, or hosted SQL databases

Containers - Building Blocks of Modern Infrastructure

"A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another."

– docker.com, "What is a Container?"

Excursion: Docker in one Slide

  • Single-purpose, application-focused virtualization
  • Images: blueprints for containers
  • Registries: store images
  • Dockerfiles: text-based configs from which images are created
  • Images can be stacked, so we can build on existing images
  • Docker containers run on a Docker host / Docker Desktop or in a cluster such as Docker Swarm or Kubernetes
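
The terms in the list above map directly onto a handful of Docker CLI commands. The sketch below is a minimal illustration, assuming Docker is installed and Docker Hub is reachable; the function name `docker_lifecycle_demo` is hypothetical, and the commands are wrapped in a function so that nothing is downloaded or started until it is called.

```shell
#!/usr/bin/env bash
# Walk through the registry -> image -> container life cycle.
# Nothing runs until the function is called explicitly.
docker_lifecycle_demo() {
  # fetch an image from a registry (Docker Hub by default)
  docker pull rocker/r-ver:4.2.0
  # list the local copies of that image
  docker images rocker/r-ver
  # start a container from the image, run one command, remove the container
  docker run --rm rocker/r-ver:4.2.0 R --version
  # list all containers, including stopped ones
  docker ps -a
}
```

Calling `docker_lifecycle_demo` in a terminal runs the four steps in order; the first call takes longest because the image has to be downloaded.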

A Basic Dockerfile: R with Postgres Driver

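# R base image; "deps" names this build stage
FROM rocker/r-ver:4.2.0 AS deps
# System libraries needed to compile R database and web packages
# (libpq-dev provides the PostgreSQL client headers)
RUN apt-get update && apt-get install -y \
    libpq-dev \
    libcurl4-openssl-dev \
    libxml2-dev \
    libssl-dev \
    libssh-dev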

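Build the image, then start and log into the container…

# build the image from the Dockerfile in the current directory, tag it pgr
docker build -t pgr .

# start the container in the background
docker run -it -d --name pgr_container pgr

# log into the running container, use bash
docker exec -it pgr_container /bin/bash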

What Is Docker Good for?

Test Stack, Conserve Setups

e.g., try out a specific PostgreSQL version from Docker Hub

Run Applications w/o Side Effects

Develop @home, Run in the Cloud

Two Docker Based Examples

Shiny Server

docker run --rm -p 1234:3838 rocker/shiny

Postgres Server

docker run --rm --name pg-docker -e POSTGRES_PASSWORD=postgres -d -p 1111:5432 \
  -v /local/path:/var/lib/postgresql/data postgres:11

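-d: run detached (in the background), so the terminal stays available

-e: pass an environment variable to the container, in this case a password

-p: port forwarding: host port:container port

-v: mount a host directory (absolute path) into the container for persistent storage: host path:container path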

(!): To make the two containers talk to each other, consider Docker Compose.
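
As a rough illustration of that last point, the two `docker run` commands above could be expressed as a single Compose file. This is a minimal sketch, not a production setup: the service names `db` and `shiny` and the local folder `./pgdata` are assumptions made for this example, and the password is the same demo value used above.

```shell
#!/usr/bin/env bash
# Write a minimal Compose file that declares both containers as services.
cat > docker-compose.yml <<'EOF'
services:
  db:
    image: postgres:11
    environment:
      POSTGRES_PASSWORD: postgres   # demo password; change it for real use
    ports:
      - "1111:5432"                 # host port:container port
    volumes:
      - ./pgdata:/var/lib/postgresql/data   # hypothetical local folder
  shiny:
    image: rocker/shiny
    ports:
      - "1234:3838"
    depends_on:
      - db                          # start the database first
EOF

echo "Wrote docker-compose.yml; start both services with: docker compose up -d"
```

Compose places both services on a shared network, so an app running in the `shiny` container can reach the database under the host name `db` on port 5432 (the container port, not the 1111 mapping used from the host). `docker compose down` stops and removes both containers again.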