name: inverse
layout: true
class: center, middle, inverse

---

# Virtualization

## How to set up the right computation environment

### ~60min

---
layout: false

## Outline
- ### Learning objectives
- ### Requirements
- ### Introduction
- ### Tools
- ### Exercises

---

## Learning objectives
- ### Why do we use virtualization?
- ### What are the various types of virtualization?
- ### How to use `venv` and `conda`?
- ### How to use `Docker`?

--

## Requirements
- ### Your computer: `venv`, `pip`, `conda`, `Docker`
- ### You: `shell` / `Terminal`

---
name: inverse
layout: true
class: center, middle, inverse

---

# Introduction

---
layout: false

### Virtualization technologies
- ### Isolate the computing environments
- ### Provide a mechanism to encapsulate environments in a self-contained unit that can run anywhere

---

### Why do we need virtualization?
### Science Reproducibility

- Each project in a lab depends on complex software environments
  - operating system
  - drivers
  - software dependencies: Python/MATLAB/R + libraries

--

- We try to avoid
  - the computer I used was shut down a year ago, can’t rerun the results from my publication...
  - the analyses were run by my student, have no idea where and how...
  - etc.

---

### Why do we need virtualization?
### Collaboration with your colleagues

- Sharing your code or using a repository might not be enough

--

- We try to avoid
  - well, I forgot to mention that you have to use Clang, gcc never worked for me...
  - don’t see any reason why it shouldn’t work on Windows...(I actually have no idea about Windows, but won’t say it...)
  - **it works on my computer...**
  - etc.

---

### Why do we need virtualization?
### Freedom to experiment!

- Universal Install Script from xkcd: *The failures usually don’t hurt anything...* And usually all your old programs work...
- We try to avoid
  - I just want to undo the last five hours of my life...

---
name: inverse
layout: true
class: center, middle, inverse

---

# Tools

---
layout: false

### Types of virtualization
#### Three main types

- Pure Python virtualization:
  - venv
  - conda
- Containers:
  - Docker
  - Singularity
- Virtual Machines:
  - VirtualBox
  - VMware
  - AWS, Google Compute Engine, ...

---

#### Main ideas
- Isolate the computing environment
- Allow reconstructing computing environments
- Allow sharing your computing environments
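
For the pure-Python case, these three ideas can be sketched with `venv` and `pip`. This is a minimal sketch, not a recommended workflow: it assumes Linux or macOS (on Windows the interpreter lives under `Scripts\` instead of `bin/`), a `python3` with the `venv` module available, and the names `my_project_env` and `requirements.txt` are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: directory and file names below are hypothetical.
set -euo pipefail

workdir="$(mktemp -d)"              # scratch directory, so nothing else is touched
env_dir="$workdir/my_project_env"

# 1. Isolate: create an environment with its own interpreter and site-packages
python3 -m venv "$env_dir"

# The interpreter inside the environment reports a prefix that differs
# from the base installation it was created from
inside_venv="$("$env_dir/bin/python" -c 'import sys; print(sys.prefix != sys.base_prefix)')"
echo "running inside a venv: $inside_venv"

# 2. Reconstruct: record the exact package versions installed in the environment
"$env_dir/bin/python" -m pip freeze > "$workdir/requirements.txt"

# 3. Share: keep requirements.txt next to your code; a colleague can then run
#      python3 -m venv my_project_env
#      my_project_env/bin/python -m pip install -r requirements.txt
```

Calling the environment's interpreter by its full path avoids having to activate the environment first; `source my_project_env/bin/activate` is the interactive equivalent.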

---

### Virtualization in Python - environments using `venv` and `conda`
#### Virtual environments

- keep the dependencies required by different projects in separate places
- allow you to work with specific versions of libraries or Python itself without affecting other Python projects

#### [venv](https://docs.python.org/3/library/venv.html)

- an environment manager for Python 3.4 and up
- preinstalled

#### [Conda](https://conda.io/docs/)

- A package manager and an environment manager
- `conda` is installed most quickly with [Miniconda](https://conda.io/miniconda.html). Miniconda installs only the most basic Python packages.
- If you want a conda environment with many useful packages already preinstalled, we recommend downloading [Anaconda](https://www.anaconda.com/download).

---

### How to use conda?

```bash
# Update conda
conda update conda

# List available Python versions
conda search "^python$"

# Create a Python 3.6 environment and install some basic packages
conda create -n my_new_py36 python=3.6 numpy scipy matplotlib

# Create a new environment from a yml file
conda env create --name neuro --file /path/to/file/environment.yml

# Export the conda environment (activate it first)
conda env export

# Remove unused packages and caches
conda clean --all --yes

# Activate the environment (conda >= 4.4; older versions used `source activate`)
conda activate my_new_py36

# Deactivate the current environment
conda deactivate

# List installed conda environments
conda env list

# Remove a conda environment
conda remove --name my_new_py36 --all
```

---

### Containers - Docker
- leading software container platform
- an open-source project
- **it runs on Linux, macOS and Windows**

--

- share the host system’s kernel with other containers
- each container gets its own isolated user space
- only bins and libs are created from scratch
- **containers are very lightweight and fast to start up**

--

#### Testing your Docker installation:

```bash
docker run hello-world
```

--

#### Interesting tutorials and blog posts:

- [A beginner friendly intro to VMs and Docker](https://medium.freecodecamp.com/a-beginner-friendly-introduction-to-containers-vms-and-docker-79a9e3e119b#.3giab6wvo)
- [Intro to Docker from Neurohackweek](https://neurohackweek.github.io/docker-for-scientists/)
- [Understanding Images](https://code.tutsplus.com/tutorials/docker-from-the-ground-up-understanding-images--cms-28165)

---

## Why is this so great?

- A Docker container is a closed environment and will use the exact same software versions and libraries, independent of the system it runs on (the kernel, and with it the hardware drivers, is shared with the host).

--

- [Docker Hub](https://hub.docker.com/): Free repositories to share Docker Images

--

- [Neurodocker](https://github.com/kaczmarj/neurodocker): Easy and quick way to generate custom Docker images for your neuroimaging environment

```bash
docker run \
  --rm kaczmarj/neurodocker:master generate docker \
  --base neurodebian:stretch-non-free \
  --pkg-manager apt \
  --install ants fsl \
  --add-to-entrypoint "source /etc/fsl/fsl.sh" \
  --spm12 version=dev \
  --user=neuro \
  --miniconda miniconda_version="4.3.31" \
    conda_install="python=3.6 jupyter numpy scipy matplotlib" \
    pip_install="nilearn https://github.com/nipy/nipype/tarball/master" \
    create_env="neuro" \
    activate=True \
  --workdir /home/neuro > Dockerfile
```

---

### Short Summary of Docker commands - Part 1

- Managing images:

```bash
# Download the Docker image 'ubuntu'
docker pull ubuntu

# Show all downloaded Docker images
docker images

# Remove images (takes image names or IDs as arguments)
docker rmi
```

- Running containers

```bash
docker run ubuntu echo "hello from your container"
```

- Use `-it` option to run interactively

```bash
docker run -it ubuntu bash
```

---

### Short Summary of Docker commands - Part 2

- Managing containers

```bash
# List currently running containers
docker ps

# List all containers, including stopped ones
docker ps -a

# Remove containers (takes container names or IDs as arguments)
docker rm

# Remove all stopped containers (running ones are refused with an error)
docker rm $(docker ps -a -q)
```

- Use `--rm` option to automatically remove the container when it exits

```bash
docker run -it --rm ubuntu
```

- Export and import Docker images

```bash
# Export a Docker image to a tar archive
docker save -o ubuntu_container.tar ubuntu

# Import the Docker image on another PC
docker load --input ubuntu_container.tar
```

---

### Short Summary of Docker commands - Part 3

- Mount one (or many) folders from your own OS inside the Docker container

```bash
# Mount local 'my_folder' inside the Docker container
# Note: /my_folder is created inside the container if it doesn't exist;
# if it already exists, its contents are hidden by the mounted folder
docker run -it --rm -v /path/to/my_folder:/my_folder ubuntu

# Mount the folder in read-only mode
docker run -it --rm -v /path/to/my_folder:/my_folder:ro ubuntu
```

--

- for more on Docker, please have a look at this [one-day workshop](https://github.com/PeerHerholz/docker_workshop)

---

### Containers - Docker and Singularity
#### Docker:

- Docker can escalate privileges, so a user can effectively act as root on the host system
- for this reason it is usually not supported by administrators of HPC centers

--

#### Singularity:

- a container solution created for scientific and application-driven workloads
- supports existing and traditional HPC resources
- a user inside a Singularity container is the same user as outside the container
- but you can use Vagrant to create a container (you have root privileges on your VM!)
- can run (and modify!) existing Docker containers

```bash
sudo singularity shell --writable
```

- a running VM is required on macOS
- [Satra's presentation](http://satra.cogitatum.org/om-images/)
- [other tutorials](http://singularity.lbl.gov/tutorials)

---

### Virtualization - Virtual Machines vs Containers
--

**Virtual Machines**

- emulate whole computer system (software+hardware)
- run on top of a physical machine using a *hypervisor*
- *hypervisor* shares and manages hardware of the host and executes the guest operating system
- guest machines are completely isolated and have dedicated resources

---

### Virtualization - Virtual Machines vs Containers
--

**Docker**

- share the host system’s kernel with other containers
- each container gets its own isolated user space
- only bins and libs are created from scratch
- **containers are very lightweight and fast to start up**

---
name: inverse
layout: true
class: center, middle, inverse

---

# Questions?