Environments: Containers for Computation
An Environment is an operating system with a programming language and any packages needed to run an experiment. When creating a Tale, it is important to pick the Environment that is right for your experiment. Whole Tale provides a set of default Environments with RStudio and Jupyter Notebook, and also allows you to create your own.
The default Environment configurations are open source and can be found on the Whole Tale GitHub page.
The Jupyter Notebook Environment runs on Ubuntu Core 14.04, which includes CUDA and Theano. It is meant to be a minimal, high-performance Python Environment that can be extended by installing packages from a notebook cell with:
! pip install --user <package>
Refer to the Advanced section below to learn how to find the Environment’s username and other parameters that may help.
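To check where `pip install --user` places packages, a notebook cell can query the Python standard library. The following is a minimal sketch, not part of Whole Tale itself; the function names `current_username` and `describe_user_install` are hypothetical and chosen only for illustration.

```python
import getpass
import site
import sys


def current_username():
    """Return the login name of the current user, or None if it cannot be determined."""
    try:
        return getpass.getuser()
    except (KeyError, OSError, ImportError):
        # Some containers run under a user ID with no account entry.
        return None


def describe_user_install():
    """Collect facts relevant to per-user (--user) package installs."""
    return {
        "username": current_username(),
        "python": sys.version.split()[0],
        # Directory that `pip install --user` installs into.
        "user_site": site.getusersitepackages(),
        # False or None means Python will not look in user_site.
        "user_site_enabled": bool(site.ENABLE_USER_SITE),
    }


info = describe_user_install()
for key, value in info.items():
    print(f"{key}: {value}")
```

If `user_site_enabled` is reported as false, packages installed with `--user` may not be importable until the interpreter is started with the user site directory enabled.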
Packages that are bundled with this Environment include:
- GNU Fortran
For more information and technical details, visit the GitHub repositories below
Jupyter Notebook with Spark
Whole Tale also caters to projects that use cluster computing with Apache Spark. The Jupyter with Spark Environment runs on Ubuntu 16.04.4 and comes bundled with:
- Jupyter Notebook 5.2.x
- Spark 2.2.0
- Hadoop 2.7
- Conda Python 3.x
- Conda R 3.3.x
- Scala 2.11.x
- Mesos Client 1.2
- R Packages: ggplot2 and RCurl
- Python Packages: pyspark, pandas, matplotlib, scipy, seaborn, and scikit-learn
This Environment is based on the all-spark-notebook image provided by Jupyter.
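To confirm that the Python packages listed above are present in a running Environment, a notebook cell can look each one up without importing it. This is a minimal sketch using only the standard library; the names `BUNDLED_IMPORT_NAMES` and `check_packages` are hypothetical, and the mapping assumes the usual import names for these packages (scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Package name as listed above -> module name used in `import` statements.
# Assumption: these are the conventional import names for each package.
BUNDLED_IMPORT_NAMES = {
    "pyspark": "pyspark",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "scipy": "scipy",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
}


def check_packages(import_names):
    """Map each package name to True if its module can be found here."""
    return {
        package: importlib.util.find_spec(module) is not None
        for package, module in import_names.items()
    }


status = check_packages(BUNDLED_IMPORT_NAMES)
for package, available in status.items():
    print(f"{package}: {'available' if available else 'missing'}")
```

Any package reported as missing can be added with the `pip install --user <package>` command described earlier.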
For more information and technical details, visit
The RStudio Environment runs on Debian 8.11 and includes the following packages:
- R 3.4.1
- GNU Fortran
Additional packages can be installed the usual way, for example by calling install.packages() from the R console.
The OpenRefine Environment runs on Ubuntu and includes:
- OpenRefine 2.8
- OpenJDK 8