# HOWTO install ML apps

This HOWTO is about installing ML apps _under Ubuntu LTS_, and in particular it is limited to _TensorFlow_ and _PyTorch_.

## Common information

Typically ML models are created and trained using some kind of ML framework library (TensorFlow, PyTorch, Caffe/Caffe2, ...).

* Please use only [PIP](howtoSwPip.md) as suggested to install the frameworks; unless you have very old or very new models and code, you don't need Python virtual environments or Anaconda environments.
* In general currently (early 2020) please use Python **3.6**, as it is the version that is most compatible across Ubuntu versions and frameworks.

Most frameworks can run most of their computations either on GPUs or entirely on CPUs, which is usually much slower; but many computations still run on the CPU even if a GPU is available, as GPUs are much faster only for a narrow set of computations, usually those with a lot of FMA (fused multiply-add) operations.

The GPU versions of both frameworks often need great care in streaming data from CPU memory to GPU memory and back. Often copying data from CPU memory to GPU memory is much faster than from GPU memory to CPU memory.

### GPU versions of frameworks

The situation here:

* We don't use (yet) AMD cards and libraries like MIOpen or ROCm.
* The NVIDIA driver, CUDA and cuDNN are preinstalled on every system in a shared location, in versions that should be compatible with most recent versions of TensorFlow and PyTorch. These versions are somewhat negotiable, so if you need a different version please ask.
* Please don't install your own versions of CUDA or cuDNN using Anaconda etc.

The ML frameworks are usually the top element of a stack of software libraries that looks like this:

* ML framework.
* Python 3.
* cuDNN or [MIOpen](https://rocm.github.io/miopen.html).
* CUDA or [ROCm](https://rocm.github.io/dl.html).
* `nvidia-uvm`+`nvidia` driver or `amdgpu` driver.

### CPU versions of frameworks

For CPU oriented computations the frameworks usually depend mostly on the `numpy` Python module, which in turn depends on various compiled linear algebra libraries, but they sometimes directly use those installed on the system, usually various implementations of the BLAS API.

* There are several ways to compile the C/C++ shared objects that are part of most frameworks (and of `numpy`), as there are many flavours of CPUs. The precompiled versions in the PIP repositories are not compiled for all possible CPU types, and sometimes they need to be compiled from sources for best performance on a specific CPU model.
* The CPU models that have the AVX extended instructions are usually much faster at training models than older ones (often by a factor of twenty).
* A CPU with 32 cores and AVX is often only five to six times slower than a top-end GPU.
* The difference in speed between a mid-range and a top-end GPU of the same generation is usually not enormous (less than a factor of two); the bigger difference is that NVIDIA restricts mid-range GPUs to smaller memory sizes, which sometimes matters more.

## TensorFlow

There are two major editions of [TensorFlow](https://www.tensorflow.org/overview), TensorFlow 1 and TensorFlow 2; TensorFlow 2 is much more flexible and is the recommended one. Each comes in two packaged variants, `tensorflow_gpu` and `tensorflow`, where the latter does not do GPU acceleration. A quick way to check an installation is sketched below.

* TensorFlow is usually used via [Keras](https://www.tensorflow.org/guide/keras/overview).
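
A minimal sketch of a post-installation check, assuming a TensorFlow 2 (2.1 or later) installation done with PIP as suggested above: it prints the version, lists the GPUs TensorFlow can see, and builds a tiny Keras model to confirm the stack works end to end.

```python
# Minimal post-install check for TensorFlow 2 (a sketch, not part of the
# official installation procedure): print the version, list visible GPUs,
# and build a tiny Keras model. list_physical_devices needs TF >= 2.1.
import tensorflow as tf
from tensorflow import keras

print("TensorFlow version:", tf.__version__)
print("Visible GPUs:", tf.config.list_physical_devices("GPU"))

# A tiny Keras model, only to confirm that the Python/TensorFlow stack works.
model = keras.Sequential([
    keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.summary()
```

If the list of visible GPUs is empty on a machine with an NVIDIA card, the CPU-only `tensorflow` variant is probably installed instead of `tensorflow_gpu`, or that TensorFlow release does not match the preinstalled CUDA/cuDNN versions.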
### TensorFlow GPU version

* The GPU version also shares the aspects of the CPU version, because not every computation is transferred to the GPU.
* Each release depends on a specific range of CUDA versions, which in turn depends on a specific range of NVIDIA driver versions.

### TensorFlow CPU version

## [PyTorch](https://pytorch.org/)

This is the Python equivalent of the Torch C++ framework, which is rarely used. The Python module however is `torch`, not `pytorch`. PyTorch is usually installed with the `torchvision` package, which contains some higher level APIs for ML image/video processing. A quick check of an installation is sketched below the list.

* List of [PyTorch installation "wheels"](https://download.pytorch.org/whl/torch_stable.html).
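
As with TensorFlow, a minimal sketch of a post-installation check, assuming `torch` and `torchvision` were installed with PIP from the wheel list above: it prints the versions, reports whether the shared CUDA installation is visible, and exercises a small CPU-to-GPU-to-CPU copy of the kind discussed in the common information section.

```python
# Minimal post-install check for PyTorch/torchvision (a sketch): print the
# versions, report CUDA availability, and do a small GPU round trip.
import torch
import torchvision

print("PyTorch version:", torch.__version__)
print("torchvision version:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())

if torch.cuda.is_available():
    print("CUDA version the wheel was built for:", torch.version.cuda)
    print("GPU 0:", torch.cuda.get_device_name(0))

    # Copy a small tensor from CPU memory to GPU memory, compute there,
    # and copy the result back.
    x = torch.randn(64, 64)
    y = x.to("cuda")
    z = (y @ y).to("cpu")
    print("Round-trip result shape:", tuple(z.shape))
```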