I have more experience with the `Pandas` and `Scikit-Learn` Python libraries than with `Tensorflow`. I was surprised by how large the Tensorflow ecosystem is, with all its ML engineering extensions.

## What is Tensorflow?

Tensorflow is best known as a framework for building artificial neural networks, but you can do any kind of numerical computation with it.

The framework has all the tools required to set up ML training and serving pipelines. Many of its design patterns enable efficient parallel processing. This makes Tensorflow well suited for demanding industrial-scale projects.

## Tensorflow and Google

Tensorflow is an open source project backed by Google.

The tech company has a logical business reason to grow the Tensorflow user base: you might end up using more expensive GPUs, or even Tensor Processing Units (TPUs), which Google developed for large matrix operations.

## Tensorflow and Nvidia

Official Tensorflow GPU support targets CUDA-enabled Nvidia GPUs.

`TensorRT` is a high-performance deep learning inference SDK from Nvidia.

## Tensorflow execution logic

An execution in Tensorflow is defined by a `DAG` (directed acyclic graph). The nodes are mathematical operations, and they are connected by edges. In essence, it is a process chart.

Tensorflow has an **eager mode** that helps with debugging operations one by one during development. In production it is better to use graphs.

The model information is stored in a `tf.Graph`. The `tf.function` decorator turns a Python function into operations in the graph. The graph executes a set of operations, each represented by a `tf.Operation`. A `tf.Tensor` is the unit of data flowing between the operations.

Tracing lets you record TensorFlow Python operations in the graph.
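As a minimal sketch of tracing, the snippet below wraps a small function in `tf.function` and then lists the operations recorded in its graph. The function name `scaled_sum` and the input values are made up for illustration.

```python
import tensorflow as tf


@tf.function
def scaled_sum(x, y):
    # The Python body is traced into a graph for each new input signature;
    # later calls with the same signature reuse the recorded graph.
    return tf.reduce_sum(x * 2.0 + y)


x = tf.constant([1.0, 2.0])
y = tf.constant([3.0, 4.0])
result = scaled_sum(x, y)  # (1*2 + 3) + (2*2 + 4) = 13

# Inspect the graph that tracing produced for these inputs.
concrete = scaled_sum.get_concrete_function(x, y)
op_types = [op.type for op in concrete.graph.get_operations()]
print(float(result), op_types)
```

Calling the function without the decorator would run the same operations eagerly, one by one, which is usually easier to debug.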

## Layers of Tensorflow

Tensorflow layer | Description |
---|---|
Hardware | CPU, GPU, TPU, Android |
C++ API | Core Tensorflow |
Python API | Core Tensorflow |
Components | `tf.losses`, `tf.metrics`, `tf.optimizers` etc. |
High-level API | `tf.estimator`, `tf.keras`, `tf.data` etc. |

Read more about Keras.

## What is a tensor

Tensor rank | Data type | Example |
---|---|---|
0 | Value / Scalar | `4` |
1 | Vector (list) | `[4, 5]` |
2 | Matrix (table) | `[[4,5], [5,6]]` |
3 | 3-dimensional array (cube) | `[ [[4,5], [5,6]], [[6,7], [7,8]] ]` |

Tensor rank is equal to the number of dimensions. A good rule of thumb: the number of opening brackets at the beginning of the literal is the number of dimensions.
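The ranks in the table can be checked directly. This is a small sketch that builds the four example tensors and prints the rank and shape of each; the variable names are only for illustration.

```python
import tensorflow as tf

scalar = tf.constant(4)
vector = tf.constant([4, 5])
matrix = tf.constant([[4, 5], [5, 6]])
cube = tf.constant([[[4, 5], [5, 6]], [[6, 7], [7, 8]]])

examples = (scalar, vector, matrix, cube)
ranks = [t.shape.rank for t in examples]

for t in examples:
    # shape.rank is the number of dimensions, shape lists the size of each one
    print(t.shape.rank, t.shape)
```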

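A tensor can be created like this:

```
import tensorflow as tf

# Immutable tensor: its values cannot be changed after creation
tf.constant([4, 5])
# Mutable tensor: its values can be updated later, e.g. with assign()
tf.Variable([4, 5])
```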

Use `tf.where` to pick out only the elements that match a condition. It works similarly to `where` in `Numpy`. `tf.stack` can, for example, combine vectors (columns) into a matrix (table).
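Here is a short sketch of both functions, assuming two made-up column vectors `a` and `b`. `tf.stack` places them side by side as columns, and `tf.where` keeps the values above a threshold and replaces the rest with zeros.

```python
import tensorflow as tf

a = tf.constant([4, 5])
b = tf.constant([6, 7])

# Stack the two vectors as columns of a 2x2 matrix
stacked = tf.stack([a, b], axis=1)  # [[4, 6], [5, 7]]

# Keep values greater than 5, set everything else to 0
filtered = tf.where(stacked > 5, stacked, tf.zeros_like(stacked))  # [[0, 6], [0, 7]]

print(stacked.numpy())
print(filtered.numpy())
```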

Read more in the introduction to tensors.

## Read data in Tensorflow

Datasets can be created from tensors with the `tf.data.Dataset` API (among others):

```
import tensorflow as tf

tensor = tf.constant([[4, 2], [5, 3]])
# Dataset contains one element: the whole tensor [[4, 2], [5, 3]]
ds1 = tf.data.Dataset.from_tensors(tensor)
# Dataset contains multiple elements, one per row: [4, 2] and [5, 3]
ds2 = tf.data.Dataset.from_tensor_slices(tensor)
```

A dataset can be created:

- From one or more files
- In memory
- By a data transformation that constructs a dataset from one or more `tf.data.Dataset` objects

Here is an example of reading sharded files on multiple threads. As a bonus, `TFRecordDataset` data is not passed through Python, which presumably should make it fast:

```
tf.data.TFRecordDataset(files, num_parallel_reads=40)
```

Use the `my_dataset.shuffle(100)` function to randomize the order. The number `100` is the buffer size. The items are randomized only within a buffer of this size, so a buffer smaller than the dataset gives only a partial shuffle.
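The effect of the buffer size is easiest to see with a tiny buffer. The sketch below shuffles the numbers 0 to 9 with a buffer of 3; the seed and sizes are arbitrary choices for illustration. Every element still appears exactly once, but an element can only be emitted while it sits in the buffer, so the output at position `i` can never be larger than `i + 2`.

```python
import tensorflow as tf

buffer_size = 3
ds = tf.data.Dataset.range(10).shuffle(buffer_size=buffer_size, seed=42)
values = [int(v) for v in ds.as_numpy_iterator()]

# All elements are present, only their order changes
print(values)
# With a small buffer, late elements cannot jump to the front
print(all(v <= i + buffer_size - 1 for i, v in enumerate(values)))
```

For a full shuffle, the buffer size needs to be at least as large as the dataset.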

## Dataset optimization methods

TF Dataset optimization | Parallel read | Async read and processing |
---|---|---|
Prefetch | No | Yes |
Sequential interleave | No | No |
Parallel interleave | Yes | Yes |

Read more about Tensorflow data performance.
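A rough sketch of how parallel interleave and prefetch are combined in one pipeline. Real pipelines would interleave file readers; here each "shard" is just a number repeated twice, which is an assumption made to keep the example self-contained.

```python
import tensorflow as tf

shards = tf.data.Dataset.range(3)  # stand-in for a list of sharded files

ds = shards.interleave(
    # Stand-in for opening one shard; a real pipeline would return a file reader here
    lambda shard: tf.data.Dataset.from_tensors(shard).repeat(2),
    cycle_length=3,                       # number of shards read concurrently
    num_parallel_calls=tf.data.AUTOTUNE,  # this makes the interleave parallel
).prefetch(tf.data.AUTOTUNE)              # prepare next elements in the background

values = [int(v) for v in ds.as_numpy_iterator()]
print(values)
```

Leaving out `num_parallel_calls` gives the sequential interleave from the table.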

## Data transformations in Tensorflow

The input data must already be prepared before Tensorflow model training and prediction. It is not advisable to do transformations like data aggregation or database queries in the model code.

Another challenge is that the same transformations should be applied both at training and prediction time.

Some pre-processing tasks require two steps: analysis and transformation. For Min-Max scaling, the min and max values must first be **analyzed from the full dataset**. In the second step, each **single record is transformed** to a value between 0 and 1 using the analyzed min and max.
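The two steps can be sketched in a few lines. The training values and the helper name `min_max_scale` are invented for illustration; the point is that the statistics come from the training data once and are then reused for every record, also at prediction time.

```python
import tensorflow as tf

train_values = tf.constant([10.0, 20.0, 40.0])

# Step 1, analysis: a full pass over the training data
train_min = tf.reduce_min(train_values)
train_max = tf.reduce_max(train_values)


# Step 2, transformation: applied to single records with the stored statistics
def min_max_scale(x):
    return (x - train_min) / (train_max - train_min)


scaled_record = min_max_scale(tf.constant(25.0))  # (25 - 10) / (40 - 10) = 0.5
print(float(scaled_record))
```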

There are a couple of solutions. If the pre-processing requires the analysis step, Google Cloud recommends a Dataflow pipeline. It can save the pre-processed data in `TFRecords` format.

For transformations only, the `tf.Transform` API is enough. It is also able to scan the data beyond a single record. The upside is the speed of Tensorflow; the downside is possibly limited functionality. The pre-processing function is the most important concept of `tf.Transform`.

## Feature preparation

Tensorflow shines in providing a robust platform for building ML models from complex datasets.

Neural networks in general are able to ingest data in raw format, such as time series or image pixels. Many traditional methods might require much more feature preparation, such as aggregations. This makes predictions in particular more straightforward, as extensive data transformation pipelines are not needed.

Use the `tf.feature_column` API. It converts input tensors into a form suitable for neural network input. The `normalizer_fn` argument can be useful for normalization during data pre-processing.

Tensorflow does not need to convert categorical values or sparse tensors into dense format, which saves memory. `tf.feature_column.embedding_column` converts a sparse categorical column to a lower-dimensional dense vector.

Feature columns are ingested in dictionary format.

## Distributed Tensorflow training

The fact that Tensorflow can take in raw data and train models with parallel computation makes it a serious competitor to Spark applications. Use `tf.distribute.Strategy` to distribute the training.

Tensorflow distribution strategy | When to use | Synchronous |
---|---|---|
Mirrored strategy | Single machine with multiple GPU devices | Yes |
Multi-worker mirrored strategy | Same as mirrored but with multiple workers | Yes |
TPU strategy | Training on TPUs; reads from Cloud Storage, similar to Spark | Yes |
Parameter server strategy | Cluster with workers and parameter servers | No (asynchronous) |

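The strategy is defined at code level in this kind of workflow:

```
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    # Define and compile the model inside the scope so its variables are mirrored
    # (this tiny model is only a placeholder)
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
    model.compile(optimizer="sgd", loss="mse")
# Fit the model as usual; train_dataset stands for your own tf.data.Dataset
# model.fit(train_dataset, epochs=10)
```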

The `TF_CONFIG` environment variable is used on the virtual machines participating in a distributed job. Configure it for custom distribution strategies. For example, each neural network layer can be run in parallel (a model-parallel approach).
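`TF_CONFIG` holds a JSON string that describes the cluster and the role of the current machine. Below is a minimal sketch for the first of two workers; the host names and port are hypothetical placeholders, and in practice the variable is often set by the platform that launches the job.

```python
import json
import os

tf_config = {
    # Every machine in the job gets the same cluster description (hypothetical hosts)
    "cluster": {
        "worker": ["worker0.example.com:12345", "worker1.example.com:12345"],
    },
    # The task entry differs per machine: this one is worker number 0
    "task": {"type": "worker", "index": 0},
}

# Must be set before the distribution strategy is created
os.environ["TF_CONFIG"] = json.dumps(tf_config)
print(os.environ["TF_CONFIG"])
```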

## Gradients

`tf.GradientTape` is required for gradient calculation in eager execution.

The gradient tape might also be helpful when computing integrated gradients for feature importances.
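A minimal sketch of the tape in action: operations executed inside the `with` block are recorded, and the gradient is computed from that recording afterwards. The function `y = x² + 2x` is just an example chosen because its derivative is easy to verify by hand.

```python
import tensorflow as tf

x = tf.Variable(3.0)

with tf.GradientTape() as tape:
    # Operations on the variable are recorded while the tape is active
    y = x * x + 2.0 * x

# dy/dx = 2x + 2, which is 8 at x = 3
grad = tape.gradient(y, x)
print(float(grad))
```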

## Speed optimizations in Tensorflow

**Pre-fetching**

In a multi-step iteration, start preparing the next batch of data while the previous one is still being processed.

**Multithreading**

The common programming paradigm is also available in Tensorflow.

**Post-training quantization**

Models can be optimized after training. Post-training quantization is recommended to decrease serving latency. Model size is also reduced. Quantization means converting to lower precision, e.g. from 32-bit floats to 8-bit integers.

My understanding is that this requires basically no code changes.
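To get a feeling for what converting floats to 8-bit integers means, here is a simplified pure-Python sketch of symmetric quantization. It is an illustration of the arithmetic only, not the scheme Tensorflow uses internally, and the weight values and function names are made up.

```python
def quantize(values, num_bits=8):
    # Map the largest absolute value to the largest representable integer
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    scale = max(abs(v) for v in values) / qmax
    quantized = [round(v / scale) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    # Recover approximate float values from the integers
    return [q * scale for q in quantized]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
quantized, scale = quantize(weights)
restored = dequantize(quantized, scale)

# The rounding error per value is bounded by the size of one quantization step
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, max_error)
```

Each weight now fits in one byte instead of four, at the cost of a small rounding error.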

**Reduced precision**

Lower-precision floating point numbers can shorten the time it takes for training to converge while keeping roughly the same accuracy.

## Tensorflow Lite

A lighter version of Tensorflow for running on devices like phones.

It sacrifices some computational precision for edge portability.

For example, Android developers have an inference library for making predictions in mobile apps.

## Image processing in Tensorflow

The `tf.image` API provides functions to resize images, pad them for convolution and draw bounding boxes. You can also adjust brightness and contrast, and convert an image to grayscale.

This toolkit is useful for both data pre-processing and augmentation.
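A short sketch of a few of these functions chained together. The input is a random tensor standing in for a real 64x48 RGB image, and the target size and brightness delta are arbitrary choices.

```python
import tensorflow as tf

# Random values in [0, 1) as a stand-in for a decoded RGB image (height, width, channels)
image = tf.random.uniform([64, 48, 3])

resized = tf.image.resize(image, [32, 32])               # pre-processing: fixed input size
gray = tf.image.rgb_to_grayscale(resized)                # 3 channels -> 1 channel
brighter = tf.image.adjust_brightness(gray, delta=0.1)   # augmentation: brightness shift
flipped = tf.image.flip_left_right(brighter)             # augmentation: mirror the image

print(resized.shape, gray.shape, flipped.shape)
```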

## TensorFlow Enterprise

Tensorflow Enterprise is a commercial version of the open-source core product. It is targeted at large customers in Google Cloud and builds on the free version with additional capabilities.

In Google Cloud, Tensorflow Enterprise is integrated into Deep Learning VM images, Deep Learning Containers, Notebooks and Vertex AI Training.

Help with engineering problems is also available.

## Logging and debugging in Tensorflow

TensorBoard Debugger V2 is a convenient way to log and debug execution information.

Set the Tensorflow logging level with the `TF_CPP_MIN_LOG_LEVEL` environment variable.
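A minimal sketch of setting the variable from Python. It needs to be set before Tensorflow is imported, because the C++ logging reads it at startup. The chosen level `2` is only an example.

```python
import os

# 0 = show all messages, 1 = hide INFO, 2 = hide INFO and WARNING,
# 3 = hide INFO, WARNING and ERROR
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

import tensorflow as tf  # imported after the variable is set

print(tf.__version__)
```

The same can be done in the shell before starting the Python process.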

## Tensorflow libraries

Multiple libraries extend Tensorflow. They are all under the official Tensorflow GitHub account.

Tensorflow library | Description | Library name in PIP |
---|---|---|
Tensorboard | Visualize ML experiments: training metrics, execution graph, hardware etc. Not for inference. | tensorboard |
Tensorflow Profiler | Tracks the performance of models. Understand CPU and GPU resource consumption in Tensorflow operations. | tensorboard-plugin-profile (requires Tensorboard) |
Tensorflow Probability | Combine probabilistic models with deep learning and powerful hardware. | tensorflow-probability |
Tensorflow Ranking | Develop learning to rank (LTR) models. | tensorflow-ranking |
Tensorflow Datasets | Find ready-to-use datasets. | tensorflow-datasets |
Tensorflow Recommenders | Build recommender systems. | tensorflow-recommenders |
Tensorflow I/O | File systems and formats not available in core Tensorflow, e.g. Parquet. | tensorflow-io |

## Tensorflow Extended

Tensorflow Extended, aka TFX, is a framework on top of Tensorflow for data pre-processing, model operationalization and deployment.