I have more experience with the Scikit-Learn Python library than with Tensorflow. I was surprised how large the Tensorflow ecosystem is with its ML engineering extensions.
What is Tensorflow?
Tensorflow is best known as a framework for building artificial neural networks, but you can do any kind of numerical computation with it.
The framework has all the tools required to set up ML training and serving pipelines. Many design patterns enable efficient parallel processing, which makes Tensorflow ideal for demanding industrial-scale projects.
Tensorflow and Google
Tensorflow is an open source project backed by Google.
The tech company has a logical business reason to grow the Tensorflow user base: you might end up using more expensive GPUs, or even the Tensor Processing Units that Google developed for large matrix operations.
Tensorflow and Nvidia
For GPUs, Tensorflow only supports Nvidia devices (via the CUDA toolkit). TensorRT is a high-performance deep learning inference SDK from Nvidia.
Tensorflow execution logic
An execution in Tensorflow is defined by a DAG (directed acyclic graph). The nodes are mathematical operations, connected by edges. In essence, it is a process chart.
Tensorflow has an eager mode to help debug operations one by one during development. In production it is better to use graphs: the model information is saved in the graph, so it can be executed outside Python. The tf.function decorator adds operations to the graph. tf.Tensor is the unit of data flowing between the operations. Tracing lets you record TensorFlow Python operations in the graph.
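A minimal sketch of how tf.function traces a Python function into a graph; the function name and values here are just for illustration:

```python
import tensorflow as tf

# tf.function traces the Python function into a graph on the first call;
# later calls with the same input signature reuse the traced graph.
@tf.function
def multiply(a, b):
    return a * b

result = multiply(tf.constant(2), tf.constant(3))
print(result.numpy())  # 6
```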
Layers of Tensorflow
|Layer|Examples|
|---|---|
|Hardware|CPU, GPU, TPU, Android|
|C++ API|Core Tensorflow|
|Python API|Core Tensorflow|
|Components|tf.losses, tf.metrics, tf.optimizers etc|
|High-level API|tf.estimator, tf.keras, tf.data etc|
Read more about Keras.
What is a tensor
|Tensor rank|Data type|Example|
|---|---|---|
|0|Value / Scalar|4|
|1|Vector|[4, 5]|
|2|Matrix|[[4, 5], [6, 7]]|
|3|3-dimensional cube|[[[4], [5]], [[6], [7]]]|

Tensor rank is equal to the number of dimensions. A good memory rule: the count of brackets at the beginning of a tensor equals its number of dimensions.
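The bracket-counting rule can be checked with tf.rank; the values below are only illustrative:

```python
import tensorflow as tf

scalar = tf.constant(4)                  # no brackets  -> rank 0
vector = tf.constant([4, 5])             # one bracket  -> rank 1
matrix = tf.constant([[4, 5], [6, 7]])   # two brackets -> rank 2

print(tf.rank(matrix).numpy())  # 2
```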
A tensor can be created like this:

```python
# Cannot be modified
tf.constant([4, 5])

# Can be modified
tf.Variable([4, 5])
```
tf.where can be used to return only specific elements; it works similarly to its NumPy counterpart. tf.stack can, for example, combine vectors (columns) into a matrix (table).
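A small sketch of both calls, with made-up values:

```python
import tensorflow as tf

t = tf.constant([1, 5, 3, 8])

# With only a condition, tf.where returns the indices of matching elements.
idx = tf.where(t > 3)  # [[1], [3]]

# tf.stack combines two vectors (columns) into a matrix (table).
col_a = tf.constant([1, 2, 3])
col_b = tf.constant([4, 5, 6])
table = tf.stack([col_a, col_b], axis=1)  # [[1, 4], [2, 5], [3, 6]]
```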
Read more from the introduction to tensors.
Read data in Tensorflow
Datasets can be created from tensors with the tf.data.Dataset API (among other ways):
```python
tensor = tf.constant([[4, 2], [5, 3]])

# Dataset contains one tensor
ds1 = tf.data.Dataset.from_tensors(tensor)        # returns [[4,2], [5,3]]

# Dataset contains multiple tensors
ds2 = tf.data.Dataset.from_tensor_slices(tensor)  # returns [4,2], [5,3]
```
A dataset can be created:
- From one or more files
- In memory
- By a data transformation that constructs a dataset from one or more existing datasets
Here is an example of reading sharded files on multiple threads. As a bonus, TFRecordDataset data is not passed through Python, which presumably makes it fast:
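A sketch of the parallel sharded read; the shard files and their contents are invented here just so the example runs end to end:

```python
import tensorflow as tf

# Write two tiny shard files so the sketch is runnable; in practice the
# shards would already exist on disk or in Cloud Storage.
for i in range(2):
    with tf.io.TFRecordWriter(f"shard-{i}.tfrecord") as writer:
        writer.write(b"record")

files = tf.data.Dataset.list_files("shard-*.tfrecord")

# interleave opens several shards at once and reads them on multiple
# threads; the raw records never pass through Python.
dataset = files.interleave(
    tf.data.TFRecordDataset,
    cycle_length=2,
    num_parallel_calls=tf.data.AUTOTUNE,
)
records = list(dataset)  # two raw records
```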
Use the my_dataset.shuffle(100) function to randomize the order. The number 100 is the buffer size: items are randomized only within this buffer.
Dataset optimization methods
|TF Dataset optimization|Purpose|
|---|---|
|interleave|Parallel read|
|prefetch|Async read and processing|
Read more about Tensorflow data performance.
Data transformations in Tensorflow
The input data must already be prepared before Tensorflow model training and prediction. It is not advisable to do transformations like data aggregation or database queries in the model code.
Another challenge is that the same transformations should be applied both at training and prediction time.
Some pre-processing tasks require two steps: analysis and transformation. For Min-Max scaling, the min and max values must first be analyzed from the full dataset. In the second step, each record is scaled between 0 and 1 using the analyzed min and max values.
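The two steps can be sketched like this with plain tensor operations (the values are invented):

```python
import tensorflow as tf

data = tf.constant([2.0, 4.0, 6.0, 10.0])

# Step 1: analysis - scan the full dataset for min and max.
data_min = tf.reduce_min(data)  # 2.0
data_max = tf.reduce_max(data)  # 10.0

# Step 2: transformation - scale each record between 0 and 1.
scaled = (data - data_min) / (data_max - data_min)  # [0.0, 0.25, 0.5, 1.0]
```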
There are a couple of solutions. If the pre-processing requires the analysis step, Google Cloud recommends a Dataflow pipeline, which can also save the pre-processed data. For transformations only, the tf.Transform API is enough; it is able to look at the data beyond a single record. The upside is the speed of Tensorflow, and the downside is possibly limited functionality. The pre-processing function is the most important concept of tf.Transform.
Tensorflow shines in providing a robust platform to build ML models from complex datasets. Neural networks are able to ingest data in raw format, such as time series or image pixels, while many traditional methods require much more feature preparation, such as aggregations. This makes especially prediction more straightforward, as extensive data transformation pipelines are not needed.
The tf.feature_column API converts input tensors into a format suitable for neural network input. The normalizer_fn argument can be useful for normalization during data pre-processing. Tensorflow does not need to convert categorical values or sparse tensors into dense format, which saves memory. tf.feature_column.embedding_column converts a sparse categorical column to a lower-dimensional dense vector.
Feature columns are ingested in dictionary format.
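A sketch of the dictionary ingestion with an invented "color" feature; note that the feature column API is deprecated in recent Tensorflow releases in favour of Keras preprocessing layers:

```python
import tensorflow as tf

# A hypothetical categorical feature with a three-word vocabulary.
color = tf.feature_column.categorical_column_with_vocabulary_list(
    "color", ["red", "green", "blue"])

# Convert the sparse categorical column into a dense 2-dimensional embedding.
color_embedding = tf.feature_column.embedding_column(color, dimension=2)

# Feature columns are fed in dictionary format.
features = {"color": tf.constant([["red"], ["blue"]])}
dense = tf.keras.layers.DenseFeatures([color_embedding])(features)
print(dense.shape)  # (2, 2)
```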
Distributed Tensorflow training
The fact that Tensorflow can take in raw data and train models with parallel computation makes it a serious competitor to Spark applications. Use the appropriate distribution strategy:
|Tensorflow distribution strategy|When to use|Synchronous|
|---|---|---|
|Mirrored strategy|Single machine with multiple GPU devices|Yes|
|Multi-worker mirrored strategy|Same as mirrored but with multiple workers|Yes|
|TPU strategy|Similar to Spark: read from Cloud Storage|Yes|
|Parameter server strategy|Cluster with workers and parameter servers|No (asynchronous)|
The strategy is defined at code level in this kind of workflow:

```python
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    # define the model
    # compile the model
    ...
# fit the model
```
The TF_CONFIG environment variable is used in the virtual machines participating in a distributed job. Configure it for custom distribution strategies; for example, each neural network layer can be run in parallel (a model-parallel approach).
tf.GradientTape is required for gradient calculation in eager execution. Gradient tape can also be helpful when computing integrated gradients for feature importance.
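A minimal sketch of eager gradient calculation with the tape; the function y = x² is just an example:

```python
import tensorflow as tf

x = tf.Variable(3.0)

# Operations are recorded on the tape while executing eagerly.
with tf.GradientTape() as tape:
    y = x * x

grad = tape.gradient(y, x)  # dy/dx = 2x = 6.0
print(grad.numpy())  # 6.0
```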
Speed optimizations in Tensorflow
In an iteration of a multi-step flow, start preparing the next dataset while the previous one is being processed. This common programming paradigm, prefetching, is also available in Tensorflow.
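In the tf.data API this is a one-line addition; the toy pipeline below is only illustrative:

```python
import tensorflow as tf

dataset = tf.data.Dataset.range(5).map(lambda x: x * 2)

# While one element is being processed downstream, the next one is
# prepared in the background.
dataset = dataset.prefetch(tf.data.AUTOTUNE)

print(list(dataset.as_numpy_iterator()))  # [0, 2, 4, 6, 8]
```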
Models can be optimized after training. Post-training quantization is recommended to decrease serving latency; it also reduces model size. Quantization means converting to lower precision, e.g. 32-bit floats to 8-bit integers. My understanding is that this requires basically no code changes.
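A sketch with the TFLite converter, assuming a tiny made-up Keras model; enabling quantization is indeed a single extra line:

```python
import tensorflow as tf

# A tiny hypothetical Keras model to convert.
model = tf.keras.Sequential([tf.keras.Input(shape=(4,)),
                             tf.keras.layers.Dense(1)])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
# The single extra line that enables post-training quantization:
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()  # quantized model as bytes
```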
Lower-precision floating point numbers decrease training convergence time while keeping the same accuracy.
Tensorflow Lite is a lighter version of Tensorflow to run on devices like phones. It sacrifices some computational precision for edge portability. For example, Android developers have an inference library to make predictions in mobile apps.
Image processing in Tensorflow
The tf.image API provides functionality to resize images, pad for convolution, and draw bounding boxes. You can also adjust brightness and contrast, and convert an image to grayscale.
This toolkit is useful for both data pre-processing and augmentation.
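A few of these calls in one sketch, on a made-up random image:

```python
import tensorflow as tf

# A hypothetical 64x64 RGB image with random pixel values.
image = tf.random.uniform((64, 64, 3))

resized = tf.image.resize(image, (32, 32))         # resize
brighter = tf.image.adjust_brightness(image, 0.2)  # adjust brightness
gray = tf.image.rgb_to_grayscale(image)            # grayscale, 1 channel
flipped = tf.image.flip_left_right(image)          # simple augmentation
```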
Tensorflow Enterprise is a commercial version of the open-source core product. The Enterprise framework is targeted at large customers in Google Cloud; it is based on the free version but has additional capabilities.
In Google Cloud, Tensorflow Enterprise is integrated into Deep Learning VM images, Deep Learning containers, Notebooks and Vertex AI training. Help for engineering problems is also available.
Logging and debugging in Tensorflow
TensorBoard Debugger V2 is a convenient way to log and debug execution information.
Set the Tensorflow logging level with the TF_CPP_MIN_LOG_LEVEL environment variable.
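The variable must be set before Tensorflow is imported; the level "2" below is just an example choice:

```python
import os

# 0 = all logs, 1 = filter INFO, 2 = filter INFO and WARNING,
# 3 = filter everything except FATAL.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

import tensorflow as tf  # C++ backend INFO and WARNING logs are now hidden
```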
Multiple libraries extend Tensorflow. They are all under the official Tensorflow GitHub account.
|Tensorflow library|Description|Library name in PIP|
|---|---|---|
|Tensorboard|Visualize ML experiments: training metrics, execution graph, hardware etc. Not for inference.|tensorboard|
|Tensorflow Profiler|Tracks the performance of models. Understand the CPU and GPU resource consumption of Tensorflow operations.|tensorboard-plugin-profile (requires Tensorboard)|
|Tensorflow Probability|Combine probabilistic models with deep learning and powerful hardware.|tensorflow-probability|
|Tensorflow Ranking|Develop learning-to-rank (LTR) models.|tensorflow-ranking|
|Tensorflow Datasets|Find ready-to-use datasets.|tensorflow-datasets|
|Tensorflow Recommenders|Build recommender systems.|tensorflow-recommenders|
|Tensorflow I/O|File systems and formats not available in core Tensorflow, e.g. Parquet.|tensorflow-io|
Tensorflow Extended, aka TFX, is a framework on top of Tensorflow for data pre-processing, model operationalization and deployment.