I have more experience with the Pandas and Scikit-Learn Python libraries than with Tensorflow. I was surprised how large the Tensorflow ecosystem is with its ML engineering extensions.
What is Tensorflow?
Tensorflow is best known as a framework for building artificial neural networks, but you can do any kind of numerical computation with it.
The framework has all the required tools to set up ML training and serving pipelines. Many design patterns enable efficient parallel processing. This makes Tensorflow well suited to demanding industrial-scale projects.
Tensorflow and Google
Tensorflow is an open source project backed by Google.
The tech company has a logical business reason to grow the Tensorflow user base: you might end up using more expensive GPUs, or even Tensor Processing Units (TPUs), which Google developed for large matrix operations.
Tensorflow and Nvidia
Official Tensorflow GPU support targets CUDA-enabled Nvidia GPUs.
TensorRT is a high-performance deep learning inference SDK from Nvidia.
Tensorflow execution logic
An execution in Tensorflow is defined by a DAG (directed acyclic graph). The nodes are mathematical operations, and the edges connecting them carry data from one operation to the next. In essence, it is a process chart.
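To make the process chart idea concrete, here is a minimal sketch in plain Python. It is not how Tensorflow implements graphs internally. The node names and the `run` helper are made up for illustration, and `graphlib` requires Python 3.9 or newer.

```python
from graphlib import TopologicalSorter
import operator

# Hypothetical toy graph: node name -> (operation, names of input nodes)
graph = {
    "a": (lambda: 2.0, []),
    "b": (lambda: 3.0, []),
    "sum": (operator.add, ["a", "b"]),
    "product": (operator.mul, ["sum", "b"]),
}


def run(graph):
    """Evaluate every node after all of its inputs have been evaluated."""
    dependencies = {name: inputs for name, (_, inputs) in graph.items()}
    values = {}
    for name in TopologicalSorter(dependencies).static_order():
        operation, inputs = graph[name]
        values[name] = operation(*(values[i] for i in inputs))
    return values


values = run(graph)
print(values["product"])  # (2.0 + 3.0) * 3.0
```

Because the graph is acyclic, a topological order always exists, so every operation can run once its inputs are ready.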
Tensorflow has an eager mode that helps you debug operations one by one during development. In production it is better to use graphs.
The model's computation is stored in a tf.Graph. The tf.function decorator traces a Python function and adds its operations to a graph. A graph consists of tf.Operation objects, which are the units of computation, and tf.Tensor objects, which are the units of data flowing between the operations.
Tracing lets you record TensorFlow Python operations in the graph.
Layers of Tensorflow
Tensorflow layer | Description |
---|---|
Hardware | CPU, GPU, TPU, Android |
C++ API | Core Tensorflow |
Python API | Core Tensorflow |
Components | tf.losses, tf.metrics, tf.optimizers etc |
High-level API | tf.estimator, tf.keras, tf.data etc |
Read more about Keras.
What is a tensor
Tensor rank | Structure | Example |
---|---|---|
0 | Value / Scalar | 4 |
1 | Vector (list) | [4, 5] |
2 | Matrix (table) | [[4,5], [5,6]] |
3 | 3-dimensional array (cube) | [ [[4,5], [5,6]], [[6,7], [7,8]] ] |
Tensor rank is equal to the number of dimensions. A good rule of thumb: the count of opening brackets at the beginning is the number of dimensions.
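The bracket rule can be checked with a small sketch in plain Python. It uses nested lists instead of real tensors, and the helper names `tensor_rank` and `leading_brackets` are made up for this example.

```python
def tensor_rank(value):
    """Return the nesting depth of a regular nested list (0 for a scalar)."""
    rank = 0
    while isinstance(value, list):
        rank += 1
        if not value:  # an empty list has no first element to descend into
            break
        value = value[0]
    return rank


def leading_brackets(value):
    """Count the opening brackets at the start of the printed value."""
    text = str(value)
    return len(text) - len(text.lstrip("["))


examples = [4, [4, 5], [[4, 5], [5, 6]], [[[4, 5], [5, 6]], [[6, 7], [7, 8]]]]
ranks = [tensor_rank(e) for e in examples]
brackets = [leading_brackets(e) for e in examples]
print(ranks, brackets)
```

Both helpers agree on the four examples from the table above.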
A tensor can be created like this:
import tensorflow as tf
# Immutable: the values cannot be modified
tf.constant([4, 5])
# Mutable: the values can be modified later, e.g. with assign()
tf.Variable([4, 5])
Use tf.where to select elements based on a condition. It works similarly to numpy.where in NumPy. tf.stack can, for example, combine vectors/columns into a matrix/table.
Read more in the introduction to tensors.
Read data in Tensorflow
Datasets can be created from tensors with the tf.data.Dataset API (among other ways):
import tensorflow as tf
tensor = tf.constant([[4, 2], [5, 3]])
# Dataset contains one element: the whole tensor
ds1 = tf.data.Dataset.from_tensors(tensor)  # yields [[4, 2], [5, 3]]
# Dataset contains one element per row of the tensor
ds2 = tf.data.Dataset.from_tensor_slices(tensor)  # yields [4, 2], then [5, 3]
A dataset can be created:
- From one or more files
- In memory
- By a data transformation that constructs a dataset from one or more tf.data.Dataset objects
Here is an example of reading sharded files on multiple threads. As an extra benefit, TFRecordDataset data is not passed through Python, which presumably should make it fast:
tf.data.TFRecordDataset(files, num_parallel_reads=40)
Use the my_dataset.shuffle(100) function to randomize the order. The number 100 is the buffer size. The items are randomized only within this buffer.
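The effect of the buffer size can be illustrated with a plain Python simulation. This is a sketch of the buffering idea only, not Tensorflow's actual implementation, and the `buffered_shuffle` function is a made-up helper.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in random order, picking only from a bounded buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)  # fill the buffer first
            continue
        index = rng.randrange(buffer_size)
        yield buffer[index]      # emit a random buffered item
        buffer[index] = item     # and replace it with the new one
    rng.shuffle(buffer)          # drain what is left at the end
    yield from buffer


shuffled = list(buffered_shuffle(range(1000), buffer_size=10, seed=0))
print(shuffled[:10])
```

With a buffer of 10, the first emitted item can only come from the first 10 inputs, so a small buffer gives a weak shuffle. A buffer at least as large as the dataset gives a full shuffle.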
Dataset optimization methods
TF Dataset optimization | Parallel read | Async read and processing |
---|---|---|
Prefetch | No | Yes |
Sequential interleave | No | No |
Parallel interleave | Yes | Yes |
Read more about Tensorflow data performance.
Data transformations in Tensorflow
The input data should already be prepared before Tensorflow model training and prediction. It is not advisable to do transformations such as data aggregation or database queries in the model code.
Another challenge is that the same transformations should be applied both at training and prediction time.
Some pre-processing tasks require two steps: analysis and transformation. For Min-Max scaling, the min and max values must first be analyzed from the full dataset. In the second step, each single record is scaled to a value between 0 and 1 using the analyzed min and max values.
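The two steps can be sketched in plain Python. The function names `analyze` and `transform` and the sample values are assumptions made for this illustration.

```python
def analyze(values):
    """Step 1: a full pass over the dataset to find min and max."""
    return min(values), max(values)


def transform(value, minimum, maximum):
    """Step 2: scale a single record using the analyzed statistics."""
    if maximum == minimum:
        return 0.0  # avoid division by zero for a constant column
    return (value - minimum) / (maximum - minimum)


training_values = [10.0, 20.0, 15.0]
minimum, maximum = analyze(training_values)

# The same analyzed values are reused at training and prediction time
scaled_training = [transform(v, minimum, maximum) for v in training_values]
scaled_new_record = transform(12.5, minimum, maximum)
print(scaled_training, scaled_new_record)
```

Storing the analyzed min and max next to the model is one way to guarantee the same transformation at prediction time.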
There are a couple of solutions. If the pre-processing requires the analysis step, Google Cloud recommends a Dataflow pipeline. It can save the pre-processed data in TFRecords format.
For transformations only, the tf.Transform API is enough. It is able to scan the data beyond a single record. The upside is the speed of Tensorflow, and the downside is possibly limited functionality. The pre-processing function (preprocessing_fn) is the most important concept of tf.Transform.
Feature preparation
Tensorflow shines in providing a robust platform to build ML models from complex datasets.
Neural networks are generally able to ingest data in raw format, such as time series or image pixels. Many traditional methods might require much more feature preparation, such as aggregations. This makes predictions in particular more straightforward, as extensive data transformation pipelines are not needed.
Use the tf.feature_column API. It converts input tensors into a form suitable for neural network input. The normalizer_fn argument can be useful for normalization during data pre-processing.
Tensorflow does not need to convert categorical values or sparse tensors into dense format, which saves memory. tf.feature_column.embedding_column converts a sparse categorical column to a lower-dimensional dense vector.
Feature columns are ingested in dictionary format.
Distributed Tensorflow training
The fact that Tensorflow can take in raw data and train models by parallel computation makes it a serious competitor for Spark applications. Use tf.distribute.Strategy.
Tensorflow distribution strategy | When to use | Synchronous |
---|---|---|
Mirrored strategy | Single machine with multiple GPU devices | Yes |
Multi-worker mirrored strategy | Same as mirrored but with multiple workers | Yes |
TPU strategy | Similar to Spark: Read from Cloud Storage | Yes |
Parameter server strategy | Cluster with workers and parameter servers | No (asynchronous) |
The strategy is defined at the code level in this kind of workflow:
import tensorflow as tf
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    # Define the model
    # Compile the model
    pass  # placeholder for the two steps above
# Fit the model
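What "synchronous" means in the table can be sketched in plain Python: every replica computes a gradient on its own slice of the batch, the gradients are averaged, and all replicas apply the same update. This is a simplified illustration, not Tensorflow code. The one-weight linear model, the function names and the numbers are assumptions, and the batch is assumed to split evenly across replicas.

```python
def gradient(w, batch):
    """Gradient of the mean squared error of y = w * x with respect to w."""
    return sum(2 * x * (w * x - y) for x, y in batch) / len(batch)


def synchronous_step(w, batch, num_replicas, learning_rate):
    """One training step where all replicas apply the same averaged gradient."""
    shard_size = len(batch) // num_replicas
    shards = [batch[i * shard_size:(i + 1) * shard_size] for i in range(num_replicas)]
    replica_gradients = [gradient(w, shard) for shard in shards]  # in parallel on real hardware
    mean_gradient = sum(replica_gradients) / num_replicas         # the aggregation step
    return w - learning_rate * mean_gradient


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w = 0.0
new_w = synchronous_step(w, batch, num_replicas=2, learning_rate=0.01)
single_device_w = w - 0.01 * gradient(w, batch)
print(new_w, single_device_w)
```

With equally sized shards the averaged gradient matches the single-device gradient, so the replicas stay identical after every step.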
The TF_CONFIG environment variable is used on the virtual machines participating in a distributed job. Configure it for custom distribution strategies. For example, different neural network layers can be run on different devices (a model-parallel approach).
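As a rough sketch, TF_CONFIG is a JSON string with a cluster part that lists the participating machines and a task part that identifies the current machine. The host names and port below are placeholders, not real addresses.

```python
import json
import os

# Hypothetical two-worker cluster; replace the addresses with your own machines
tf_config = {
    "cluster": {
        "worker": ["worker0.example.com:12345", "worker1.example.com:12345"],
    },
    # This machine is the first worker; the second one would use index 1
    "task": {"type": "worker", "index": 0},
}

os.environ["TF_CONFIG"] = json.dumps(tf_config)
print(os.environ["TF_CONFIG"])
```

Each machine gets the same cluster part but its own task part.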
Gradients
tf.GradientTape records operations so that gradients can be calculated in eager execution.
A gradient tape might also be helpful when computing integrated gradients for feature importances.
Speed optimizations in Tensorflow
Pre-fetching
In each iteration of a multi-step flow, start preparing the next batch of data while the previous one is being processed.
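The overlap can be sketched with a background thread and a bounded queue in plain Python. This only illustrates the idea and is not how Tensorflow implements it. The `prefetch` and `load_batches` names are made up for the example.

```python
import queue
import threading


def prefetch(iterable, buffer_size=1):
    """Yield items while a background thread prepares the next ones."""
    items = queue.Queue(maxsize=buffer_size)
    end_marker = object()

    def producer():
        try:
            for item in iterable:
                items.put(item)  # blocks while the buffer is full
        finally:
            items.put(end_marker)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = items.get()
        if item is end_marker:
            return
        yield item


def load_batches(count):
    """Stand-in for a slow data loading step."""
    for i in range(count):
        yield [i, i + 1]


# The consumer processes one batch while the producer loads the next one
results = [sum(batch) for batch in prefetch(load_batches(3))]
print(results)
```

The buffer size controls how far ahead the producer may run.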
Multithreading
The common programming paradigm is also available in Tensorflow.
Post-training quantization
Models can be optimized after training. Post-training quantization is recommended to decrease serving latency. Model size is also reduced. Quantization means converting to lower precision, e.g. from 32-bit floats to 8-bit integers.
My understanding is that this requires basically no code changes.
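The arithmetic behind quantization can be sketched in plain Python. This shows a generic scale and zero point scheme under my own assumptions, not necessarily the exact scheme Tensorflow uses. The function names and weights are made up.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integers with a scale and a zero point."""
    q_min, q_max = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    low, high = min(values), max(values)
    scale = (high - low) / (q_max - q_min) or 1.0  # fall back to 1.0 for constant input
    zero_point = round(q_min - low / scale)
    quantized = [
        max(q_min, min(q_max, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Recover approximate floats from the integers."""
    return [(q - zero_point) * scale for q in quantized]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
quantized, scale, zero_point = quantize(weights)
restored = dequantize(quantized, scale, zero_point)
print(quantized, restored)
```

Each weight now fits in one byte instead of four, at the cost of a rounding error of roughly one quantization step at most.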
Reduced precision
Lower-precision floating point numbers can shorten training time, often while keeping nearly the same accuracy.
Tensorflow Lite
A lighter version of Tensorflow that runs on devices like phones.
It sacrifices some computational precision for edge portability.
For example, Android developers have an inference library for making predictions in mobile apps.
Image processing in Tensorflow
The tf.image API provides functionality to resize images, pad them for convolution, and draw bounding boxes. You can also adjust brightness and contrast and convert the image to grayscale.
This toolkit is useful for both data pre-processing and augmentation.
TensorFlow Enterprise
Tensorflow Enterprise is a commercial version of the open-source core product. The Enterprise framework is targeted at large customers in Google Cloud. It is tied to the free version but has additional capabilities.
In Google Cloud, Tensorflow Enterprise is integrated into Deep Learning VM images, Deep Learning Containers, Notebooks and Vertex AI training.
Help for engineering problems is also available.
Logging and debugging in Tensorflow
TensorBoard Debugger V2 is a convenient way to log and debug execution information.
Set the Tensorflow logging level with the TF_CPP_MIN_LOG_LEVEL environment variable.
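A minimal sketch of setting the variable from Python. To my understanding it has to be set before Tensorflow is imported to take effect. The Tensorflow import is left as a comment so that the snippet runs on its own.

```python
import os

# 0 = show all messages, 1 = hide INFO, 2 = also hide WARNING, 3 = also hide ERROR
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

# import tensorflow as tf  # import Tensorflow only after the variable is set
print(os.environ["TF_CPP_MIN_LOG_LEVEL"])
```

The same variable can also be set in the shell or in the container configuration.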
Tensorflow libraries
Multiple libraries extend Tensorflow. They are all under the official Tensorflow GitHub account.
Tensorflow library | Description | Library name in PIP |
---|---|---|
Tensorboard | Visualize ML experiments: training metrics, execution graph, hardware etc. Not for inference. | tensorboard |
Tensorflow Profiler | Tracks performance of the models. Understand CPU and GPU resource consumption in Tensorflow operations. | tensorboard-plugin-profile (requires Tensorboard) |
Tensorflow Probability | Combine probabilistic models with deep learning and powerful hardware. | tensorflow-probability |
Tensorflow Ranking | Develop learning to rank (LTR) models. | tensorflow-ranking |
Tensorflow Datasets | Find ready to use datasets. | tensorflow-datasets |
Tensorflow Recommenders | Build recommender systems. | tensorflow-recommenders |
Tensorflow I/O | File systems and formats not available in core Tensorflow, e.g. Parquet. | tensorflow-io |
Tensorflow Extended
Tensorflow Extended, aka TFX, is a framework on top of Tensorflow for data pre-processing, model operationalization and deployment.