TensorFlow 2.0 — Getting Started
As you might already know, TensorFlow 2.0 is now in beta. Like any major upgrade, 2.0 changes many things compared to 1.x.
I’ve had the opportunity to play around with TensorFlow 2.0 since the alpha days. Now that the APIs are being finalised, this series is an attempt to share my experience so far.
( This is part 1 of the series. More parts are on the way. :-) )
Let’s start with the installation. If you haven’t had a chance to install TensorFlow 2.0 yet, here is how you can do it.
pip install tensorflow==2.0.0-beta1
No surprises there. (The above command installs the CPU version. For instructions on the GPU version, refer here.)
After the installation is complete, let’s put together our first program.
We’ll attempt to build a basic classification model using MNIST dataset. Later, in the subsequent parts of this series, we’ll build more advanced models.
The import statements are straightforward, as you would expect.
The first line is for compatibility; the second imports TensorFlow. Nothing fancy here.
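As a sketch, the two imports look like this (the `__future__` line is the compatibility import, only needed if you also want the same code to run on Python 2):

```python
# Compatibility imports so the code behaves the same on Python 2 and 3
from __future__ import absolute_import, division, print_function, unicode_literals

# TensorFlow itself
import tensorflow as tf
```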
Now, we need to load the data.
There are multiple ways to load the data in TensorFlow. We can use either tf.keras.datasets or tensorflow_datasets. Though I prefer to use tensorflow_datasets, for now, let’s use tf.keras.datasets. (We’ll go through tensorflow_datasets in another article.)
The first two statements load the MNIST dataset. The load_data() function returns image/label pairs for both the training and test sets. Each image in this dataset is 28x28 pixels. The training set has 60,000 images to train the model on, and the test set has 10,000 images to evaluate the model against.
Once we have the data loaded, let’s also do feature scaling by dividing the pixel values by 255, which brings them from the integer range [0, 255] into [0, 1].
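Putting the loading and scaling steps together, a minimal sketch looks like this:

```python
import tensorflow as tf

# load_data() returns (train_images, train_labels), (test_images, test_labels)
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Pixel values are integers in [0, 255]; scale them down to [0.0, 1.0]
x_train, x_test = x_train / 255.0, x_test / 255.0
```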
Build the model
It’s time to build the model now. For this example, we’ll use a very basic architecture.
Here is the code for that:
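A minimal sketch of the architecture, with the layer choices explained below (128 hidden units, a 0.2 dropout rate, and 10 output classes):

```python
import tensorflow as tf

model = tf.keras.models.Sequential([
    # Flatten each 28x28 image into a vector of 784 values
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    # Fully connected hidden layer with 128 units and ReLU activation
    tf.keras.layers.Dense(128, activation='relu'),
    # Randomly drop 20% of the activations during training (regularization)
    tf.keras.layers.Dropout(0.2),
    # Output layer: one unit per class, softmax to get class probabilities
    tf.keras.layers.Dense(10, activation='softmax'),
])
```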
Let me explain what we are doing here.
At a basic level, we are going to stack all the layers sequentially.
First, we’ll add a layer to flatten out the inputs. The tf.keras.layers.Flatten layer takes (28, 28) as its input shape argument, matching the pixel dimensions of the input images. This layer, as the name suggests, flattens each input image into a one-dimensional vector so it can be fed to the subsequent layers.
The next step is to add a Dense layer using tf.keras.layers.Dense. Two arguments need to be provided: the number of units (nodes) and the activation function. For now, let’s specify 128 units.
For activation function, there are multiple options available such as Linear, Sigmoid, Tanh, ReLU (Rectified Linear Unit), SELU (Scaled Exponential Linear Unit) etc. For now, let’s pick ReLU as the activation function. (To know more about other available options, refer here.)
As a regularization measure, let’s also add a tf.keras.layers.Dropout layer with a rate of 0.2, which randomly drops 20% of the activations during training.
Finally, let’s specify an output layer of the Dense type with 10 nodes (the same as the number of classes we are trying to predict) and softmax as the activation function.
Compile the model
Since our model structure is ready now, let’s proceed to compile the model.
In this step, we need to specify values for optimizer, loss function and metrics.
There are multiple options for the optimizer such as Adam, RMSProp, SGD etc. Let’s go with Adam (Adaptive moment estimation). (To know more about other available options, refer here.)
Similarly, for the loss function, we could choose from options such as Binary Cross Entropy, Mean Squared Error, Categorical Cross Entropy, Sparse Categorical Cross Entropy etc. Since our labels are plain integer class indices rather than one-hot vectors, Sparse Categorical Cross Entropy is the right fit here. (To know more about other available options, refer here.)
In order to monitor the model’s performance, we need to choose what metrics to watch out for. Here as well, there are multiple options: accuracy, precision, recall etc. Let’s use accuracy for now. (To know more about other available options, refer here)
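With those three choices made, the compile step can be sketched as follows (the model is rebuilt here so the snippet stands on its own):

```python
import tensorflow as tf

# Model from the previous section
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax'),
])

# Adam optimizer, sparse categorical cross-entropy loss, accuracy metric
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```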
Train the model
The next step is to train the model. That’s straightforward:
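Here’s a self-contained sketch, assuming the data and compiled model from the earlier snippets. (The epoch count of 5 is my assumption; the article doesn’t state it explicitly.)

```python
import tensorflow as tf

# Load and scale the data as before
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Rebuild and compile the model from the previous sections
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train for 5 epochs (one epoch = one full pass over the training set)
history = model.fit(x_train, y_train, epochs=5)
```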
The inputs for this step are the image data, the labels, and the number of epochs (complete passes over the training data).
Now, let’s evaluate the model by passing in the test data set.
All steps together
To summarize, here are all the steps in one place:
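A complete sketch of the program, including the evaluation step (again assuming 5 epochs, which the article doesn’t state explicitly):

```python
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf

# Load and scale the MNIST data
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Build the model
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax'),
])

# Compile with optimizer, loss function and metrics
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train
model.fit(x_train, y_train, epochs=5)

# Evaluate against the held-out test set
test_loss, test_acc = model.evaluate(x_test, y_test)
print('Test accuracy:', test_acc)
```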
Let’s go ahead and execute the program now.
As the program executes, you can see that we reach a training accuracy of 97.76%.
When we evaluate against the test set, we reach an accuracy of 97.66%, which is not bad for such a simple neural network architecture. :-)
That’s all for now.
In this article, we’ve seen the building blocks of a basic TensorFlow program. In the upcoming parts of this series, we will go over more advanced use cases.
(For reference, you can also check my GitHub repo here to see the full code from this article.)
Please feel free to provide your valuable feedback. Till we meet again, happy coding!