Using Deeplearning4J, you can create convolutional neural networks, also referred to as CNNs or ConvNets, in just a few lines of code. If you don’t know what a CNN is, for now, just think of it as a feed-forward neural network that is optimized for tasks such as image classification and natural language processing.
In this short tutorial, I’m going to show you how to create a simple CNN and train it using the CIFAR-10 dataset, a very popular dataset containing 60,000 labeled 32x32 color images divided into 10 classes.
To follow along, you’ll need:

- the latest version of IntelliJ IDEA
- JDK 7 or higher
Fire up IntelliJ IDEA and create a new Maven project using the quickstart archetype. Once the project has been generated, open the pom.xml file and add the following inside the dependencies element:
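The exact artifacts can vary with your setup; a typical dependency block for DL4J 0.7.2 combines deeplearning4j-core with the nd4j-native-platform backend, which runs the math on the CPU:

```xml
<!-- DL4J core library -->
<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-core</artifactId>
    <version>0.7.2</version>
</dependency>
<!-- ND4J CPU backend -->
<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-native-platform</artifactId>
    <version>0.7.2</version>
</dependency>
```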
As you can see, we’ll be using DL4J 0.7.2 in this tutorial.
At this point, the project setup is complete.
You don’t have to manually download the CIFAR-10 dataset for this tutorial. Instead you can use a class called
CifarDataSetIterator, which automatically downloads the dataset using the DataVec library. So, inside the
main() method of the App.java file, add the following code:
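A minimal sketch of that code, assuming a batch size of 100 (the constructor overloads may differ slightly between DL4J versions):

```java
import org.deeplearning4j.datasets.iterator.impl.CifarDataSetIterator;

// Batch size of 100, using only the first 5,000 training images;
// the last argument selects the training set
CifarDataSetIterator iterator = new CifarDataSetIterator(100, 5000, true);
```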
Note that, for now, we are using only 5000 images from the dataset. Feel free to change that number.
If you want to list all the labels present in the dataset, you can use the following code:
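One way to do it, assuming the iterator created earlier, is to loop over the list returned by its getLabels() method:

```java
// Print every label the iterator knows about, one per line
for (String label : iterator.getLabels()) {
    System.out.println(label);
}
```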
At this point, if you compile and run your project, you should see the following output:
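The exact order depends on the iterator, but the ten CIFAR-10 class names themselves are:

```
airplane
automobile
bird
cat
deer
dog
frog
horse
ship
truck
```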
It’s now time to start creating the individual layers of our neural network. We’re going to have the following layers:
We’re going to order the layers such that each convolution layer is immediately followed by a subsampling layer. The output layer will, of course, be the last layer and it will have 10 neurons, to represent the 10 labels of our dataset.
Accordingly, add the following code to your file:
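Here is one possible set of layer definitions, following the conv-then-subsampling ordering described above. The kernel sizes, strides, paddings, and channel counts here are illustrative choices, not the only valid ones:

```java
import org.deeplearning4j.nn.conf.layers.ConvolutionLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.conf.layers.SubsamplingLayer;
import org.nd4j.linalg.lossfunctions.LossFunctions;

// First convolution layer: 3 input channels (RGB), 16 feature maps
ConvolutionLayer layer0 = new ConvolutionLayer.Builder(3, 3)
        .nIn(3)
        .nOut(16)
        .stride(1, 1)
        .padding(1, 1)
        .activation("relu")
        .build();

// First subsampling layer: 2x2 max pooling halves the spatial dimensions
SubsamplingLayer layer1 = new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
        .kernelSize(2, 2)
        .stride(2, 2)
        .build();

// Second convolution layer: nIn() can be omitted here, DL4J infers it
ConvolutionLayer layer2 = new ConvolutionLayer.Builder(3, 3)
        .nOut(32)
        .stride(1, 1)
        .padding(1, 1)
        .activation("relu")
        .build();

// Second subsampling layer
SubsamplingLayer layer3 = new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
        .kernelSize(2, 2)
        .stride(2, 2)
        .build();

// Output layer: one neuron per CIFAR-10 class
OutputLayer layer4 = new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
        .activation("softmax")
        .nOut(10)
        .build();
```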
Note that you are free to choose other values for the kernel sizes, strides, and padding. However, I suggest you use
RELU as the activation function for all the convolution layers, and
SOFTMAX for the output layer. As for the subsampling layers,
MAX is the most commonly used pooling type.
Also, make sure you always pass
3 to the first convolution layer’s
nIn() method. This is important because all our input images have 3 color channels (red, green, and blue).
We must now create a
MultiLayerConfiguration object specifying the configuration details of our neural network. Using it, we can also arrange our layers in the correct order.
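A sketch of that configuration, assuming the five layers (layer0 through layer4) defined earlier; the seed, learning rate, l2, and momentum values are illustrative starting points:

```java
import org.deeplearning4j.nn.api.OptimizationAlgorithm;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.Updater;
import org.deeplearning4j.nn.conf.inputs.InputType;

MultiLayerConfiguration configuration = new NeuralNetConfiguration.Builder()
        .seed(12345)                // fixed seed for reproducible runs
        .iterations(1)
        .learningRate(0.001)
        .regularization(true)
        .l2(0.0005)
        .updater(Updater.NESTEROVS)
        .momentum(0.9)
        .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
        .list()
        .layer(0, layer0)           // convolution
        .layer(1, layer1)           // subsampling
        .layer(2, layer2)           // convolution
        .layer(3, layer3)           // subsampling
        .layer(4, layer4)           // output
        .setInputType(InputType.convolutional(32, 32, 3))
        .backprop(true)
        .pretrain(false)
        .build();
```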
Note that you are again free to experiment with different values for the learning rate, l2, momentum, and optimization algorithms. Another important thing to note in the above code is the call to the
setInputType() method, which specifies that our neural network’s input type is convolutional, with 32x32 images and 3 channels.
Finally, you can create the neural network by passing the
configuration object to the constructor of the
MultiLayerNetwork class. Once created, the network must be initialized by calling its init() method.
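Putting those two steps together, assuming the configuration object built in the previous section:

```java
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;

MultiLayerNetwork network = new MultiLayerNetwork(configuration);
network.init();  // allocates and initializes the network's parameters
```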
To start training the convolutional neural network you just created, call its
fit() method and pass the iterator object to it.
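Assuming the network and the CifarDataSetIterator created earlier:

```java
// One pass over the 5,000 training images
network.fit(iterator);
```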
Once the training is complete, you can evaluate your network by calling its
evaluate() method. I suggest you pass a new
CifarDataSetIterator object to it, this time using the test data only. The method returns an
Evaluation object. By calling its
stats() method, you can get a detailed report of your network’s performance:
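A sketch of the evaluation step, assuming the trained network from earlier; the boolean flag in the iterator's constructor is what selects the test set here, and the batch and example counts are illustrative:

```java
import org.deeplearning4j.datasets.iterator.impl.CifarDataSetIterator;
import org.deeplearning4j.eval.Evaluation;

// Iterate over 1,000 test images ('false' selects the test set)
CifarDataSetIterator testIterator = new CifarDataSetIterator(100, 1000, false);

Evaluation evaluation = network.evaluate(testIterator);
System.out.println(evaluation.stats());  // accuracy, precision, recall, F1
```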
Go ahead and run the project now. You’ll, of course, have to wait for several minutes for the training to complete.
After tinkering with the network’s parameters for about an hour or so, I’ve managed to achieve an accuracy of about 60%. If you put in more effort, and are patient enough for longer training durations, I’m sure you can achieve much higher accuracies. The best convnets out there have achieved almost 95% accuracy.