Deep Learning on (Mobile) Clients
Advantages and techniques for creating and training neural networks on mobile devices
by Johan Vos

Neural networks are increasingly being used in a myriad of applications. In the typical use case, all raw input data is sent to a server or a set of cloud instances, where the data is used to train and evaluate a model. That model is then used to make predictions based on specific input, and each prediction is handed over to the interested party.

In this article, we will explore a complementary way of training the model by leveraging the power of client devices.

The Concept

Neural networks are based on the idea of a model that accepts some input (for example, pictures, numbers, characters, and so on) and produces some output that is relevant to the input (for example, "that was a picture of a dog," "the number was 3," "the next character is most likely the letter b," and so on), as shown in Figure 1.

Figure 1. Neural networks accept input and produce output that is relevant to the input.

The quality of a neural network partially depends on the quality of the data that is fed to it. When a specific input leads to a wrong output (Figure 2), the network can be retrained with the given data added to the set of training data.

Figure 2. If the output is incorrect (for example, "rabbit"), the neural network can be retrained (for example, to provide "bird").

Different algorithms exist for training neural networks, and lots of research is currently being performed in this area. In this article, we will not focus on the training algorithms, but rather on what is required to perform training.

Typically, the neural network is trained in a server environment. The resulting model (the numerical constants that allow the algorithm to produce the "best" result based on the input) is used to make predictions and evaluations.

In many cases, though, most of the data is not available in the server environment. Pictures are taken on mobile phones, autonomous cars are using onboard cameras to generate media that has to be evaluated, and a keyboard is used to type data.

While the evaluation of the model can be done relatively easily on clients, training of the model is typically done on the server side. In order to improve the model, the training data is sent to the server. As a consequence, the picture you take with your phone, the scenery you drive by with your car, or the words you type on your smartphone have to be sent over the internet to the servers that are updating the neural network.

Clearly, this can lead to a number of privacy issues. It also requires a lot of bandwidth, because large amounts of data have to be sent from clients to the server.

Distributed learning addresses both problems. In this case, the local data is used to train the model locally. As a consequence of this training, the model on the client improves, leading to different coefficients (internal numbers that make up the model). The gradient of the model can now be sent to the server, where it can be combined with the existing model and with gradients sent by other clients.

As a consequence, the quality of the model improves, and the enhanced model can be sent to the client (Figure 3).
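
The server-side step of this cycle can be sketched in plain Java. This is a minimal illustration, not the mechanism of any particular library: it assumes each client sends an update vector of the same length as the coefficient vector, and that the server combines updates by simple averaging. The article does not specify how the updates are combined, and the class and method names here are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Minimal sketch: a server averages the updates sent by clients (assumed strategy). */
final class UpdateAveraging {

    /** Element-wise mean of the client updates; all updates must have the same length. */
    static double[] averageUpdates(List<double[]> clientUpdates) {
        if (clientUpdates.isEmpty()) {
            throw new IllegalArgumentException("no client updates");
        }
        int n = clientUpdates.get(0).length;
        double[] mean = new double[n];
        for (double[] update : clientUpdates) {
            if (update.length != n) {
                throw new IllegalArgumentException("update length mismatch");
            }
            for (int i = 0; i < n; i++) {
                mean[i] += update[i] / clientUpdates.size();
            }
        }
        return mean;
    }

    /** Applies the averaged update to a copy of the server's coefficients. */
    static double[] applyUpdate(double[] serverWeights, double[] meanUpdate) {
        if (serverWeights.length != meanUpdate.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        double[] result = serverWeights.clone();
        for (int i = 0; i < result.length; i++) {
            result[i] += meanUpdate[i];
        }
        return result;
    }

    public static void main(String[] args) {
        double[] server = {0.5, -0.25};
        // Hypothetical updates from two clients after local training.
        List<double[]> updates = List.of(new double[] {0.5, 0.0}, new double[] {0.0, 0.5});
        double[] improved = applyUpdate(server, averageUpdates(updates));
        System.out.println(Arrays.toString(improved));
    }
}
```

Only the update vectors cross the network in this sketch; the samples that produced them never leave the client.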

Figure 3. The enhanced model can be sent back to the client.

There are a number of benefits to this approach:

  • Privacy: Sensitive data stays on the client device (or is even immediately removed) and is not sent to the server. The server still gets the relevant information, because the data is used to retrain the model locally, and the result of that training is sent to the server.
  • Bandwidth: Rather than sending lots of data, only the changes in the model need to be sent. Those changes can be sent after each new sample or, for example, once per week.
  • Performance: By leveraging the power of mobile devices (and their CPU/GPU chips), at least some processing can be done locally. The power of single devices pales in comparison with the power of a server or a cluster, but when many devices are doing local training, that might significantly reduce the required calculations on the server for achieving the same result.

What Do We Need?

Many real-world scenarios where neural networks are useful involve mobile or embedded devices (for example, pictures on a phone, cameras in a car, and characters on a keyboard). Hence, it is important that the distributed learning techniques work on those devices.

A popular and open source library providing Java APIs for neural network operations is Eclipse Deeplearning4j (DL4J), which is supported by SkyMind.

In order to use the DL4J APIs on a client, that client should be capable of executing Java code. The DL4J APIs depend on native code for performance reasons; hence, Java Native Interface (JNI) needs to be supported.

The DL4J APIs work on mobile devices using the Gluon IDE plugins at The plugins provide an easy way to create cross-platform user interfaces based on JavaFX and to leverage existing libraries, including DL4J. In order to run on Android devices, the Gluon IDE plugins perform the required steps for creating an APK that can be uploaded to the Google Play Store. No Android-specific code is required, and the JavaFX code that is used to create a user interface on a desktop also works on Android devices.

Similarly, this code also works on iOS devices. The Gluon IDE plugins will invoke the Gluon VM tools, which will compile the Java code ahead of time to native code, link it with other native libraries, and create an iOS app that can be executed on an iPhone or an iPad, tested in the iOS Simulator, and uploaded to the Apple App Store.

As a consequence, the Java code that you use to create applications that leverage the DL4J neural network APIs is combined with a user interface created with JavaFX that runs on a desktop, an Android device, an iOS device, and also embedded devices. The important thing for developers is that your code is 100% cross-platform, because it is all written in Java. The hard part of translating that to the specific, native systems is done under the hood for you.

HelloWorld (Multilayer Perceptron Classifier)

As an example, let's explore a simple linear classifier that is trained locally. The code for this sample is available at

You can open the sample code in any IDE (NetBeans, IntelliJ, Eclipse), provided that you have the Gluon IDE plugin installed. Follow the instructions at in order to do this.

Once you open the sample in your IDE, you can quickly run it, and you will see the screen shown in Figure 4:

Figure 4. First screen displayed by the sample code.

Clicking the train network model button will start the local training, and it will result in the screen shown in Figure 5:

Figure 5. Results of training the model.

You can also run this sample on an Android (Figure 6) or iOS device (Figure 7), if you have one connected to your system via USB. Depending on your IDE, you will see a task called androidInstall or launchIosDevice. This task compiles the required dependencies, creates the mobile app, and sends it to your device. To create apps for upload to the App Store or the Play Store, select the createIPA or createAPK task, respectively.

More information on this process can be found in the Gluon documentation at

Figure 6. Screenshot of the app on an Android device.
Figure 7. Screenshot of the app on an iOS device.

The source code contains only two Java files. The file contains the code for creating the view, and the real work is done by the code in

The file contains code for training a neural network and for showing the output in a JavaFX UI.

The relevant JavaFX code is shown in the following snippet:

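    private final Label label;

    public TrainingView(String name) {
        // (Superclass setup is not shown in this excerpt.)

        label = new Label();

        Button button = new Button("train network model");
        button.setOnAction(e -> {
            Task task = train();
            // Keep the button disabled for as long as the training task is running.
            button.disableProperty().bind(task.runningProperty());
        });

        VBox controls = new VBox(15.0, label, button);
        // (Adding the controls to the view is not shown in this excerpt.)
    }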


The user interface is defined by a VBox that contains a label and a button. The content of the label will be set by the training function, and it will indicate the current status of the training/evaluation.

Clicking the button triggers the training task. The training is performed in a JavaFX task in a dedicated thread. The JavaFX binding API is used to make sure the button is disabled as long as the training is in progress. Once the training task is done, the button is enabled again.

This sample is inspired by the screencast of DL4J at If you want to learn more about the concepts used in this code, check out that screencast and the related screencasts.

The basic idea of this sample is to take a pair of numbers between 0 and 1 as input and return an output that is either 0 or 1. The model is trained by using some well-known input/output pairs. The data that is used to train and test the model can be visualized as follows: the x and y axes show the input of a sample, the color red is applied for samples with an output of 0, and the color blue is applied for samples with an output of 1 (Figure 8).

Figure 8. Red indicates input samples that return an output of 0; blue indicates input samples that return an output of 1.

Now, we will create a simple neural network that will be trained using some of the samples and then evaluated using another set of those samples.

When you click the train network model button, a number of things happen sequentially.

First, training data is read from a supplied file. Note that in real-world cases, the training data is typically obtained by interactions with the user (for example, the user enters text or takes a picture).

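    RecordReader rrTrain = new CSVRecordReader();
    // The call that initializes rrTrain with the training file is not shown in this excerpt.
    // Arguments: reader, batch size, index of the label column (0), number of possible labels (2).
    DataSetIterator iterTrain = new RecordReaderDataSetIterator(rrTrain, batchSize, 0, 2);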

In order to evaluate the model, we need evaluation data. This is achieved using the following code:

    RecordReader rrEval = new CSVRecordReader();
    // As with rrTrain, the call that initializes rrEval with its file is not shown in this excerpt.
    DataSetIterator iterEval = new RecordReaderDataSetIterator(rrEval, batchSize, 0, 2);
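
The last two arguments of RecordReaderDataSetIterator (0 and 2) say that the label is in column 0 and that there are two possible labels. The following plain-Java sketch illustrates what that means for a single row. It is not DL4J code: it assumes rows of the form label,x,y, and the class name, method names, and sample row are hypothetical.

```java
import java.util.Arrays;

/** Plain-Java sketch (not DL4J code) of splitting a CSV row into features and a one-hot label. */
final class CsvRowSketch {

    /** Returns the feature values: every column except the label column. */
    static double[] features(String csvRow, int labelIndex) {
        String[] cols = csvRow.split(",");
        double[] out = new double[cols.length - 1];
        int k = 0;
        for (int i = 0; i < cols.length; i++) {
            if (i != labelIndex) {
                out[k++] = Double.parseDouble(cols[i].trim());
            }
        }
        return out;
    }

    /** Returns the label as a one-hot vector with numPossibleLabels entries. */
    static double[] oneHotLabel(String csvRow, int labelIndex, int numPossibleLabels) {
        String[] cols = csvRow.split(",");
        int label = Integer.parseInt(cols[labelIndex].trim());
        if (label < 0 || label >= numPossibleLabels) {
            throw new IllegalArgumentException("label out of range: " + label);
        }
        double[] out = new double[numPossibleLabels];
        out[label] = 1.0;
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample: class 1 (blue) at x = 0.25, y = 0.75.
        String row = "1,0.25,0.75";
        System.out.println(Arrays.toString(features(row, 0)));
        System.out.println(Arrays.toString(oneHotLabel(row, 0, 2)));
    }
}
```

The one-hot form lines up with a network that has one output per class, so each output can be compared directly with the corresponding label entry.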

The following code snippet creates a neural network with two layers (one hidden layer). The input of the first layer contains two numbers that are the x and y value of the dots in Figure 8. The other layer contains two numbers as output, which contain the probabilities of the outcome either being 0 (red) or 1 (blue).

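    int numInputs = 2;
    int numHiddenNodes = 20;
    int numOutputs = 2;

    // Excerpt: updater, activation, and weight-initialization arguments are elided here.
    MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
            .updater(new Nesterovs.Builder().build())
            .list()
            // Hidden layer: two inputs (x and y) feed numHiddenNodes nodes.
            .layer(0, new DenseLayer.Builder()
                    .nIn(numInputs).nOut(numHiddenNodes).build())
            // Output layer: one output per class (0 = red, 1 = blue).
            .layer(1, new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                    .nIn(numHiddenNodes).nOut(numOutputs).build())
            .build();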
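
To make the shape of this network concrete, here is a plain-Java sketch of a forward pass through a 2-20-2 network, together with its parameter count. It is an illustration only, not what DL4J does internally. The ReLU activation on the hidden layer and the softmax on the output layer are assumptions, because the listing above does not show the activations, and the weights are placeholders rather than trained values.

```java
import java.util.Arrays;

/** Plain-Java sketch of a 2-20-2 forward pass; activations are assumed, weights are placeholders. */
final class ForwardPassSketch {

    /** Dense layer: out[j] = bias[j] + sum over i of in[i] * weights[i][j]. */
    static double[] dense(double[] in, double[][] weights, double[] bias) {
        double[] out = bias.clone();
        for (int i = 0; i < in.length; i++) {
            for (int j = 0; j < out.length; j++) {
                out[j] += in[i] * weights[i][j];
            }
        }
        return out;
    }

    /** ReLU activation (assumed for the hidden layer). */
    static double[] relu(double[] v) {
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = Math.max(0.0, v[i]);
        }
        return out;
    }

    /** Softmax (assumed for the output layer): turns raw outputs into probabilities. */
    static double[] softmax(double[] v) {
        double max = Double.NEGATIVE_INFINITY;
        for (double x : v) {
            max = Math.max(max, x);
        }
        double sum = 0.0;
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = Math.exp(v[i] - max); // subtract max for numerical stability
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= sum;
        }
        return out;
    }

    /** Input (x, y) -> hidden layer -> two class probabilities. */
    static double[] predict(double[] xy, double[][] w1, double[] b1, double[][] w2, double[] b2) {
        return softmax(dense(relu(dense(xy, w1, b1)), w2, b2));
    }

    /** Weights plus biases of both layers. */
    static int parameterCount(int numInputs, int numHiddenNodes, int numOutputs) {
        return (numInputs * numHiddenNodes + numHiddenNodes)
                + (numHiddenNodes * numOutputs + numOutputs);
    }

    public static void main(String[] args) {
        double[][] w1 = new double[2][20];
        double[][] w2 = new double[20][2];
        for (double[] row : w1) {
            Arrays.fill(row, 0.1);  // placeholder value
        }
        for (double[] row : w2) {
            row[0] = 0.05;          // placeholder values
            row[1] = -0.05;
        }
        double[] p = predict(new double[] {0.25, 0.75}, w1, new double[20], w2, new double[2]);
        System.out.println("P(0), P(1) = " + Arrays.toString(p));
        System.out.println("parameters = " + parameterCount(2, 20, 2));
    }
}
```

For numInputs = 2, numHiddenNodes = 20, and numOutputs = 2, the count is (2 × 20 + 20) + (20 × 2 + 2) = 102 coefficients, which gives a sense of how small a model update can be compared with the raw data.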

Now that the model is defined, we can train it:

    MultiLayerNetwork network = new MultiLayerNetwork(conf);
    network.init();
    // Report progress in the label after every iteration.
    network.setListeners((IterationListener) (model, iteration, epoch) -> {
        Platform.runLater(() -> label.setText("Running iteration #" + iteration));
    });
    Platform.runLater(() -> label.setText("training model..."));
    for (int n = 0; n < numEpochs; n++) {
        network.fit(iterTrain);
    }

We will use a number of iterations for training the model, and whenever an iteration is done, this will be indicated in the JavaFX label that indicates the current state.

Because we run the training in a dedicated JavaFX task that runs in its own thread, we have to make sure we update the JavaFX Scene Graph by calling Platform.runLater().

Now that the model is trained, we can evaluate it to see how well it performs. This is achieved using the following snippet, which uses the evaluation data created above:

    Platform.runLater(() -> label.setText("evaluating model..."));
    Evaluation evaluation = new Evaluation(numOutputs);
    while (iterEval.hasNext()) {
        DataSet dataSet = iterEval.next();
        INDArray features = dataSet.getFeatureMatrix();
        INDArray labels = dataSet.getLabels();
        // Run the network in inference mode (train = false) on this batch.
        INDArray predicted = network.output(features, false);
        // Compare the predictions with the expected labels.
        evaluation.eval(labels, predicted);
    }
    Platform.runLater(() -> label.setText("model evaluation result:\n" + evaluation.stats()));

At the end of the evaluation, we set the content of the JavaFX label to the results of the evaluation, as shown in Figure 5.
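
The kind of tallying that happens during evaluation can be sketched in plain Java. This is a simplified stand-in and not the DL4J Evaluation class: it assumes one-hot labels and per-class probabilities, takes the most probable class as the prediction, and reports only a confusion matrix and the accuracy. The class and method names are hypothetical.

```java
/** Plain-Java sketch of evaluation tallying; a simplified stand-in, not the DL4J Evaluation class. */
final class EvaluationSketch {

    private final int[][] confusion; // confusion[actual][predicted]
    private int total;

    EvaluationSketch(int numClasses) {
        confusion = new int[numClasses][numClasses];
    }

    /** Index of the largest entry; the first one wins on ties. */
    static int argMax(double[] v) {
        int best = 0;
        for (int i = 1; i < v.length; i++) {
            if (v[i] > v[best]) {
                best = i;
            }
        }
        return best;
    }

    /** Records one sample: the expected one-hot label and the predicted probabilities. */
    void eval(double[] oneHotLabel, double[] predictedProbabilities) {
        confusion[argMax(oneHotLabel)][argMax(predictedProbabilities)]++;
        total++;
    }

    /** Number of samples with the given actual class that were predicted as the given class. */
    int count(int actual, int predicted) {
        return confusion[actual][predicted];
    }

    /** Fraction of samples whose predicted class matches the actual class. */
    double accuracy() {
        int correct = 0;
        for (int i = 0; i < confusion.length; i++) {
            correct += confusion[i][i];
        }
        return total == 0 ? 0.0 : (double) correct / total;
    }

    public static void main(String[] args) {
        EvaluationSketch evaluation = new EvaluationSketch(2);
        // Hypothetical samples: expected label, then predicted probabilities.
        evaluation.eval(new double[] {1, 0}, new double[] {0.9, 0.1});
        evaluation.eval(new double[] {0, 1}, new double[] {0.2, 0.8});
        evaluation.eval(new double[] {0, 1}, new double[] {0.7, 0.3});
        System.out.println("accuracy = " + evaluation.accuracy());
    }
}
```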

Next Steps

This sample shows how you can create and train neural networks on mobile devices. This is only the basics of what you can do. In a follow-on article, I will show how you can share the model with the server, where you send gradient updates only and the server periodically sends enhanced versions of the model.

About the Author
Johan Vos started working with Java in 1995. He was part of the Blackdown team, porting Java to Linux. His main focus is on end-to-end Java, combining back-end systems and mobile/embedded devices. He received a Duke's Choice Award in 2014 for his work on JavaFX on mobile devices.

In 2015, he cofounded Gluon, which allows enterprises to create mobile Java client applications leveraging their existing back-end infrastructure. Gluon received a Duke's Choice Award in 2015. Vos is a Java Champion, a member of the BeJUG steering group and the Devoxx steering group, and he is a JCP member. He is the lead author of Pro JavaFX 8 (Apress, 2014), and he has been a speaker at numerous conferences on Java.