Posit AI Blog: Keras for R
We are excited to announce that the keras package is now available on CRAN. The package provides an R interface to Keras, a high-level neural networks API developed with a focus on enabling fast experimentation. Keras has the following key features:

Allows the same code to run on CPU or on GPU, seamlessly.

User-friendly API which makes it easy to quickly prototype deep learning models.

Built-in support for convolutional networks (for computer vision), recurrent networks (for sequence processing), and any combination of both.

Supports arbitrary network architectures: multi-input or multi-output models, layer sharing, model sharing, etc. This means that Keras is appropriate for building essentially any deep learning model, from a memory network to a neural Turing machine.

Is capable of running on top of multiple backends including TensorFlow, CNTK, or Theano.
If you are already familiar with Keras and want to jump right in, check out https://tensorflow.rstudio.com/keras, which has everything you need to get started, including over 20 complete examples to learn from.
To learn a bit more about Keras and why we’re so excited to announce the Keras interface for R, read on!
Keras and Deep Learning
Interest in deep learning has been accelerating rapidly over the past few years, and several deep learning frameworks have emerged over the same time frame. Of all the available frameworks, Keras has stood out for its productivity, flexibility and user-friendly API. At the same time, TensorFlow has emerged as a next-generation machine learning platform that is both extremely flexible and well-suited to production deployment.
Not surprisingly, Keras and TensorFlow have of late been pulling away from other deep learning frameworks:
Google web search interest around deep learning frameworks over time. If you remember Q4 2015 and Q1-2 2016 as confusing, you weren’t alone. pic.twitter.com/1f1VQVGr8n
— François Chollet (@fchollet) June 3, 2017
The good news about Keras and TensorFlow is that you don’t need to choose between them! The default backend for Keras is TensorFlow, and Keras can be integrated seamlessly with TensorFlow workflows. There is also a pure-TensorFlow implementation of Keras with deeper integration on the roadmap for later this year.
Keras and TensorFlow are the state of the art in deep learning tools, and with the keras package you can now access both with a fluent R interface.
Getting Started
Installation
To begin, install the keras R package from CRAN as follows:
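A minimal sketch of this step, assuming a working R installation with access to a CRAN mirror:

```r
# One-time setup: install the keras R package from CRAN.
install.packages("keras")
```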
The Keras R interface uses the TensorFlow backend engine by default. To install both the core Keras library and the TensorFlow backend, use the install_keras() function:
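In its simplest form this looks roughly as follows, assuming the keras package is already installed and the default installation options are acceptable:

```r
library(keras)

# Installs the core Keras library together with the TensorFlow backend,
# using the default (CPU-based) configuration.
install_keras()
```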
This will provide you with default CPU-based installations of Keras and TensorFlow. If you want a more customized installation, e.g. if you want to take advantage of NVIDIA GPUs, see the documentation for install_keras().
MNIST Example
We can learn the basics of Keras by walking through a simple example: recognizing handwritten digits from the MNIST dataset. MNIST consists of 28 x 28 grayscale images of handwritten digits like these:
The dataset also includes labels for each image, telling us which digit it is. For example, the labels for the above images are 5, 0, 4, and 1.
Preparing the Data
The MNIST dataset is included with Keras and can be accessed using the dataset_mnist() function. Here we load the dataset and then create variables for our test and training data:
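A sketch of this step, assuming the keras package and its backend are installed. The variable name mnist is our own choice; x_train, y_train, x_test and y_test are the names used in the rest of this walkthrough:

```r
library(keras)

# Download (on first use) and load the MNIST dataset.
mnist <- dataset_mnist()

# Split the returned list into training and test images (x) and labels (y).
x_train <- mnist$train$x
y_train <- mnist$train$y
x_test  <- mnist$test$x
y_test  <- mnist$test$y
```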
The x data is a 3-d array (images,width,height) of grayscale values. To prepare the data for training we convert the 3-d arrays into matrices by reshaping width and height into a single dimension (28×28 images are flattened into length 784 vectors). Then, we convert the grayscale values from integers ranging from 0 to 255 into floating point values ranging between 0 and 1:
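To show what this reshaping and rescaling does, here is a base R sketch on a small synthetic array rather than the real dataset. The array x and the three random "images" are stand-ins invented for illustration; with the keras package loaded, array_reshape() is expected to perform the same row-major flattening in a single call:

```r
# Synthetic stand-in for MNIST: 3 "images" of 28 x 28 integer pixels in 0..255.
set.seed(1)
x <- array(sample(0:255, 3 * 28 * 28, replace = TRUE), dim = c(3, 28, 28))

# Flatten each 28 x 28 image into a length-784 row, reading pixels row by row.
# Assumption: array_reshape(x, c(nrow(x), 784)) gives the same layout.
x_flat <- t(apply(x, 1, function(img) as.vector(t(img))))

# Rescale integer grayscale values to floating point values in [0, 1].
x_flat <- x_flat / 255

dim(x_flat)    # 3 rows (images) by 784 columns (pixels)
range(x_flat)  # all values now lie between 0 and 1
```

Applied to the real data, the same two steps would be run once for the training images and once for the test images.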
The y data is an integer vector with values ranging from 0 to 9. To prepare this data for training we one-hot encode the vectors into binary class matrices using the Keras to_categorical() function:
y_train <- to_categorical(y_train, 10)
y_test <- to_categorical(y_test, 10)
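For intuition, the following base R sketch builds the kind of binary class matrix that one-hot encoding produces. It does not call Keras; the label vector y and the variable names are made up for illustration:

```r
# Example labels (digits 0 to 9), like the integer y data described above.
y <- c(5L, 0L, 4L, 1L)
num_classes <- 10

# Build a binary class matrix: row i gets a 1 in column (label + 1),
# because R columns are 1-based while the digits start at 0.
one_hot <- matrix(0, nrow = length(y), ncol = num_classes)
one_hot[cbind(seq_along(y), y + 1L)] <- 1

one_hot
```

Each row contains exactly one 1, so the label 5 becomes a row with a 1 in the sixth column and zeros elsewhere.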
Defining the Model
The core data structure of Keras is a model, a way to organize layers. The simplest type of model is the sequential model, a linear stack of layers.
We begin by creating a sequential model and then adding layers using the pipe (%>%) operator:
model <- keras_model_sequential()
model %>%
  layer_dense(units = 256, activation = "relu", input_shape = c(784)) %>%
  layer_dropout(rate = 0.4) %>%
  layer_dense(units = 128, activation = "relu") %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 10, activation = "softmax")
The input_shape argument to the first layer specifies the shape of the input data (a length 784 numeric vector representing a grayscale image). The final layer outputs a length 10 numeric vector (probabilities for each digit) using a softmax activation function.
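To make the softmax step concrete, here is a small base R sketch (not Keras code) that turns ten arbitrary scores into probabilities. The softmax() helper and the example scores are hypothetical, defined here only for illustration:

```r
# Hypothetical helper: numerically stable softmax over a numeric vector.
softmax <- function(z) {
  e <- exp(z - max(z))  # subtract the max to avoid overflow in exp()
  e / sum(e)
}

# Made-up scores for the 10 digit classes (0 through 9).
scores <- c(1.2, -0.3, 0.5, 2.0, 0.1, 4.5, -1.0, 0.0, 0.7, 1.1)
probs <- softmax(scores)

sum(probs)            # the probabilities sum to 1
which.max(probs) - 1  # most likely digit, converted to a 0-based label
```

The output is always a vector of positive values summing to 1, which is why it can be read as one probability per digit.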
Use the summary() function to print the details of the model, for example by calling summary(model):
Model
________________________________________________________________________________
Layer (type)                        Output Shape                    Param #
================================================================================
dense_1 (Dense)                     (None, 256)                     200960
________________________________________________________________________________
dropout_1 (Dropout)                 (None, 256)                     0
________________________________________________________________________________
dense_2 (Dense)                     (None, 128)                     32896
________________________________________________________________________________
dropout_2 (Dropout)                 (None, 128)                     0
________________________________________________________________________________
dense_3 (Dense)                     (None, 10)                      1290
================================================================================
Total params: 235,146
Trainable params: 235,146
Non-trainable params: 0
________________________________________________________________________________
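The parameter counts in the summary can be checked by hand: a dense layer has one weight per input-output pair plus one bias per output unit, while dropout layers have no parameters. A short base R sketch of that arithmetic, where dense_params() is a hypothetical helper written for this check:

```r
# Hypothetical helper: parameters of a dense layer = weights + biases.
dense_params <- function(n_in, n_out) n_in * n_out + n_out

layer_params <- c(
  dense_1 = dense_params(784, 256),  # 784 inputs  -> 256 units
  dense_2 = dense_params(256, 128),  # 256 inputs  -> 128 units
  dense_3 = dense_params(128, 10)    # 128 inputs  -> 10 units
)

layer_params       # 200960, 32896, 1290
sum(layer_params)  # 235146
```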
Next, compile the model with an appropriate loss function, optimizer, and metrics:
model %>% compile(
  loss = "categorical_crossentropy",
  optimizer = optimizer_rmsprop(),
  metrics = c("accuracy")
)
Training and Evaluation
Use the fit() function to train the model for 30 epochs using batches of 128 images:
history <- model %>% fit(
  x_train, y_train,
  epochs = 30, batch_size = 128,
  validation_split = 0.2
)
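As a rough sanity check on these settings, the base R sketch below works out how many images are used for fitting and how many batches make up one epoch. It assumes the standard MNIST training set of 60,000 images, a figure not stated above, and the variable names are our own:

```r
n_train_total    <- 60000  # assumption: size of the standard MNIST training set
validation_split <- 0.2    # fraction held out for validation, as passed to fit()
batch_size       <- 128

n_validation    <- n_train_total * validation_split  # images held out
n_fit           <- n_train_total - n_validation      # images used for fitting
steps_per_epoch <- ceiling(n_fit / batch_size)       # batches per epoch

c(n_validation = n_validation, n_fit = n_fit, steps_per_epoch = steps_per_epoch)
```

Under that assumption, each of the 30 epochs processes 48,000 images in 375 batches and then reports metrics on the 12,000 held-out images.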
The history object returned by fit() includes loss and accuracy metrics, which we can plot by calling plot(history).
Evaluate the model’s performance on the test data:
model %>% evaluate(x_test, y_test, verbose = 0)
$loss
[1] 0.1149
$acc
[1] 0.9807
Generate predictions on new data:
model %>% predict_classes(x_test)
[1] 7 2 1 0 4 1 4 9 5 9 0 6 9 0 1 5 9 7 3 4 9 6 6 5 4 0 7 4 0 1 3 1 3 4 7 2 7 1 2
[40] 1 1 7 4 2 3 5 1 2 4 4 6 3 5 5 6 0 4 1 9 5 7 8 9 3 7 4 6 4 3 0 7 0 2 9 1 7 3 2
[79] 9 7 7 6 2 7 8 4 7 3 6 1 3 6 9 3 1 4 1 7 6 9
[ reached getOption("max.print") -- omitted 9900 entries ]
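The class predictions above correspond to picking, for each image, the digit with the largest predicted probability. The base R sketch below illustrates that step on made-up probabilities for three images; it does not call Keras, and the probs matrix is invented for the example:

```r
# Made-up class probabilities for 3 images (rows) over digits 0 to 9 (columns).
probs <- rbind(
  c(0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.90, 0.02,  0.01),
  c(0.02, 0.03, 0.85, 0.02, 0.02, 0.01, 0.02, 0.01, 0.01,  0.01),
  c(0.01, 0.92, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.005, 0.005)
)

# Column index of the largest probability in each row, shifted to 0-based digits.
predicted <- apply(probs, 1, which.max) - 1
predicted
```

The subtraction of 1 is needed because R indexes columns from 1 while the digit labels start at 0.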
Keras provides a vocabulary for building deep learning models that is simple, elegant, and intuitive. Building a question answering system, an image classification model, a neural Turing machine, or any other model is just as straightforward.
The Guide to the Sequential Model article describes the basics of Keras sequential models in more depth.
Examples
Over 20 complete examples are available (special thanks to [@dfalbel](https://github.com/dfalbel) for his work on these!). The examples cover image classification, text generation with stacked LSTMs, question-answering with memory networks, transfer learning, variational encoding, and more.
addition_rnn  Implementation of sequence to sequence learning for performing addition of two numbers (as strings). 
babi_memnn  Trains a memory network on the bAbI dataset for reading comprehension. 
babi_rnn  Trains a two-branch recurrent network on the bAbI dataset for reading comprehension. 
cifar10_cnn  Trains a simple deep CNN on the CIFAR10 small images dataset. 
conv_lstm  Demonstrates the use of a convolutional LSTM network. 
deep_dream  Deep Dreams in Keras. 
imdb_bidirectional_lstm  Trains a Bidirectional LSTM on the IMDB sentiment classification task. 
imdb_cnn  Demonstrates the use of Convolution1D for text classification. 
imdb_cnn_lstm  Trains a convolutional stack followed by a recurrent stack network on the IMDB sentiment classification task. 
imdb_fasttext  Trains a FastText model on the IMDB sentiment classification task. 
imdb_lstm  Trains an LSTM on the IMDB sentiment classification task. 
lstm_text_generation  Generates text from Nietzsche’s writings. 
mnist_acgan  Implementation of AC-GAN (Auxiliary Classifier GAN) on the MNIST dataset. 
mnist_antirectifier  Demonstrates how to write custom layers for Keras. 
mnist_cnn  Trains a simple convnet on the MNIST dataset. 
mnist_irnn  Reproduction of the IRNN experiment with pixel-by-pixel sequential MNIST in “A Simple Way to Initialize Recurrent Networks of Rectified Linear Units” by Le et al. 
mnist_mlp  Trains a simple deep multilayer perceptron on the MNIST dataset. 
mnist_hierarchical_rnn  Trains a Hierarchical RNN (HRNN) to classify MNIST digits. 
mnist_transfer_cnn  Transfer learning toy example. 
neural_style_transfer  Neural style transfer (generating an image with the same “content” as a base image, but with the “style” of a different picture). 
reuters_mlp  Trains and evaluates a simple MLP on the Reuters newswire topic classification task. 
stateful_lstm  Demonstrates how to use stateful RNNs to model long sequences efficiently. 
variational_autoencoder  Demonstrates how to build a variational autoencoder. 
variational_autoencoder_deconv  Demonstrates how to build a variational autoencoder with Keras using deconvolution layers. 
Learning More
After you’ve become familiar with the basics, these articles are a good next step:

Guide to the Sequential Model. The sequential model is a linear stack of layers and is the API most users should start with.

Guide to the Functional API. The Keras functional API is the way to go for defining complex models, such as multi-output models, directed acyclic graphs, or models with shared layers.

Training Visualization. There is a wide variety of tools available for visualizing training. These include plotting of training metrics, real-time display of metrics within the RStudio IDE, and integration with the TensorBoard visualization tool included with TensorFlow.

Using Pre-Trained Models. Keras includes a number of deep learning models (Xception, VGG16, VGG19, ResNet50, InceptionV3, and MobileNet) that are made available alongside pre-trained weights. These models can be used for prediction, feature extraction, and fine-tuning.

Frequently Asked Questions. Covers many additional topics including streaming training data, saving models, training on GPUs, and more.
Keras provides a productive, highly flexible framework for developing deep learning models. We can’t wait to see what the R community will do with these tools!