CNN/DNN with Keras in R (TensorFlow Backend) for MNIST

Keras is a high-level neural-network library that runs on top of TensorFlow, and both are developed in Python. We can use both libraries from R once we install the corresponding packages.

Of course, we first need to install TensorFlow and Keras from the terminal (I am using a Mac); in my setup they work best with Python 2.7.
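For example (a sketch; the exact commands depend on your Python setup and on the package versions available at the time of writing):

# in the terminal (installs the Python libraries):
#   pip install tensorflow keras
# in R (installs the R wrappers):
install.packages("tensorflow")
install.packages("kerasR")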

Most of the code below is adapted from https://cran.r-project.org/web/packages/kerasR/vignettes/introduction.html

However, the related libraries and packages may need to be updated manually, and a few issues are not covered by that tutorial, so the script below differs slightly from it.

I am tracking a figure of merit (FOM): achieve the best result with the least training data. Concretely, the FOM used below adds half the fraction of the 60,000 MNIST training images that were used to the test error rate, so it penalizes both training-set size and misclassification.
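For reference, this is the quantity computed at the end of each experiment below, written here as a helper function for illustration only:

FOM <- function(n, error_rate) {
  (n/60000)/2 + error_rate  # n = number of MNIST training samples used
}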

Sys.setenv(TENSORFLOW_PYTHON="/usr/local/bin/python")
# point to the self-installed Python 2.7, not the macOS system default
library(tensorflow)
library(kerasR)
######################################################
###quick example with Boston Housing###
######################################################
mod<-Sequential()

###1. set up layers and other parameters
mod$add(Dense(units=200,input_shape = 13))
mod$add(Activation("relu"))
mod$add(Dense(units=1))

###2. compile the model
keras_compile(mod,loss="mse",optimizer = RMSprop())

###3. load the Keras example data
boston<-load_boston_housing()
x_train<-scale(boston$X_train)
y_train<-boston$Y_train
x_test<-scale(boston$X_test)
y_test<-boston$Y_test

###4. train the model
keras_fit(mod,x_train,y_train,batch_size = 32,epochs=200,verbose = 1,validation_split = 0.1)

###5. predict
pred<-keras_predict(mod,normalize(x_test))
## It is generally very important to normalize the input matrix
## before fitting a neural network model in Keras. (Note that
## x_test was already scaled above, so normalize() applies a
## second rescaling on top; a cleaner alternative follows below.)
sd(as.numeric(pred)-y_test)/sd(y_test)  # sd of residuals relative to sd of y_test
## [1] 0.7741542
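A more conventional alternative (a sketch, not what produced the number above) is to transform the test set with the training set's centering and scaling, so that train and test pass through the identical preprocessing:

x_test<-scale(boston$X_test,
              center=attr(x_train,"scaled:center"),
              scale=attr(x_train,"scaled:scale"))
pred<-keras_predict(mod,x_test)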
######################################################
###DNN example with MNIST###
######################################################

###1. load data
mnist <- load_mnist()
x_train <- mnist$X_train
y_train <- mnist$Y_train
x_test <- mnist$X_test
y_test <- mnist$Y_test
dim(x_train)
## [1] 60000    28    28
######################################################
####change the size of data used for training here####
######################################################
n<-5000
sample_n<-sample(60000,n)
x_train <- array(x_train, dim = c(dim(x_train)[1], prod(dim(x_train)[-1]))) / 255
x_train<-x_train[sample_n,]
y_train<-y_train[sample_n]
x_test <- array(x_test, dim = c(dim(x_test)[1], prod(dim(x_test)[-1]))) / 255
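A quick sanity check on the reshaped data (the comments describe the expected values, not captured output):

dim(x_train)    # n x 784: each 28x28 image flattened to a length-784 vector
range(x_train)  # 0 to 1: pixel intensities rescaled from 0..255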

###2. set up main parameters
mod<-Sequential()
mod$add(Dense(units=512,input_shape = dim(x_train)[2]))
mod$add(LeakyReLU())
mod$add(Dropout(0.25))

mod$add(Dense(units=512))
mod$add(LeakyReLU())
mod$add(Dropout(0.25))

mod$add(Dense(units=512))
mod$add(LeakyReLU())
mod$add(Dropout(0.25))

mod$add(Dense(units=10))
mod$add(Activation("softmax"))

###3. compile and fit the model
keras_compile(mod,loss="sparse_categorical_crossentropy", optimizer = RMSprop())
keras_fit(mod,x_train,y_train,batch_size = 32,epochs = 30,
          verbose = 1, validation_split = 0.2)

###4. predict
y_test_hat<-keras_predict_classes(mod,x_test)
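keras_predict_classes is simply the arg-max of the predicted class probabilities; the equivalent two-step version (a sketch; keras_predict returns a 10000 x 10 probability matrix here) is:

probs<-keras_predict(mod,x_test)        # one row per test image, one column per digit
y_test_hat<-apply(probs,1,which.max)-1  # arg-max per row, shifted to 0-9 labels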
library(caret)
## Loading required package: lattice
## Loading required package: ggplot2
caret::confusionMatrix(factor(y_test),factor(y_test_hat))$overall  # current caret requires factor inputs
##       Accuracy          Kappa  AccuracyLower  AccuracyUpper   AccuracyNull
##      0.9329000      0.9254201      0.9278190      0.9377271      0.1131000
## AccuracyPValue  McnemarPValue
##      0.0000000            NaN
acc<-caret::confusionMatrix(factor(y_test),factor(y_test_hat))$overall[1]
error_rate_keras_DNN<-1-acc
paste0("error rate of Keras DNN: ",error_rate_keras_DNN)
## [1] "error rate of Keras DNN: 0.0671"
FOM_Keras_DNN<-(n/60000)/2+error_rate_keras_DNN
paste0("FOM of Keras DNN: ",FOM_Keras_DNN)
## [1] "FOM of Keras DNN: 0.108766666666667"
######################################################
###CNN example with MNIST###
######################################################
mnist <- load_mnist()

n<-3800
sample_n<-sample(60000,n)

X_train_sub <- mnist$X_train[sample_n, , ]
X_train <- array(X_train_sub, dim = c(dim(X_train_sub), 1)) / 255  # add the channel axis Conv2D expects
Y_train <- mnist$Y_train
Y_train<-Y_train[sample_n]
X_test <- array(mnist$X_test, dim = c(dim(mnist$X_test), 1)) / 255
Y_test <- mnist$Y_test

mod <- Sequential()

mod$add(Conv2D(filters = 32, kernel_size = c(3, 3),
               input_shape = c(28, 28, 1)))
mod$add(Activation("relu"))

mod$add(Conv2D(filters = 32, kernel_size = c(3, 3)))
# input_shape is only needed on the first layer, so it is dropped here
mod$add(Activation("relu"))

mod$add(MaxPooling2D(pool_size=c(2, 2)))
mod$add(Dropout(0.25))

mod$add(Flatten())

mod$add(Dense(128))
mod$add(Activation("relu"))
mod$add(Dropout(0.25))

mod$add(Dense(10))
mod$add(Activation("softmax"))

keras_compile(mod,loss = 'sparse_categorical_crossentropy', optimizer = RMSprop())
keras_fit(mod, X_train, Y_train, batch_size = 32, epochs = 5, verbose = 1,
          validation_split = 0.1)

Y_test_hat<-keras_predict_classes(mod,X_test)
library(caret)
caret::confusionMatrix(factor(Y_test),factor(Y_test_hat))$overall  # current caret requires factor inputs
##       Accuracy          Kappa  AccuracyLower  AccuracyUpper   AccuracyNull
##      0.9682000      0.9646550      0.9645717      0.9715522      0.1145000
## AccuracyPValue  McnemarPValue
##      0.0000000            NaN
acc_cnn<-caret::confusionMatrix(factor(Y_test),factor(Y_test_hat))$overall[1]
error_rate_keras_CNN<-1-acc_cnn
paste0("error rate of Keras CNN: ",error_rate_keras_CNN)
## [1] "error rate of Keras CNN: 0.0318000000000001"
FOM_Keras_CNN<-(n/60000)/2+error_rate_keras_CNN
paste0("FOM of Keras CNN: ",FOM_Keras_CNN)
## [1] "FOM of Keras CNN: 0.0634666666666667"
