Identify Arguments of H2O Deep Learning Model with Tuned Auto Encoder in R with MNIST

An autoencoder can be trained to learn the deep, hidden features of data. These hidden features may be used on their own, for example to better understand the structure of the data, or in other applications.
Two common applications of autoencoders and unsupervised learning are identifying anomalous data (for example, outlier detection or financial fraud)
and pre-training more complex, often supervised, models such as deep neural networks.
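
For the anomaly-detection use case, H2O provides h2o.anomaly(), which returns the per-row reconstruction MSE of a fitted autoencoder; rows that reconstruct poorly are candidate outliers. The snippet below is only a minimal sketch of that idea: the H2O frame named features (numeric columns only) is assumed to already exist, and the 99th-percentile cutoff is purely illustrative.

library(h2o)
h2o.init()
# fit a small autoencoder on the assumed frame "features"
ae<-h2o.deeplearning(x=colnames(features),training_frame = features,
  activation="Tanh",autoencoder = TRUE,hidden = c(50),epochs = 10)
# per-row reconstruction error; unusually large values flag potential anomalies
recon<-as.data.frame(h2o.anomaly(ae,features))
outliers<-which(recon$Reconstruction.MSE>quantile(recon$Reconstruction.MSE,0.99))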

The quick example below shows how to pick the number of hidden neurons and the number of hidden layers.

################################################################
###              tune autoencoder with MNIST                 ###
################################################################
library(jsonlite)
library(caret)
library(h2o)
library(ggplot2)
library(data.table)
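# read an MNIST IDX image file: a 4-integer header (magic number, image count,
# rows, columns) followed by the raw pixel bytes, one image per matrix row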
load_image_file <- function(filename) {
  ret = list()
  f = file(filename,'rb')
  readBin(f,'integer',n=1,size=4,endian='big')
  ret$n = readBin(f,'integer',n=1,size=4,endian='big')
  nrow = readBin(f,'integer',n=1,size=4,endian='big')
  ncol = readBin(f,'integer',n=1,size=4,endian='big')
  x = readBin(f,'integer',n=ret$n*nrow*ncol,size=1,signed=F)
  ret$x = matrix(x, ncol=nrow*ncol, byrow=T)
  close(f)
  ret
}
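# read an MNIST IDX label file: a 2-integer header (magic number, label count)
# followed by the raw label bytes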
load_label_file <- function(filename) { 
  f = file(filename,'rb')
  readBin(f,'integer',n=1,size=4,endian='big')
  n = readBin(f,'integer',n=1,size=4,endian='big')
  y = readBin(f,'integer',n=n,size=1,signed=F)
  close(f)
  y
}
imagetraining<-as.data.frame(load_image_file("train-images-idx3-ubyte"))
imagetest<-as.data.frame(load_image_file("t10k-images-idx3-ubyte"))
labeltraining<-as.factor(load_label_file("train-labels-idx1-ubyte"))
labeltest<-as.factor(load_label_file("t10k-labels-idx1-ubyte"))
imagetraining[,1]<-labeltraining
imagetest[,1]<-labeltest
Training<-imagetraining
Test<-imagetest 
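# sample 10,000 of the 60,000 training digits to keep the tuning runs manageable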
sample_n<-10000
training<-Training[sample(60000,sample_n),]
test_x<-Test[,-1]

cl<-h2o.init(max_mem_size = "20G",nthreads = 10)
h2odigits<-as.h2o(training, destination_frame = "h2odigits")
h2odigits_t<-as.h2o(Test, destination_frame = "h2odigits_t")
h2odigits_train_x<-h2odigits[,-1]
h2odigits_test_x<-h2odigits_t[,-1]
xnames<-colnames(h2odigits_train_x)

# 5-fold cross-validation indices over the first 5,000 sampled training rows;
# each held-out fold is used as the validation frame below
folds<-createFolds(1:5000,k=5)
###create arguments to tune###
ae_params<-list(
  list(hidden=c(50),input_dro=c(0),hidden_dro=c(0)),
  list(hidden=c(200),input_dro=c(0.2),hidden_dro=c(0)),
  list(hidden=c(400),input_dro=c(0.2),hidden_dro=c(0)),
  list(hidden=c(200),input_dro=c(0.2),hidden_dro=c(0.5)),
  list(hidden=c(400,200),input_dro=c(0.2),hidden_dro=c(0.25,0.25)),
  list(hidden=c(400,200),input_dro=c(0.2),hidden_dro=c(0.5,0.25))
)
ae_digits<-lapply(ae_params,function(x){
  lapply(folds, function(i){
    h2o.deeplearning(x=xnames,training_frame = h2odigits_train_x[-i,],
    validation_frame = h2odigits_train_x[i,],
    activation="TanhWithDropout",autoencoder = T,hidden = x$hidden,
    epochs = 30,sparsity_beta = 0,input_dropout_ratio = x$input_dro,
    hidden_dropout_ratios = x$hidden_dro, l1=0,l2=0)
  })
})

ae_results<-lapply(ae_digits,function(m){
  sapply(m,h2o.mse,valid=T)
})
ae_result<-data.table(Model=rep(paste0("M",1:6),each=5),
                       MSE=unlist(ae_results))
###rank the argument combinations by mean validation MSE###
ae_result[,.(Mean_MSE=mean(MSE)),by=Model][order(Mean_MSE)]

#Model    Mean_MSE
#1:    M3 0.017463833
#2:    M2 0.017972044
#3:    M1 0.025295072
#4:    M4 0.049736584
#5:    M6 0.056566713
#6:    M5 0.064480589
##  ae_params[[4]]: list(hidden=c(200),input_dro=c(0.2),hidden_dro=c(0.5))

###train the supervised FNN with the settings from ae_params[[4]]###
ae_digits_final<-h2o.deeplearning(
  x=2:785,y=1,training_frame = h2odigits,
  activation="TanhWithDropout",hidden = ae_params[[4]]$hidden,
  epochs = 30,sparsity_beta = 0,input_dropout_ratio = ae_params[[4]]$input_dro,
  hidden_dropout_ratios = ae_params[[4]]$hidden_dro, l1=0,l2=0)
### predict on the test set and evaluate accuracy###
ae_predict_results<-h2o.predict(ae_digits_final,h2odigits_t)
predict_results<-as.data.frame(ae_predict_results[,1])
caret::confusionMatrix(predict_results$predict,Test$n)$overall
##Accuracy          Kappa  AccuracyLower  AccuracyUpper 
##0.92210000     0.91340777     0.91667314     0.92727949 
##AccuracyNull AccuracyPValue  McnemarPValue 
##0.11350000     0.00000000            NaN 
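
The final model above only re-uses the tuned hyperparameter values. H2O can also initialize a supervised network with the fitted autoencoder weights through the pretrained_autoencoder argument of h2o.deeplearning. The sketch below assumes the cross-validation models from the tuning loop are still in memory and simply takes the first fold's model for parameter set 4; the supervised network must keep the same hidden layout and activation as the autoencoder.

ae_best<-ae_digits[[4]][[1]]   # one of the tuned autoencoders from the CV loop above
fnn_pretrained<-h2o.deeplearning(
  x=2:785,y=1,training_frame = h2odigits,
  activation="TanhWithDropout",hidden = ae_params[[4]]$hidden,
  input_dropout_ratio = ae_params[[4]]$input_dro,
  hidden_dropout_ratios = ae_params[[4]]$hidden_dro,
  epochs = 30,pretrained_autoencoder = ae_best@model_id)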

The H2O tuning process can take a significant amount of time.

As the accuracy above shows, the arguments tuned with the autoencoder let us fix part of the parameters used in the supervised feed-forward network (FNN). Given enough time and computing resources, the same approach can be extended to choose all of the FNN's training parameters: the training data set, the number of epochs, the number of hidden layers and neurons per layer, the activation function, the regularization method and strength, the learning rate, and so on; a grid-search sketch along these lines follows below.
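
For example, H2O's h2o.grid() can search over several of these parameters at once. The sketch below is only illustrative: the candidate values are arbitrary, adaptive_rate is switched off so that the listed learning rates actually apply, and the test frame serves as the validation frame used to rank the candidates.

hyper_params<-list(
  hidden=list(c(200),c(400,200)),
  input_dropout_ratio=c(0,0.2),
  rate=c(0.005,0.02),
  l1=c(0,1e-5))
search_criteria<-list(strategy="RandomDiscrete",max_models=12,seed=1234)
fnn_grid<-h2o.grid("deeplearning",grid_id="fnn_grid",
  x=2:785,y=1,training_frame = h2odigits,validation_frame = h2odigits_t,
  activation="TanhWithDropout",adaptive_rate=FALSE,epochs=10,
  hyper_params = hyper_params,search_criteria = search_criteria)
# rank the candidate models by validation logloss and keep the best one
fnn_grid_sorted<-h2o.getGrid("fnn_grid",sort_by="logloss",decreasing=FALSE)
fnn_best<-h2o.getModel(fnn_grid_sorted@model_ids[[1]])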
