Cross validation is a way to guard against overfitting while training a model, and we have also applied the grid search method in both Python and R: https://charleshsliao.wordpress.com/2017/05/20/logistic-regression-in-python-to-tune-parameter-c/ https://charleshsliao.wordpress.com/2017/04/24/cnndnn-of-keras-in-r-backend-tensorflow-for-mnist/ We will focus on how to use both methods to identify the best parameters, model and score without overfitting. We use data from the sklearn library, and the IDE… Continue reading Quick Cross Validation and Grid Search of Parameters in Python
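As a rough illustration of combining the two methods, here is a minimal sketch using scikit-learn's GridSearchCV to tune the C parameter of a logistic regression with 5-fold cross validation. The breast cancer dataset, the grid of C values and the scaling step are assumptions made for this example; the excerpt does not say which sklearn dataset or grid the post uses.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: a built-in sklearn dataset stands in for the post's data.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scaling lives inside the pipeline so each CV fold is scaled on its own training part.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Hypothetical grid of regularisation strengths to try.
param_grid = {"logisticregression__C": [0.01, 0.1, 1, 10, 100]}

# 5-fold cross validation over every value in the grid.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_C = search.best_params_["logisticregression__C"]
cv_score = search.best_score_               # mean CV accuracy of the best C
test_score = search.score(X_test, y_test)   # accuracy on data never seen during tuning

print("best C:", best_C)
print("cross-validated accuracy:", round(cv_score, 3))
print("held-out test accuracy:", round(test_score, 3))
```

Keeping a separate test split means the final score is not the same number that was used to pick C, which is the point of tuning without overfitting.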
This is quite like the article that used C5.0 for classification: https://charleshsliao.wordpress.com/2017/03/04/a-quick-classification-example-with-c5-0-in-r/ Here we try more mature and powerful algorithms, with cross validation and parameter tuning. 1. First we preprocess the data. 2. We can start with a basic logistic regression model. The ROC chart below shows the Area Under the Curve (AUC) value as a metric… Continue reading Credit Analysis with ROC evaluation in Neural Network and Random Forest
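To make the ROC and AUC idea concrete, here is a small Python sketch (the post itself works in R). It fits a basic logistic regression and computes the ROC curve and its AUC. The synthetic data from make_classification is an assumption standing in for the credit data, so the numbers it prints say nothing about the post's results.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Assumption: synthetic binary data in place of the credit dataset.
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# ROC analysis needs scores, not hard labels: use the positive-class probability.
scores = model.predict_proba(X_test)[:, 1]

# One (false positive rate, true positive rate) point per decision threshold.
fpr, tpr, thresholds = roc_curve(y_test, scores)

# Area under that curve: 0.5 is chance level, 1.0 is a perfect ranking.
auc = roc_auc_score(y_test, scores)
print("AUC:", round(auc, 3))
```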
Ensemble methods such as bagging, boosting and random forests help improve the performance of different models. We use credit data from: https://charleshsliao.wordpress.com/2017/03/04/a-quick-classification-example-with-c5-0-in-r/ The Caret package is powerful here: it enables us to combine cross validation and training in one process, with the flexibility to tune the model specifically. Without boosting, the original accuracy of the C5.0 model above is 0.68.
This article is still about SVM and its parameters, especially the one called the kernel. We can use different kernels to project or map data into a higher-dimensional space. This is typically useful for non-linear problems in real life. The linear kernel does not transform the data at all. The polynomial kernel of degree… Continue reading Kernels, SVM and a Letter Recognition Example
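The effect of the kernel choice can be sketched in Python with scikit-learn's SVC (the post itself works on letter recognition data, which is not used here). The concentric-circles toy data and the kernel settings below are assumptions chosen only because no straight line can separate the two classes.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Assumption: toy data with one class forming a ring around the other.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.1, random_state=0)

kernel_scores = {}
for kernel in ("linear", "poly", "rbf"):
    # degree only affects the polynomial kernel; the other kernels ignore it.
    clf = SVC(kernel=kernel, degree=2, gamma="scale")
    # Mean accuracy over 5 cross-validation folds.
    kernel_scores[kernel] = cross_val_score(clf, X, y, cv=5).mean()

for kernel, score in kernel_scores.items():
    print(kernel, round(score, 3))
```

On data like this the linear kernel has no way to draw a useful boundary, while a kernel that implicitly maps the points into a higher-dimensional space can separate the ring from the centre.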
Background: Handwriting recognition is a well-studied subject in computer vision and has found wide application in our daily life (such as USPS mail sorting). In this project, we will explore various machine learning techniques for recognizing handwritten digits. The dataset you will be using is the well-known MNIST dataset. (1) The MNIST database of handwritten… Continue reading SVM(e1071 of R) Tuning with MNIST