It would be interesting to build a prediction model to understand the trends behind NBA winning teams. We will use data from http://www.basketball-reference.com/leagues/NBA_2017_games-june.html and follow a standard workflow. More details can be found in Robert Layton's book here: https://www.goodreads.com/book/show/26019855-learning-data-mining-with-python?from_search=true
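A minimal sketch of this kind of workflow, in the spirit of Layton's book: the synthetic game records below stand in for the real schedule data from basketball-reference.com, which would need to be downloaded and parsed first, and the two features are only illustrative.

```python
# Hypothetical sketch: predict whether the home team wins a game.
# The toy data stands in for the real basketball-reference.com schedule.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n_games = 200
# Two illustrative binary features, e.g. "home team won its last game"
# and "visitor team won its last game"
X = rng.integers(0, 2, size=(n_games, 2))
# Synthetic label with a built-in home-court bias, just to have something to fit
y = (X[:, 0] + rng.random(n_games) > 0.7).astype(int)

clf = DecisionTreeClassifier(random_state=14)
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print("Mean accuracy: {:.3f}".format(scores.mean()))
```

With real data, the interesting work is in engineering features like these from the raw schedule before fitting the classifier.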

# Tag: Classification

## ROC and Confusion Matrix for Classifier in Python

We use the data from the sklearn library (the face datasets need to be downloaded separately), and the IDE is Sublime Text 3. Most of the code comes from the book: https://www.goodreads.com/book/show/32439431-introduction-to-machine-learning-with-python?from_search=true
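A minimal sketch of both metrics with scikit-learn, using the built-in breast cancer data instead of the face dataset (which must be downloaded separately):

```python
# Confusion matrix and ROC curve for a binary classifier.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_curve, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=5000).fit(X_train, y_train)
y_pred = clf.predict(X_test)
scores = clf.decision_function(X_test)   # continuous scores needed for ROC

cm = confusion_matrix(y_test, y_pred)    # rows: true class, cols: predicted class
fpr, tpr, thresholds = roc_curve(y_test, scores)
auc = roc_auc_score(y_test, scores)
print(cm)
print("AUC: {:.3f}".format(auc))
```

Plotting `fpr` against `tpr` gives the ROC curve; the AUC summarizes it in one number.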

## How Certain is This Classifier? Uncertainty Estimates in Python

We are not only interested in which class a classifier predicts for a certain test point, but also in how certain it is that this is the right class. There are two different functions that reveal the certainty of the classifier. We use the data from the sklearn library, and the IDE is Sublime Text 3. Most of the code… Continue reading How Certain is This Classifier? Uncertainty Estimates in Python
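In scikit-learn the two functions are `decision_function` and `predict_proba`; a minimal sketch on a synthetic two-class problem (the synthetic data is a stand-in for the post's dataset):

```python
# decision_function vs. predict_proba for uncertainty estimates.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=200, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# decision_function: a signed score; larger magnitude = more confident
margins = clf.decision_function(X_test)
# predict_proba: estimated class probabilities; each row sums to 1
probs = clf.predict_proba(X_test)
print(margins[:3])
print(probs[:3])
```

A positive margin corresponds to predicting the positive class, which is the same as the positive-class probability exceeding 0.5.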

## Logistic Regression in Python to Tune Parameter C

The trade-off parameter of logistic regression that determines the strength of the regularization is called C: higher values of C correspond to less regularization (where we can specify the regularization function). C is actually the inverse of the regularization strength (lambda). We use the data from the sklearn library, and the IDE is Sublime Text 3. Most of the… Continue reading Logistic Regression in Python to Tune Parameter C
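A minimal sketch of this effect: since C is the inverse of the regularization strength, smaller C means stronger regularization and smaller coefficients.

```python
# The effect of C on coefficient size in LogisticRegression.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)   # scaling helps the solver converge

norms = []
for C in (0.01, 1, 100):
    clf = LogisticRegression(C=C, max_iter=10000).fit(X, y)
    norms.append(np.linalg.norm(clf.coef_))
    print("C={:>6}: ||w|| = {:.3f}".format(C, norms[-1]))
```

The coefficient norm grows as C increases, i.e. as the regularization is relaxed.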

## Quick KNN Examples in Python

We walk through two basic KNN models in Python to get more familiar with modeling and machine learning in Python, using Sublime Text 3 as the IDE. The first KNN example in Python uses the iris data from the sklearn library. The second uses the breast cancer data from the sklearn library. We evaluate the… Continue reading Quick KNN Examples in Python
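The two examples can be sketched in a few lines with scikit-learn's `KNeighborsClassifier` (the choice of `n_neighbors=3` here is illustrative):

```python
# Quick KNN on iris and breast cancer.
from sklearn.datasets import load_breast_cancer, load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

accs = {}
for load in (load_iris, load_breast_cancer):
    X, y = load(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
    accs[load.__name__] = knn.score(X_test, y_test)
    print("{}: test accuracy {:.3f}".format(load.__name__, accs[load.__name__]))
```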

## Auto Encoder to Detect Anomalous Cases in Smartphone Actimetry Data

We use a deep auto-encoder model to analyze actimetry data from smartphones. You can find the data here: http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones. Why should we do this? An auto-encoder can be useful for excluding unknown or unusual activities, rather than incorrectly classifying them, by examining whether any of the activities tend to have more or fewer anomalous values. We… Continue reading Auto Encoder to Detect Anomalous Cases in Smartphone Actimetry Data
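A minimal Python sketch of the idea (the post itself uses a deep auto-encoder, not this shallow substitute): an `MLPRegressor` trained to reconstruct its own input acts as a small auto-encoder, and points with unusually high reconstruction error are flagged as anomalous. The synthetic data below stands in for the UCI actimetry dataset.

```python
# Auto-encoder-style anomaly detection via reconstruction error (sketch).
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_normal = rng.normal(0, 1, size=(500, 10))   # "known" activity patterns
X_anomal = rng.normal(6, 1, size=(10, 10))    # unusual activities, far from normal

# Bottleneck of 4 units forces a compressed representation of the input
ae = MLPRegressor(hidden_layer_sizes=(4,), max_iter=2000, random_state=0)
ae.fit(X_normal, X_normal)                    # target = input: learn to reconstruct

def reconstruction_error(model, X):
    return np.mean((model.predict(X) - X) ** 2, axis=1)

err_normal = reconstruction_error(ae, X_normal)
err_anomal = reconstruction_error(ae, X_anomal)
threshold = np.quantile(err_normal, 0.99)     # flag the worst-reconstructed cases
print("anomalies flagged:", int((err_anomal > threshold).sum()), "of", len(X_anomal))
```

The model only learns to reconstruct what it was trained on, so unfamiliar inputs come back with visibly higher error.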

## Regularization in Neural Network, with MNIST and Deepnet of R

Several regularization methods help reduce overfitting in a neural network model. 1. The L1 penalty is also known as the Least Absolute Shrinkage and Selection Operator (lasso). The penalty term uses the sum of the absolute weights, so the degree of penalty is no smaller or larger for small or large weights. People are more familiar… Continue reading Regularization in Neural Network, with MNIST and Deepnet of R
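The post applies these penalties to a neural network in R; as a minimal stand-alone illustration of the L1 penalty itself, scikit-learn's `Lasso` drives many coefficients exactly to zero, while an L2 (ridge) penalty merely shrinks them:

```python
# L1 (lasso) vs. L2 (ridge): lasso produces exact zeros.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# 20 features, only 5 of which actually drive the target
X, y = make_regression(n_samples=100, n_features=20, n_informative=5,
                       noise=1.0, random_state=0)

lasso = Lasso(alpha=5.0).fit(X, y)
ridge = Ridge(alpha=5.0).fit(X, y)

n_zero_lasso = int(np.sum(lasso.coef_ == 0))
n_zero_ridge = int(np.sum(ridge.coef_ == 0))
print("zero coefficients - lasso:", n_zero_lasso, "ridge:", n_zero_ridge)
```

This sparsity-inducing behavior is exactly why L1 can act as a feature (or weight) selector.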

## Tune Multi-layer Perceptron (MLP) in R with MNIST

Googled MLP and so many "My Little Ponies" results popped up. LOL. 🙂 Generally speaking, a deep learning model means a neural network model with more than just one hidden layer. Whether a deep learning model will be successful depends largely on the parameters tuned. The multi-layer perceptron (MLP) provided by the R package "RSNNS"… Continue reading Tune Multi-layer Perceptron (MLP) in R with MNIST
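The post tunes an MLP in R; as a comparable Python sketch, a grid search over the hidden-layer architecture of scikit-learn's `MLPClassifier`, using the small built-in digits data as a stand-in for full MNIST (the candidate layer sizes are illustrative):

```python
# Grid-searching MLP architecture with cross-validation.
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)

param_grid = {"hidden_layer_sizes": [(16,), (32,), (32, 16)]}
search = GridSearchCV(
    MLPClassifier(max_iter=500, random_state=0),
    param_grid, cv=3, n_jobs=-1)
search.fit(X, y)
print("best params:", search.best_params_)
print("best CV accuracy: {:.3f}".format(search.best_score_))
```

The same grid could be extended with learning rate, regularization strength, and activation function, which is where most of the tuning effort goes in practice.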

## Bank Loan Estimation with SVM and Logistic Regression

We use the bank marketing dataset from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/Bank+Marketing). There is no single best C or gamma value for SVM, since the data and the problem we are trying to solve differ each time. From the observation above, a higher gamma value results in slightly better accuracy, but the cost would not… Continue reading Bank Loan Estimation with SVM and Logistic Regression
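Because there is no universally best C or gamma, the usual approach is a cross-validated grid search; a minimal sketch with an RBF-kernel SVM, using synthetic data as a stand-in for the bank marketing dataset:

```python
# Grid search over C and gamma for an RBF-kernel SVM.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X = StandardScaler().fit_transform(X)   # SVMs are sensitive to feature scale

param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print("best:", search.best_params_,
      "CV accuracy: {:.3f}".format(search.best_score_))
```

Larger gamma makes the kernel more local (each point's influence shrinks), which can raise training accuracy at the cost of overfitting, mirroring the trade-off described above.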

## Some Classifiers Comparison for Handwriting Digits Example

Data and background: https://charleshsliao.wordpress.com/2017/02/24/svm-tuning-based-on-mnist/ In this project, we used different classifiers to examine the dataset. The algorithms we explored in our experiments are: the K-Nearest Neighbors algorithm (KNN), the Support Vector Machine algorithm (SVM), the Fast Nearest Neighbor algorithm (FNN), the Naive Bayes algorithm (NBs), and Logistic Regression (Rpart). We compared the results of each algorithm and discussed their advantages and disadvantages… Continue reading Some Classifiers Comparison for Handwriting Digits Example
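A compact Python analogue of such a comparison (the post itself works in R): several classifiers cross-validated on scikit-learn's built-in digits data as a stand-in for MNIST.

```python
# Comparing several classifiers with 5-fold cross-validation.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

models = {
    "KNN": KNeighborsClassifier(),
    "SVM": SVC(),
    "NB": GaussianNB(),
    "LogReg": LogisticRegression(max_iter=5000),
}
results = {name: cross_val_score(m, X, y, cv=5).mean()
           for name, m in models.items()}
for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print("{:>6}: {:.3f}".format(name, acc))
```

The ranking on one dataset says little in general; each model's advantages (speed, interpretability, calibration) matter as much as raw accuracy.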