"Affinity analysis can be applied to many processes that do not use transactions in this sense: Fraud detection Customer segmentation Software optimization Product recommendations. The classic algorithm for affinity analysis is called the Apriori algorithm. " More details can be found in Robert Layton's book here: https://www.goodreads.com/book/show/26019855-learning-data-mining-with-python?from_search=true We explored similar method of "Market Basket" here:… Continue reading Movie Recommender -Affinity Analysis of Apriori in Python

# Tag: Application

## NBA Winning Estimator with Decision Tree in Python

It would be interesting to build a predictor to understand the trend of NBA winning teams. We will use data from http://www.basketball-reference.com/leagues/NBA_2017_games-june.html and follow the workflow. More details can be found in Robert Layton's book here: https://www.goodreads.com/book/show/26019855-learning-data-mining-with-python?from_search=true

## Quick Machine Learning Workflow in Python, with KNN as Example of Ionosphere Data

There are multiple ways to build machine learning models in Python; this article serves as a simple summary of the essential steps, from data loading to final visualization. You can find the data here: http://archive.ics.uci.edu/ml/datasets/Ionosphere More details can be found in Robert Layton's book here: https://www.goodreads.com/book/show/26019855-learning-data-mining-with-python?from_search=true
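As a rough illustration of such a workflow, the sketch below walks through load, split, predict, and evaluate with a hand-written k-nearest-neighbors classifier. It is an assumption-laden toy, not the post's code: the points are made up (the post uses the Ionosphere dataset), `knn_predict` is a hypothetical helper, and the visualization step is left out.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# 1. Load data. These points are invented; "g"/"b" mimic good/bad labels.
X = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (3.0, 3.2), (3.1, 2.9), (2.9, 3.0),
     (1.1, 1.0), (3.0, 3.1)]
y = ["b", "b", "b", "g", "g", "g", "b", "g"]

# 2. Split into training and test sets (the last two rows are held out).
train_X, test_X = X[:6], X[6:]
train_y, test_y = y[:6], y[6:]

# 3. Predict the held-out rows.
predictions = [knn_predict(train_X, train_y, q, k=3) for q in test_X]

# 4. Evaluate with plain accuracy.
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(predictions, accuracy)
```

On real data a library implementation with cross-validation would usually be preferable; the hand-written version is only meant to show each step in the open.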

## Comparative Visualization of IBM&Google

I want to create an infographic using the data provided (related to Google and IBM). The goal is a comparative visualization that lets the reader explore this dataset. When developing an infographic, let the data flesh out the concept rather than working up a concept and trying to force the chart into an idea…

## Features Selection in Python

We talked about feature selection based on Lasso (https://charleshsliao.wordpress.com/2017/04/11/regularization-in-neural-network-with-mnist-and-deepnet-of-r/) and autoencoders. More features make the model more complex, so it can be a good idea to reduce the number of features to only the most useful ones and discard the rest. There are three basic strategies: univariate statistics, model-based selection, and iterative selection. We use the…
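The first strategy, univariate statistics, can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the score used here is the absolute Pearson correlation with the target (one of several possible univariate scores), and `pearson`, `select_k_best`, and the toy columns are hypothetical names and data, not taken from the post.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def select_k_best(columns, target, k):
    """Score each feature on its own against the target and keep the top k names."""
    scores = {name: abs(pearson(vals, target)) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Made-up features: one tracks the target exactly, one roughly, one barely.
columns = {
    "signal": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "noisy_signal": [1.0, 3.0, 2.0, 5.0, 4.0, 6.0],
    "noise": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
}
target = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]

selected = select_k_best(columns, target, k=2)
print(selected)
```

Because each feature is scored in isolation, this approach is cheap but cannot see interactions between features, which is where model-based and iterative selection come in.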

## Preprocess: PCA Application in Python

We use data from the sklearn library, and the editor is Sublime Text 3. Most of the code comes from the book: https://www.goodreads.com/book/show/32439431-introduction-to-machine-learning-with-python?from_search=true

## Recommenders in R, Comparing Multiple Algorithms

We know several essential recommender methods. If we want to recommend ourselves a book, we can do it:

1. Based on our own experience
2. Based on our friends' experience
3. Based on the catalog of the library
4. Based on the search engine's results

We already talked a little about the first method…

## Credit Analysis with ROC evaluation in Neural Network and Random Forest

This is quite similar to the article that uses C5.0 for classification: https://charleshsliao.wordpress.com/2017/03/04/a-quick-classification-example-with-c5-0-in-r/ Here we try more mature and powerful algorithms, with cross-validation and parameter tuning. 1. First, we preprocess the data. 2. We can start with a basic logistic regression model. The ROC chart below shows the Area Under the Curve (AUC) value as a metric…
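For readers unfamiliar with the AUC metric, the sketch below computes the area under the ROC curve directly from labels and predicted scores. It relies on the standard equivalence between AUC and the probability that a randomly chosen positive is scored above a randomly chosen negative. The post itself works in R; this Python version, the function name `roc_auc`, and the toy scores are assumptions for illustration only.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    A positive scored above a negative counts 1, a tie counts 0.5.
    Assumes labels are 0/1 and that both classes are present.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = bad credit, 0 = good credit, scores from some model.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is why it is convenient for comparing models such as logistic regression, neural networks, and random forests on the same data.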

## Digital Marketing Application Method of Machine Learning and Data Mining, with RFM Model

Whether it is a classifier or a regression model, we apply data mining and machine learning methods to achieve a target. To put it more plainly, we need to solve a problem. Especially in Digital Marketing (or “traditional marketing with data analytics approach”) when we focus on models of AARRR, PRAPA or ARM, in…

## A Quick Association Rules Example within R

Association rules are used to determine which items lead to the purchase of other items. The practice is commonly known as market basket analysis because it has been applied so frequently to supermarket data. The dataset used here was adapted from the Groceries dataset in the arules R package.
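The three numbers usually reported for an association rule (support, confidence, and lift) are simple ratios of transaction counts. The sketch below computes them for one rule in plain Python. It is a hedged illustration, not the post's R code: `rule_metrics` is a hypothetical helper and the baskets are made up rather than taken from the Groceries dataset.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Assumes the antecedent and the consequent each occur at least once.
    """
    n = len(transactions)
    a, c = set(antecedent), set(consequent)
    count_a = sum(1 for t in transactions if a <= t)
    count_c = sum(1 for t in transactions if c <= t)
    count_both = sum(1 for t in transactions if (a | c) <= t)

    support = count_both / n              # share of baskets containing both sides
    confidence = count_both / count_a     # share of antecedent baskets with the consequent
    lift = confidence / (count_c / n)     # confidence relative to the consequent's base rate
    return support, confidence, lift


# Made-up baskets for illustration.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
    {"bread", "butter", "jam"},
]

support, confidence, lift = rule_metrics(baskets, {"bread"}, {"butter"})
print(support, confidence, lift)
```

A lift above 1 suggests that the antecedent makes the consequent more likely than its base rate; a lift near 1 suggests the two are roughly independent.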