Movie Recommender - Affinity Analysis of Apriori in Python

"Affinity analysis can be applied to many processes that do not use transactions in this sense: Fraud detection Customer segmentation Software optimization Product recommendations. The classic algorithm for affinity analysis is called the Apriori algorithm. " More details can be found in Robert Layton's book here: https://www.goodreads.com/book/show/26019855-learning-data-mining-with-python?from_search=true We explored similar method of "Market Basket" here:… Continue reading Movie Recommender -Affinity Analysis of Apriori in Python

NBA Winning Estimator with Decision Tree in Python

It would be interesting to build a predictive model to understand the trends among winning NBA teams. We will use data from http://www.basketball-reference.com/leagues/NBA_2017_games-june.html and follow the workflow. More details can be found in Robert Layton's book here: https://www.goodreads.com/book/show/26019855-learning-data-mining-with-python?from_search=true

Quick Machine Learning Workflow in Python, with KNN as Example of Ionosphere Data

There are multiple ways to build machine learning models in Python; this article serves as a simple summary of the essential steps of machine learning, from data loading to final visualization. You can find the data here: http://archive.ics.uci.edu/ml/datasets/Ionosphere More details can be found in Robert Layton's book here: https://www.goodreads.com/book/show/26019855-learning-data-mining-with-python?from_search=true
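To make the KNN step of such a workflow concrete, here is a minimal sketch in plain Python. It does not load the Ionosphere file and is not the post's code: the toy points, the labels, and the `knn_predict` and `accuracy` helpers are assumptions made for illustration only.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, point, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    # Sort training samples by Euclidean distance to the query point.
    neighbours = sorted(
        (math.dist(x, point), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of held-out points whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y)
    )
    return hits / len(test_X)


# Toy stand-in for a loaded dataset: two features, two classes.
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["b", "b", "b", "g", "g", "g"]
test_X = [(0.05, 0.1), (1.0, 0.95)]
test_y = ["b", "g"]

score = accuracy(train_X, train_y, test_X, test_y, k=3)
print(score)
```

With real data, the same three stages apply: load and split the samples, fit or store the training set, then score predictions on the held-out part.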

Comparative Visualization of IBM&Google

I want to create an infographic using the data provided (related to Google and IBM): a comparative visualization that lets the reader explore this dataset. When developing an infographic, let the data flush out the concept rather than working up a concept and trying to force the chart into an idea…

Features Selection in Python

We talked about feature selection based on Lasso (https://charleshsliao.wordpress.com/2017/04/11/regularization-in-neural-network-with-mnist-and-deepnet-of-r/) and autoencoders. More features make the model more complex, so it can be a good idea to reduce the number of features to only the most useful ones and discard the rest. There are three basic strategies: univariate statistics, model-based selection, and iterative selection. We use the…
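The first of the three strategies, univariate statistics, can be sketched in a few lines of plain Python. This is only an illustration under assumptions: it scores each feature by its absolute Pearson correlation with the target, which is one possible univariate statistic and not necessarily the one used in the post. The `pearson` and `select_k_best` helpers and the toy data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def select_k_best(X, y, k):
    """Score each feature column on its own and return the indices of the k best."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores


# Toy data: feature 0 tracks the target, feature 1 is constant, feature 2 is noisy.
X = [
    [1.0, 5.0, 1.0],
    [2.0, 5.0, 3.0],
    [3.0, 5.0, 2.0],
    [4.0, 5.0, 4.0],
]
y = [2.0, 4.0, 6.0, 8.0]

selected, scores = select_k_best(X, y, k=2)
print(selected, scores)
```

Because each feature is scored in isolation, this approach is fast but ignores interactions between features, which is where the model-based and iterative strategies differ.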

Recommenders in R, Comparing Multiple Algorithms

We know several essential recommender methods. If we want to recommend ourselves a book, we can do it 1. Based on our own experience 2. Based on our friends' experience 3. Based on the catalog of the library 4. Based on the search engine's results. We already talked a little about the first method…

Credit Analysis with ROC evaluation in Neural Network and Random Forest

This is quite similar to the article that used C5.0 for classification: https://charleshsliao.wordpress.com/2017/03/04/a-quick-classification-example-with-c5-0-in-r/ This time we use more mature and powerful algorithms with cross-validation and parameter tuning. 1. First, we preprocess the data. 2. We can start with the basic logistic regression model. The ROC chart below shows the Area Under the Curve (AUC) value as a metric…
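For readers unfamiliar with the AUC metric, here is a minimal sketch of how it can be computed from predicted scores. It is written in Python for illustration and is not the post's code; the `roc_auc` helper and the labels and scores are made up. It relies on the fact that AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed by comparing every positive/negative pair."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical credit labels (1 = default) and model scores.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.7, 0.3, 0.6]

auc = roc_auc(labels, scores)
print(auc)
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is why AUC is convenient for comparing classifiers such as logistic regression, neural networks, and random forests on the same data.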

Digital Marketing Application Method of Machine Learning and Data Mining, with RFM Model

Whether it is a classifier or a regression model, we apply data mining and machine learning methods to achieve a target. To put it more plainly, we need to solve a problem. Especially in digital marketing (or “traditional marketing with a data analytics approach”), when we focus on models such as AARRR, PRAPA or ARM, in…