Quick Machine Learning Workflow in Python, with KNN as Example of Ionosphere Data

There are many ways to build machine learning models in Python. This article serves as a simple summary of the essential steps in a machine learning workflow, from data loading to final visualization. You can find the data here: http://archive.ics.uci.edu/ml/datasets/Ionosphere More details can be found in Robert Layton's book here: https://www.goodreads.com/book/show/26019855-learning-data-mining-with-python?from_search=true

RNN in TensorFlow in Python&R, with MNIST

Though it is more convenient to use the TensorFlow framework in Python, we also talked about how to apply TensorFlow in R here: https://charleshsliao.wordpress.com/tag/tensorflow/ We will talk about how to apply a recurrent neural network (RNN) in TensorFlow in both Python and R. An RNN might not be the best algorithm for MNIST, but this can be…

Comparative Visualization of IBM&Google

I want to create an infographic using the data provided (related to Google and IBM). The goal is a comparative visualization that lets the reader explore this dataset. When developing an infographic, let the data flesh out the concept rather than working up a concept and trying to force the chart into an idea…

Quick Cross Validation and Grid Search of Parameters in Python

Cross-validation is a way to guard against overfitting when training a model, and we have also applied the grid search method in both Python and R: https://charleshsliao.wordpress.com/2017/05/20/logistic-regression-in-python-to-tune-parameter-c/ https://charleshsliao.wordpress.com/2017/04/24/cnndnn-of-keras-in-r-backend-tensorflow-for-mnist/ We will focus on how to use both methods to identify the best parameters, model, and score without overfitting. We use the data from the sklearn library, and the IDE…
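
As a rough illustration of how cross-validation and grid search fit together, here is a minimal sketch in plain Python. It is not the post's code: the toy two-blob data, the from-scratch KNN classifier, and the helper names (`knn_predict`, `k_fold_indices`, `cross_val_accuracy`, `grid_search`) are all assumptions made for the example, and the standard library is used instead of sklearn so the block runs on its own.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Predict the majority label among the k nearest training points."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def k_fold_indices(n, n_folds, seed=0):
    """Shuffle the indices 0..n-1 and split them into n_folds disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def cross_val_accuracy(X, y, k, n_folds=5):
    """Mean held-out accuracy of KNN with k neighbors over n_folds folds."""
    scores = []
    for held_out in k_fold_indices(len(X), n_folds):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in held_out
        )
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


def grid_search(X, y, candidate_ks, n_folds=5):
    """Score every candidate k by cross-validation and return the best one."""
    results = {k: cross_val_accuracy(X, y, k, n_folds) for k in candidate_ks}
    best_k = max(results, key=results.get)
    return best_k, results


# Hypothetical toy data: two well-separated Gaussian blobs in 2D.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(40)]
X += [[rng.gauss(4, 1), rng.gauss(4, 1)] for _ in range(40)]
y = [0] * 40 + [1] * 40

best_k, results = grid_search(X, y, candidate_ks=[1, 3, 5, 7])
print("best k:", best_k)
print("mean CV accuracy per k:", results)
```

Each candidate value is scored only on folds it was not trained on, which is why the selected parameter is less likely to reflect overfitting than one chosen by training accuracy.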

Features Selection in Python

We talked about feature selection based on Lasso (https://charleshsliao.wordpress.com/2017/04/11/regularization-in-neural-network-with-mnist-and-deepnet-of-r/) and autoencoders. More features make the model more complex, so it can be a good idea to reduce the number of features to only the most useful ones and discard the rest. There are three basic strategies: univariate statistics, model-based selection, and iterative selection. We use the…

Clustering Application in Face Recognition in Python

We used face datasets for a PCA application here: https://charleshsliao.wordpress.com/2017/05/28/preprocess-pca-application-in-python/ It will also be interesting to see how clustering algorithms assign images to different clusters and to visualize them. We use the data from the sklearn library (the face datasets need to be downloaded separately), and the IDE is Sublime Text 3. Most of the code comes from the book: https://www.goodreads.com/book/show/32439431-introduction-to-machine-learning-with-python?from_search=true

Clustering Algorithms Evaluation in Python

Sometimes we run clustering to match the clusters against the true labels of the dataset. This is one way to evaluate clustering results. We can also use other methods to complete the task, with or without ground truth for the data. We use the data from the sklearn library, and the IDE is Sublime Text 3.…
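
To make the "with ground truth" case concrete, here is a small sketch of one common metric, the adjusted Rand index (ARI). The excerpt does not say which metrics the post uses, so the choice of ARI is an assumption, and the function `adjusted_rand_index` is written from scratch for illustration rather than taken from the post or from sklearn.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same points.

    1.0 means the two partitions agree exactly (cluster names may differ);
    values near 0.0 are what random assignment would give on average.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same points")

    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two points: nothing to disagree about

    # Contingency counts: how many points share each (true, predicted) pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / total_pairs
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings use a single cluster
    return (sum_pairs - expected) / (max_index - expected)


# Same partition with swapped cluster names: perfect agreement.
same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Every predicted cluster mixes both true classes: worse than chance.
mixed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(same, mixed)
```

Because the score depends only on which points are grouped together, it does not matter that a clustering algorithm numbers its clusters differently from the true labels.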

DBSCAN in Python

Another very useful clustering algorithm is DBSCAN (which stands for “density-based spatial clustering of applications with noise”). The main benefits of DBSCAN are that a) it does not require the user to set the number of clusters a priori, b) it can capture clusters of complex shapes, and c) it can identify points that…
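
The idea can be sketched in a few lines of plain Python. This is a simplified, brute-force version written for illustration, not the post's code and not sklearn's implementation; the function name `dbscan`, the parameter names `eps` and `min_samples`, and the toy points are assumptions chosen for the example.

```python
from math import dist

NOISE = -1  # label used for points that belong to no cluster


def dbscan(points, eps, min_samples):
    """Return one cluster label per point; NOISE (-1) marks outliers.

    A point is a core point if at least min_samples points (itself included)
    lie within distance eps of it. Clusters grow outward from core points.
    """
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def region_query(i):
        # Brute-force neighbor search: O(n) per query.
        return [j for j in range(n) if dist(points[i], points[j]) <= eps]

    cluster_id = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = region_query(i)
        if len(neighbors) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue

        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins but does not expand
                continue
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster_id
            j_neighbors = region_query(j)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Hypothetical toy data: two tight groups and one isolated point.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (50, 50),
]
labels = dbscan(points, eps=1.5, min_samples=3)
print(labels)
```

Note that the number of clusters is never passed in: it falls out of `eps` and `min_samples`, and the isolated point is labeled as noise instead of being forced into a cluster.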