Quick KNN Examples in Python

I walked through two basic KNN models in Python to become more familiar with modeling and machine learning in Python, using Sublime Text 3 as the editor.

The first example uses the iris dataset that ships with scikit-learn.


###1. import data
from sklearn.datasets import load_iris
iris=load_iris()
print(iris.keys())
print('\n''x:',iris['feature_names'])
print('\n''y:',iris['target_names'])
print('\n''type of data:',type(iris['data']))

import pandas as pd 
def rstr(df):
	return df.shape, df.apply(lambda x:[x.unique()])
print('\n''structure of data:''\n',
	rstr(pd.DataFrame(iris['data'])))

###2. split data
#####scikit-learn's train_test_split function shuffles the dataset and splits it for you
#####(by default 75% of the rows go to training and 25% to testing).
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test=train_test_split(iris['data'],iris['target'],random_state=0)
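To make the shuffle-then-split step concrete, here is a minimal pure-Python sketch of the idea. It is not scikit-learn's implementation: `shuffle_split`, its `test_fraction` and `seed` parameters are hypothetical names made up for illustration. Rounding the test size up gives the same 112/38 row counts that the default split produces for the 150 iris rows.

```python
import math
import random


def shuffle_split(items, test_fraction=0.25, seed=0):
    """Hypothetical helper: shuffle a copy of items, then cut it in two."""
    shuffled = list(items)
    # A seeded generator keeps the shuffle reproducible, like random_state.
    random.Random(seed).shuffle(shuffled)
    # Round the test size up: 150 rows -> 38 test rows, 112 training rows.
    n_test = math.ceil(len(shuffled) * test_fraction)
    n_train = len(shuffled) - n_test
    return shuffled[:n_train], shuffled[n_train:]


# Row indices 0..149 stand in for the 150 iris samples.
train_rows, test_rows = shuffle_split(range(150))
print(len(train_rows), len(test_rows))
```

Every row ends up in exactly one of the two parts, so the test rows are never seen during training.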

###3.look at the data
import matplotlib.pyplot as plt 
import numpy as np 

fig,ax=plt.subplots(3,3,figsize=(15,15))
plt.suptitle('Iris Pair Plot')
for i in range(3):
	for j in range(3):
		ax[i,j].scatter(X_train[:,j],X_train[:,i+1],c=y_train,s=6)
		ax[i, j].set_xticks(())
		ax[i, j].set_yticks(())
		if i==2:
			ax[i,j].set_xlabel(iris['feature_names'][j])
		if j==0:
			ax[i,j].set_ylabel(iris['feature_names'][i+1])
		###hide the upper triangle so each feature pair is drawn only once
		if j>i:
			ax[i,j].set_visible(False)
plt.show()

[Figure: iris pair plot of the training data, colored by class]

###4. build a KNN model to classify
###All machine learning models in scikit-learn are implemented in their own classes,
###which are called Estimator classes. The k-nearest neighbors classification algorithm
###is implemented in the KNeighborsClassifier class in the neighbors module.
###Before we can use the model, we need to instantiate the class into an object.
###This is when we set any parameters of the model. The most important parameter of
###KNeighborsClassifier is the number of neighbors, which we set to 1.
from sklearn.neighbors import KNeighborsClassifier
knnmodel=KNeighborsClassifier(n_neighbors=1)
knnmodel.fit(X_train,y_train)
###fit returns the fitted estimator; its printed form was:
###KNeighborsClassifier(algorithm='auto',leaf_size=30,metric='minkowski',
###	metric_params=None,n_jobs=1,n_neighbors=1,p=2,weights='uniform')
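For intuition about what the classifier does at prediction time, here is a minimal from-scratch sketch of k-nearest neighbors in plain Python. It is an illustration only, not how scikit-learn implements it: `knn_predict` and the toy points are made up for this example, the distance is plain Euclidean, and ties are resolved in favor of the closest neighbor, which may differ from the library.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=1):
    """Hypothetical helper: majority vote among the k closest training points."""
    # Pair each training label with its Euclidean distance to the query.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Closest points first.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # Most common label wins; on a tie the label seen first (closest) wins.
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data, invented for illustration: two small clusters in 2D.
points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ['a', 'a', 'b', 'b', 'b']

print(knn_predict(points, labels, (1.1, 1.0), k=1))
print(knn_predict(points, labels, (4.0, 4.0), k=3))
```

With k=1 the prediction is simply the label of the single closest training point, which is the setting used in this example.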

###5. predict
y_pred=knnmodel.predict(X_test)
print('\n''accuracy:',np.mean(y_pred==y_test))
###use the score method of the knn object, which will compute the test set accuracy for us
print('\n''accuracy:',knnmodel.score(X_test,y_test))
###accuracy: 0.973684210526
###accuracy: 0.973684210526
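Both accuracy lines compute the same thing: the fraction of test samples whose predicted label equals the true label. The default split leaves 38 iris samples in the test set, and 37/38 is about 0.9737, so the printed value corresponds to a single misclassified sample. A small sketch of that arithmetic, using made-up labels rather than the real iris test labels (`accuracy` is a hypothetical helper name):

```python
def accuracy(y_pred, y_true):
    """Fraction of positions where prediction and true label agree."""
    matches = sum(1 for p, t in zip(y_pred, y_true) if p == t)
    return matches / len(y_true)


# Made-up labels for 38 test samples (NOT the real iris test labels).
y_true = [i % 3 for i in range(38)]
y_pred = list(y_true)
y_pred[5] = (y_true[5] + 1) % 3  # introduce exactly one mistake

acc = accuracy(y_pred, y_true)
print(acc)  # 37 correct out of 38
```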

The second example uses the breast cancer dataset from scikit-learn. We compare training and test accuracy for different values of k.

###1. import data
from sklearn.datasets import load_breast_cancer
cancer=load_breast_cancer()

import pandas as pd 
def rstr(df):
	return df.shape, df.apply(lambda x:[x.unique()])
print('\n''structure of data:''\n',
	rstr(pd.DataFrame(cancer['data'])))

###2.split data
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test=train_test_split(
	cancer.data,cancer.target,random_state=66)

###3. set up lists to record accuracy for each k (1 to 9)
training_accuracy=[]
test_accuracy=[]
neighbors_settings=range(1,10)

###4. build the KNN model with different k and evaluate 
from sklearn.neighbors import KNeighborsClassifier
for n in neighbors_settings:
	clf=KNeighborsClassifier(n_neighbors=n)
	clf.fit(X_train,y_train)
	training_accuracy.append(clf.score(X_train,y_train))
	test_accuracy.append(clf.score(X_test,y_test))

import matplotlib.pyplot as plt 
plt.plot(neighbors_settings,training_accuracy,label='training accuracy')
plt.plot(neighbors_settings,test_accuracy,label='test accuracy')
plt.xlabel('n_neighbors')
plt.ylabel('accuracy')
plt.legend()
plt.show()

[Figure: training and test accuracy versus number of neighbors]
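Besides reading the plot, the k with the highest test accuracy can be picked programmatically from the two lists. Below is a small sketch of that lookup. `best_k` is a hypothetical helper, and the scores are placeholder numbers invented for illustration, not results from the run above.

```python
def best_k(ks, scores):
    """Hypothetical helper: return (k, score) for the highest score.

    On ties, the smallest k wins because max() keeps the first maximum.
    """
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return ks[best_index], scores[best_index]


ks = list(range(1, 10))
# Placeholder accuracies, made up for this sketch (one value per k).
placeholder_test_accuracy = [0.90, 0.89, 0.92, 0.92, 0.93, 0.94, 0.93, 0.93, 0.92]

k, score = best_k(ks, placeholder_test_accuracy)
print(k, score)
```

In the script above, the same call would be `best_k(list(neighbors_settings), test_accuracy)`.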
