#MachineLearning #UnsupervisedLearning #Clustering

By Billy Gustave

Zoo Animal Class Clustering

Goal :

  • Predict the animal class using agglomerative clustering
  • Calculate the root-mean-squared error

Data Exploration

In [1]:
import numpy as np, pandas as pd, matplotlib.pyplot as plt, seaborn as sns
In [2]:
df = pd.read_csv('zoo.csv')
df.head()
Out[2]:
animal_name hair feathers eggs milk airborne aquatic predator toothed backbone breathes venomous fins legs tail domestic catsize class_type
0 aardvark 1 0 0 1 0 0 1 1 1 1 0 0 4 0 0 1 1
1 antelope 1 0 0 1 0 0 0 1 1 1 0 0 4 1 0 1 1
2 bass 0 0 1 0 0 1 1 1 1 0 0 1 0 1 0 0 4
3 bear 1 0 0 1 0 0 1 1 1 1 0 0 4 0 0 1 1
4 boar 1 0 0 1 0 0 1 1 1 1 0 0 4 1 0 1 1
In [3]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 101 entries, 0 to 100
Data columns (total 18 columns):
animal_name    101 non-null object
hair           101 non-null int64
feathers       101 non-null int64
eggs           101 non-null int64
milk           101 non-null int64
airborne       101 non-null int64
aquatic        101 non-null int64
predator       101 non-null int64
toothed        101 non-null int64
backbone       101 non-null int64
breathes       101 non-null int64
venomous       101 non-null int64
fins           101 non-null int64
legs           101 non-null int64
tail           101 non-null int64
domestic       101 non-null int64
catsize        101 non-null int64
class_type     101 non-null int64
dtypes: int64(17), object(1)
memory usage: 14.3+ KB

No missing values
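That conclusion can be made explicit with a quick per-column check. A minimal sketch, using a toy frame in place of zoo.csv (the toy data here is an assumption, not from the notebook):

```python
import pandas as pd

# Toy frame standing in for zoo.csv (assumption: the same check applies to the real data)
toy = pd.DataFrame({'hair': [1, 0, 1], 'legs': [4, 0, 2]})

# Missing values per column; a grand total of 0 confirms the frame is complete
missing_per_column = toy.isnull().sum()
print(missing_per_column)
print('total missing:', int(missing_per_column.sum()))
```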

Checking Distribution

In [4]:
df.groupby('class_type').size()
Out[4]:
class_type
1    41
2    20
3     5
4    13
5     4
6     8
7    10
dtype: int64
In [5]:
sns.countplot(df.class_type)
Out[5]:
<matplotlib.axes._subplots.AxesSubplot at 0x22e247d45c8>
In [6]:
# getting X and y
X = df.loc[:, 'hair':'catsize']
y = df.class_type - 1
In [7]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

Modeling and Prediction

With fine-tuning

In [8]:
# using n_clusters = 7 (one cluster per class)
from sklearn.cluster import AgglomerativeClustering
k = 7
agglo = AgglomerativeClustering(n_clusters=k, affinity='euclidean', linkage='average')
# note: fitting on the raw 0/1 features; the X_scaled computed above could be used here instead
y_pred = agglo.fit_predict(X)
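The choice k = 7 matches the number of classes, but it can also be sanity-checked on the linkage tree itself. A hedged sketch using SciPy's hierarchy module, on synthetic data standing in for the scaled zoo features (`X_demo` is an assumption, not from the notebook):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Synthetic continuous matrix standing in for X_scaled (30 animals x 16 features)
X_demo = rng.normal(size=(30, 16))

# Average-linkage tree with Euclidean distances, mirroring the estimator above;
# scipy.cluster.hierarchy.dendrogram(Z) would plot it for visual inspection
Z = linkage(X_demo, method='average', metric='euclidean')

# Cut the tree into 7 flat clusters, the equivalent of n_clusters=7
labels = fcluster(Z, t=7, criterion='maxclust')
print('clusters found:', len(np.unique(labels)))
```

Inspecting where the dendrogram's merge distances jump is a standard way to judge whether 7 is a natural cut for this data.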
In [9]:
from sklearn.metrics import mean_squared_error
# root-mean-squared error between the true class labels and the cluster labels
np.sqrt(mean_squared_error(y, y_pred))
Out[9]:
2.0990332522519517
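One caveat worth noting: AgglomerativeClustering numbers its clusters arbitrarily, so an error computed against the raw label ids depends on that numbering. A common remedy (a sketch, not part of the original notebook) is to relabel each cluster with the majority true class it contains before scoring:

```python
import numpy as np

def align_labels(y_true, y_pred):
    """Relabel each predicted cluster with the majority true class inside it."""
    aligned = np.empty_like(y_pred)
    for c in np.unique(y_pred):
        mask = y_pred == c
        # bincount/argmax picks the most frequent true class within cluster c
        aligned[mask] = np.bincount(y_true[mask]).argmax()
    return aligned

# Tiny illustration (assumed data): cluster ids 0/1 happen to correspond to classes 1/0
y_true_demo = np.array([1, 1, 1, 0, 0, 0])
y_pred_demo = np.array([0, 0, 0, 1, 1, 1])
print(align_labels(y_true_demo, y_pred_demo))  # -> [1 1 1 0 0 0]
```

Computing the RMSE on `align_labels(y, y_pred)` instead of `y_pred` would give a score that no longer depends on the arbitrary cluster numbering.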

Contact Me

www.linkedin.com/in/billygustave

billygustave.com