Concept overview:
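Normalization uses each feature's minimum and maximum to rescale its values into the interval [new_min, new_max]. Every feature column is scaled independently with the min-max function, computed as follows:
x' = (x - min) / (max - min) * (new_max - new_min) + new_min
where min and max are the smallest and largest values in that column. With new_min = 0 and new_max = 1 this reduces to x' = (x - min) / (max - min).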
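To make the general [new_min, new_max] case concrete, here is a minimal NumPy sketch of column-wise min-max scaling. The function name `min_max_scale`, the toy array, and the handling of constant columns are assumptions made for illustration; they are not part of the original example.

```python
import numpy as np


def min_max_scale(X, new_min=0.0, new_max=1.0):
    """Rescale each column of X into [new_min, new_max] (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    span = col_max - col_min
    # Assumption: a constant column (max == min) is mapped to new_min
    # instead of dividing by zero.
    span[span == 0] = 1.0
    return (X - col_min) / span * (new_max - new_min) + new_min


# Toy data invented for this sketch: 3 samples, 2 features.
X = np.array([[0.0, 10.0],
              [5.0, 20.0],
              [10.0, 40.0]])
scaled = min_max_scale(X, new_min=-1.0, new_max=1.0)
print(scaled)
```

Each column is scaled with its own minimum and maximum, so features measured on very different scales end up in the same range.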
Code example:
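import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Rewrite of the Chapter 2 example from Machine Learning in Action


def file2matrix(filename):
    # Load the tab-delimited file: the first three columns are features,
    # the fourth column is the class label.
    data = np.genfromtxt(filename, delimiter="\t")
    returnMat = data[:, 0:3]
    classLabelVector = data[:, 3:4]
    return returnMat, classLabelVector


def autoNorm(dataset):
    # Keep the first feature as a 2-D column of shape (n_samples, 1),
    # which is the input shape MinMaxScaler expects.
    x = dataset[:, 0:1]
    # Method 1: scikit-learn's MinMaxScaler (default range is [0, 1])
    minMax = MinMaxScaler()
    x_std = minMax.fit_transform(x)
    print(x.min())
    print(x.max())
    print(x[2])
    # Manual check for the third sample, using the min, max and x[2]
    # values printed above for this dataset.
    print((26052 - 0) / 91273)
    print(x_std[2])
    # Method 2: the same formula written as a lambda
    a = lambda x: (x - x.min()) / (x.max() - x.min())
    print(a(x)[2])


if __name__ == '__main__':
    returnMat, classLabelVector = file2matrix('F:\\datingTestSet2.txt')
    autoNorm(returnMat)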
Execution result:
Sample of the dataset: