Suppose we randomly generate two-dimensional sample data, then use kmeans to compute the clusters and mark the centroids:
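r1 = randn(5,2) - 2      % 5 points scattered around (-2,-2); no semicolon, so the values are printed
r2 = randn(5,2) + 2      % 5 points scattered around (2,2)
X = [r1; r2];            % stack both groups into one 10-by-2 data matrix
opts = statset('Display','final');
[idx,C] = kmeans(X,2,'Replicates',2,'Options',opts);
C                        % centroids, one row per cluster
cluster1 = X(idx==1,:)   % points assigned to cluster 1
cluster2 = X(idx==2,:)   % points assigned to cluster 2
figure;
plot(X(idx==1,1),X(idx==1,2),'g^','MarkerSize',5);
hold on;                 % one call is enough to keep later plots on the same axes
plot(X(idx==2,1),X(idx==2,2),'b.','MarkerSize',15);
plot(C(:,1),C(:,2),'r.','MarkerSize',30,'LineWidth',3);
hold off;
legend('Cluster C1','Cluster C2','Centroids','Location','NW');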
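Since randn draws fresh values on each call, every run of the script above yields different points and different centroids. As a small aside that is not part of the original script, the sketch below seeds the generator with rng so that the same data comes back each time. The seed value 42 is an arbitrary assumption.

```matlab
% Assumption: the seed 42 is arbitrary; any fixed non-negative integer works.
rng(42);
a = randn(5,2) - 2;    % first draw

rng(42);               % reset the generator to the same state
b = randn(5,2) - 2;    % second draw repeats the first

same = isequal(a, b);  % true when both draws match exactly
disp(same)
```

Placing a single rng call at the top of the script should also fix the random initial centroids that kmeans picks, since kmeans draws from the same global random stream by default.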
A run produces the following random test data and clustering output:
r1 =
-1.9197 -1.5771
-2.8738 -2.0885
-2.4520 -2.2704
-1.9906 0.0474
-3.3071 -0.5084
r2 =
1.0730 2.0887
2.7819 1.6868
3.1847 1.7797
2.9300 2.6951
2.9886 2.4277
Replicate 1, 2 iterations, total sum of distances = 9.16468.
Replicate 2, 2 iterations, total sum of distances = 9.16468.
Best total sum of distances = 9.16468
C =
2.5916 2.1356
-2.5086 -1.2794
cluster1 =
1.0730 2.0887
2.7819 1.6868
3.1847 1.7797
2.9300 2.6951
2.9886 2.4277
cluster2 =
-1.9197 -1.5771
-2.8738 -2.0885
-2.4520 -2.2704
-1.9906 0.0474
-3.3071 -0.5084
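The printed values can be cross-checked by hand. The sketch below copies the ten points from the output above, recomputes each centroid as the mean of its cluster, and sums the squared Euclidean distances from every point to its own centroid, which is the quantity kmeans minimizes under its default 'sqeuclidean' distance. The variable names (p1, p2, m1, m2, total) are illustrative only. Because the printed points are rounded to four decimals, the total should land close to, but not exactly on, the reported 9.16468.

```matlab
% Points copied from the printed output (rounded to 4 decimals).
p1 = [ 1.0730  2.0887;  2.7819  1.6868;  3.1847  1.7797; ...
       2.9300  2.6951;  2.9886  2.4277];   % members of cluster 1
p2 = [-1.9197 -1.5771; -2.8738 -2.0885; -2.4520 -2.2704; ...
      -1.9906  0.0474; -3.3071 -0.5084];   % members of cluster 2

% A K-means centroid is the coordinate-wise mean of its cluster.
m1 = mean(p1, 1);
m2 = mean(p2, 1);

% Squared Euclidean distance from each point to its own centroid.
d1 = sum((p1 - repmat(m1, size(p1,1), 1)).^2, 2);
d2 = sum((p2 - repmat(m2, size(p2,1), 1)).^2, 2);

total = sum(d1) + sum(d2);   % compare with "Best total sum of distances"

disp([m1; m2])
disp(total)
```

The two rows displayed first should agree with the matrix C in the output to the printed precision.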
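r1 and r2 are the randomly generated original 2-D points. After K-means clustering, each point is assigned to the nearer of the two centroids, giving two clusters, C1 and C2. The cluster numbers themselves are arbitrary: in this run the r2 points ended up in cluster 1 and the r1 points in cluster 2. The figure drawn by the script shows the result.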
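The assignment step can be reproduced without calling kmeans. The following is a minimal sketch, assuming the centroids C and the points printed in the output above. It measures the squared Euclidean distance from every point to each centroid and picks the closer one. The names D and label are hypothetical helpers introduced for illustration.

```matlab
% Centroids and points copied from the printed output.
C = [ 2.5916  2.1356;     % centroid of cluster 1
     -2.5086 -1.2794];    % centroid of cluster 2
X = [-1.9197 -1.5771; -2.8738 -2.0885; -2.4520 -2.2704; ...
     -1.9906  0.0474; -3.3071 -0.5084; ...   % rows 1-5: r1
      1.0730  2.0887;  2.7819  1.6868;  3.1847  1.7797; ...
      2.9300  2.6951;  2.9886  2.4277];      % rows 6-10: r2

% D(i,k) holds the squared distance from point i to centroid k.
D = zeros(size(X,1), size(C,1));
for k = 1:size(C,1)
    delta = X - repmat(C(k,:), size(X,1), 1);
    D(:,k) = sum(delta.^2, 2);
end

% Each point takes the label of its nearest centroid.
[~, label] = min(D, [], 2);
disp(label')
```

With these numbers the first five points (r1) get label 2 and the last five (r2) get label 1, which matches the cluster1 and cluster2 listings in the output.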