I want to get the mean, min, max, and standard deviation for each cluster computed with the k-means method. Is the code below correct?

    import pandas as pd
    from sklearn.cluster import KMeans

    # Load the data and keep only the columns used for clustering
    dataset = pd.read_csv("C:/Users/../cardio_train_py.csv", sep=';')
    clusterDB_1 = dataset[['Age','BMI','cardio']].copy()
    kmeans = KMeans(n_clusters=8).fit(clusterDB_1)

    # Cluster labels 0..7
    X=[0,1,2,3,4,5,6,7]

    print('Age mean() for each cluster')
    for x in X:
        check = clusterDB_1[kmeans.labels_ == x]
        print(check['Age'].mean())

    print('BMI mean() for each cluster')
    for x in X:
        check = clusterDB_1[kmeans.labels_ == x]
        print(check['BMI'].mean())

    print('cardio == 0 count() for each cluster')
    for x in X:
        check = clusterDB_1[kmeans.labels_ == x]
        print(len(check[check['cardio'] == 1]))

I am asking because the values I get (for example the mean Age and BMI, and the count of cardio == 0) differ from the values obtained in Statistica (the photo shows the results from Statistica).

Below are the BMI results (computed in Python):

    24.468587736260996
    24.047855933307282
    30.548865468674116
    31.98410463004993
    32.89129084635681166.57357142857146
    41.97845737483085
    24.16813400017246

This is my database => https://www.easypaste.org/file/JcyGhA8Y/cardio.train.py.csv?lang=pl

Thanks for all your help and tips :)
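For reference, the per-cluster statistics asked about above can be sketched with a single `groupby` instead of one loop per statistic. This is only a minimal sketch under assumptions: `summarize_clusters`, `toy`, and `toy_labels` are hypothetical names, the rows are made up (not the real cardio data), and `toy_labels` stands in for what `kmeans.labels_` would provide.

```python
import pandas as pd


def summarize_clusters(df, labels, columns=("Age", "BMI")):
    """Return mean, min, max and std of each column, per cluster label."""
    # Align the labels with the DataFrame rows and group by them.
    cluster = pd.Series(labels, index=df.index, name="cluster")
    return df.groupby(cluster)[list(columns)].agg(["mean", "min", "max", "std"])


# Made-up toy rows for illustration only, not the real cardio data.
toy = pd.DataFrame({
    "Age": [40, 50, 60, 45],
    "BMI": [22.0, 24.0, 31.0, 33.0],
    "cardio": [0, 1, 1, 0],
})
toy_labels = [0, 0, 1, 1]  # hypothetical stand-in for kmeans.labels_

summary = summarize_clusters(toy, toy_labels)
print(summary)

# Number of cardio == 0 and cardio == 1 rows in each cluster.
cardio_counts = pd.crosstab(pd.Series(toy_labels, name="cluster"), toy["cardio"])
print(cardio_counts)
```

Two things may be worth checking when comparing against another program. pandas `std` is the sample standard deviation (`ddof=1`) by default. Also, k-means starts from random centroids and is sensitive to feature scale, so two runs or two programs can assign rows to different clusters unless the seed, initialization, and any standardization match.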