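Cluster analysis in R: determining the optimal number of clusters

As a newcomer to R, I am not sure how to choose the best number of clusters for a k-means analysis. After plotting a subset of the data below, how many clusters would be appropriate? And how can I perform a cluster dendrogram analysis?

    n = 1000
    kk = 10
    x1 = runif(kk)
    y1 = runif(kk)
    z1 = runif(kk)
    x4 = sample(x1, length(x1))
    y4 = sample(y1, length(y1))

    # Draw one noisy (x, y) observation around a randomly chosen centre
    randObs <- function(){
      ix = sample( 1:length(x4), 1 )
      iy = sample( 1:length(y4), 1 )   # drawn but not used below
      rx = rnorm( 1, x4[ix], runif(1)/8 )
      ry = rnorm( 1, y4[ix], runif(1)/8 )
      return( c(rx, ry) )
    }

    x = c()
    y = c()
    for ( k in 1:n ){
      rPair = randObs()
      x = c( x, rPair[1] )
      y = c( y, rPair[2] )
    }
    z <- rnorm(n)
    d <- data.frame( x, y, z )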
3 Answers
青春有我
Contributed 1784 experience points · received more than 8 likes
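`identify` is worth mentioning here, since it works directly on a dendrogram:

    d_dist <- dist(as.matrix(d))   # find distance matrix
    plot(hclust(d_dist))
    clusters <- identify(hclust(d_dist))

`identify` lets you select clusters interactively on the plotted dendrogram and stores the selection in a list. Note that the list holds observation indices, not row names (as opposed to `cutree`).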
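For a non-interactive counterpart to `identify`, the tree can be cut with `cutree`. The following is only a minimal sketch: the small data frame stands in for the question's `d`, and `k = 3` is an arbitrary choice for illustration, not a recommendation.

```r
# Hypothetical stand-in for the question's data frame `d` (assumption)
set.seed(1)
d <- data.frame(x = runif(50), y = runif(50), z = rnorm(50))

# Hierarchical clustering on Euclidean distances (hclust default: complete linkage)
hc <- hclust(dist(as.matrix(d)))

# Cut the tree into k groups; k = 3 is an arbitrary illustrative value
groups <- cutree(hc, k = 3)

table(groups)   # cluster sizes
head(groups)    # one cluster label per row of d
```

Unlike the list returned by `identify`, `cutree` returns a single vector with one cluster label per observation, which is convenient for adding a column such as `d$cluster <- groups`.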
慕码人2483693
Contributed 1860 experience points · received more than 9 likes
Elbow method:

    # css.hclust() and elbow.batch() come from the GMD package
    elbow.k <- function(mydata){
      dist.obj <- dist(mydata)
      hclust.obj <- hclust(dist.obj)
      css.obj <- css.hclust(dist.obj, hclust.obj)
      elbow.obj <- elbow.batch(css.obj)
      k <- elbow.obj$k
      return(k)
    }
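Running the Elbow method in parallel:

    library(parallel)   # detectCores(), makeCluster(), parSapply()

    no_cores <- detectCores()
    cl <- makeCluster(no_cores)
    clusterEvalQ(cl, library(GMD))
    clusterExport(cl, c("data.clustering", "data.convert", "elbow.k", "clustering.kmeans"))
    start.time <- Sys.time()
    # elbow.k.handle() is not defined in this answer; unbalanced ")" removed
    elbow.k.handle(data.clustering)
    k.clusters <- parSapply(cl, 1, function(x) elbow.k(data.clustering))
    end.time <- Sys.time()
    stopCluster(cl)
    # difftime picks its units automatically, so request seconds explicitly
    elapsed <- as.numeric(difftime(end.time, start.time, units = "secs"))
    cat('Time to find k using Elbow method is', elapsed, 'seconds with k value:', k.clusters)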
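If the GMD package is not available, the same elbow idea can be sketched with base R's `kmeans` alone. This is a swapped-in variant, not the method from the answer above: it plots the total within-cluster sum of squares against k and leaves reading off the bend to the user. The toy data (`toy`, three Gaussian blobs) and the upper limit `k.max` are assumptions made purely for illustration.

```r
set.seed(42)

# Hypothetical toy data: three Gaussian blobs in 2-D (assumption)
centers <- matrix(c(0, 0,
                    5, 5,
                    0, 5), ncol = 2, byrow = TRUE)
toy <- do.call(rbind, lapply(1:3, function(i)
  cbind(rnorm(100, centers[i, 1], 0.5),
        rnorm(100, centers[i, 2], 0.5))))

k.max <- 8   # largest k to try (arbitrary illustrative value)

# Total within-cluster sum of squares for each candidate k
wss <- sapply(1:k.max, function(k)
  kmeans(toy, centers = k, nstart = 20)$tot.withinss)

# The "elbow" is where the curve stops dropping sharply
plot(1:k.max, wss, type = "b",
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")
```

On the question's data the same loop could be run with `d` in place of `toy`. The bend is a visual heuristic, so it is worth cross-checking the chosen k against the dendrogram.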