Optymalizacja liczby skupień na podstawie wybranych wskaźników jakości grupowania
The Optimisation of Cluster Number on the Basis of Selected Cluster Validity Indexes
Author(s): Anna Bryja
Subject(s): Economy
Published by: Wydawnictwo Uniwersytetu Ekonomicznego w Krakowie
Keywords: clustering; cluster analysis; cluster validity index
Summary/Abstract: Selecting the number of clusters is one of the main problems in cluster analysis. Numerous methods for choosing the best number of clusters have been published, and their effectiveness is usually evaluated by clustering data sets that contain a known number of groups. This paper presents several such methods and applies them to a large data set: five cluster validity indexes (Caliński and Harabasz, Hubert and Levine, Dunn, Davies and Bouldin, Rousseeuw) and cross-validation, in which stability was measured by the corrected Rand index. The usefulness of these techniques is then compared and evaluated.
Journal: Zeszyty Naukowe Uniwersytetu Ekonomicznego w Krakowie
- Issue Year: 892/2012
- Issue No: 16
- Page Range: 53-67
- Page Count: 15
- Language: Polish