Comprehensive Review of K-Means Clustering Algorithms

Authors

  • Eric U. Oti, Department of Statistics, Federal Polytechnic Ekowe, Bayelsa State, Nigeria
  • Michael O. Olusola, Nnamdi Azikiwe University, Awka, Anambra State, Nigeria
  • Francis C. Eze, Nnamdi Azikiwe University, Awka, Anambra State, Nigeria
  • Samuel U. Enogwe, Michael Okpara University of Agriculture, Umudike, Nigeria

DOI:

https://doi.org/10.31695/IJASRE.2021.34050

Keywords:

Centroid, Cluster Analysis, Euclidean Distance, K-Means

Abstract

This paper presents a comprehensive review of existing k-means clustering algorithms proposed at various times. The k-means algorithm aims to partition the objects or points to be analyzed into well-separated clusters. Several algorithms exist for k-means clustering of objects, among them the traditional, standard, basic, and conventional k-means algorithms; these are perhaps the most widely used versions of k-means. These algorithms use the Euclidean distance as their metric and follow the minimum distance rule, assigning each data point (object) to its closest centroid.
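The assignment rule described in the abstract can be illustrated with a minimal sketch. It assumes a generic batch iteration (assign every point to its nearest centroid, then recompute each centroid as the mean of its members) and is not any one of the specific variants reviewed in the paper. The function names `euclidean`, `assign`, `update`, and `kmeans`, the `max_iter` parameter, and the sample points are all hypothetical, chosen only for illustration.

```python
import math

def euclidean(p, q):
    # Euclidean distance between two points of equal dimension.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def assign(points, centroids):
    # Minimum distance rule: label each point with the index of its closest centroid.
    return [
        min(range(len(centroids)), key=lambda j: euclidean(p, centroids[j]))
        for p in points
    ]

def update(points, labels, centroids):
    # Recompute each centroid as the coordinate-wise mean of its assigned points.
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if members:
            new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            # Assumption: an empty cluster simply keeps its previous centroid.
            new_centroids.append(old)
    return new_centroids

def kmeans(points, centroids, max_iter=100):
    # Alternate assignment and update until the labels stop changing.
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids

# Illustrative data: two visually separated groups in the plane.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

The initial centroids here are picked by hand from the data. The result of k-means generally depends on that choice, so this sketch should not be read as a recommended initialization.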

How to Cite

Eric U. Oti, Michael O. Olusola, Francis C. Eze, & Samuel U. Enogwe. (2021). Comprehensive Review of K-Means Clustering Algorithms. International Journal of Advances in Scientific Research and Engineering (IJASRE), ISSN:2454-8006, DOI: 10.31695/IJASRE, 7(8), 64–69. https://doi.org/10.31695/IJASRE.2021.34050
