High-Dimensional Text Clustering by Dimensionality Reduction and Improved Density Peak

***

Abstract:

This study addresses the clustering of high-dimensional text data, motivated by two limitations of K-means: it handles high-dimensional data poorly, and it requires the number of clusters to be specified in advance while selecting the initial centers at random.

We propose a Stacked-Random Projection dimensionality reduction framework and an enhanced K-means algorithm, DPC-K-means, based on an improved density peaks algorithm.
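The abstract names the Stacked-Random Projection framework but does not describe its internals. As a rough illustration only, the sketch below assumes that "stacked" means applying several Gaussian random projections in sequence, each one shrinking the dimension further. The function names, the `layer_dims` parameter, and the choice of Gaussian matrices are all assumptions for illustration, not the paper's method.

```python
import math
import random


def random_projection(rows, out_dim, rng):
    """Project every row onto out_dim random Gaussian directions."""
    in_dim = len(rows[0])
    # Entries drawn from N(0, 1/out_dim), the usual scaling that keeps
    # pairwise distances approximately unchanged in expectation.
    scale = 1.0 / math.sqrt(out_dim)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(out_dim)]
              for _ in range(in_dim)]
    return [[sum(x * matrix[i][j] for i, x in enumerate(row))
             for j in range(out_dim)]
            for row in rows]


def stacked_random_projection(rows, layer_dims, seed=0):
    """Apply one random projection per entry of layer_dims, in order.

    ASSUMPTION: this sequential stacking is a guess at what "stacked"
    means; the quoted abstract does not specify the framework.
    """
    rng = random.Random(seed)
    for out_dim in layer_dims:
        rows = random_projection(rows, out_dim, rng)
    return rows


# Example: reduce 50-dimensional vectors to 20, then to 5 dimensions.
vectors = [[float((i + j) % 3) for j in range(50)] for i in range(4)]
reduced = stacked_random_projection(vectors, [20, 5])
print(len(reduced), len(reduced[0]))
```

In practice the input would be a sparse document-term matrix such as TF-IDF vectors, and a library implementation would be far faster than these nested lists.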

The improved density peaks algorithm determines the number of clusters and the initial clustering centers of K-means. 
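To make the idea concrete, here is a minimal sketch of how a density peaks step can supply both the number of clusters and the starting centers for K-means. It uses the standard density peaks quantities (local density `rho` and distance `delta` to the nearest denser point), not the paper's improved variant, which the abstract does not detail. The `cutoff` parameter and the "largest gap in `rho * delta`" rule for choosing the number of centers are assumptions made for illustration.

```python
import math


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def density_peak_centers(points, cutoff):
    """Pick initial centers as the points with high density and high separation."""
    n = len(points)
    if n < 2:
        return list(points)
    d = [[dist(p, q) for q in points] for p in points]
    # rho: number of other points closer than the cutoff distance.
    rho = [sum(1 for j in range(n) if j != i and d[i][j] < cutoff)
           for i in range(n)]
    # delta: distance to the nearest denser point (ties broken by index);
    # the densest point gets its largest distance instead.
    delta = []
    for i in range(n):
        denser = [d[i][j] for j in range(n)
                  if rho[j] > rho[i] or (rho[j] == rho[i] and j < i)]
        delta.append(min(denser) if denser else max(d[i]))
    gamma = [rho[i] * delta[i] for i in range(n)]
    order = sorted(range(n), key=lambda i: gamma[i], reverse=True)
    # ASSUMPTION: choose k at the largest drop in the sorted gamma values.
    gaps = [gamma[order[k - 1]] - gamma[order[k]] for k in range(1, n)]
    k = 1 + max(range(len(gaps)), key=lambda t: gaps[t])
    return [points[i] for i in order[:k]]


def kmeans(points, centers, iters=20):
    """Plain K-means (Lloyd's algorithm) started from the given centers."""
    centers = [list(c) for c in centers]
    for _ in range(iters):
        labels = [min(range(len(centers)), key=lambda c: dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members)
                                    for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Example: two small groups of 2-D points, seeded by density peaks.
data = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1)]
seeds = density_peak_centers(data, cutoff=0.12)
labels, centers = kmeans(data, seeds)
print(len(seeds), labels)
```

Because the seeds come from the data's density structure rather than random draws, repeated runs give the same clustering, which is the defect of K-means the abstract says the method addresses.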

Our proposed algorithm is validated using seven text datasets. 

Experimental results show that, by correcting these defects of K-means, the algorithm is suitable for clustering text data.

*** 

https://www.hindawi.com/journals/wcmc/2020/8881112/
