***
Abstract:
This study addresses the clustering of high-dimensional text data, motivated by three weaknesses of K-means: it struggles with high-dimensional data, it requires the number of clusters to be specified in advance, and it selects its initial centers at random.
We propose a Stacked-Random Projection dimensionality reduction framework and an enhanced K-means algorithm, DPC-K-means, based on an improved density peaks algorithm.
The improved density peaks algorithm determines the number of clusters and the initial clustering centers of K-means.
Our proposed algorithm is validated using seven text datasets.
Experimental results show that, by correcting these defects of K-means, the algorithm is well suited to clustering text data.
***
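To make the idea in the abstract concrete, here is a minimal sketch of seeding K-means from density peaks. It is not the paper's improved algorithm, whose details the abstract does not give. It assumes the classic density peaks scores (local density `rho` within a cutoff `d_c`, and `delta`, the distance to the nearest denser point), and it picks the number of clusters with a hypothetical rule: cut at the largest drop in the sorted `rho * delta` values. The function names, the `d_c` parameter, and the toy 2-D points are all illustrative assumptions. In practice the rows would be document vectors after dimensionality reduction.

```python
import math


def density_peak_seeds(points, d_c, k=None):
    """Return seed points for K-means: those with the largest rho * delta."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho: number of other points closer than the cutoff distance d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # Densest point: by convention, use its largest distance.
            delta[i] = max(dist[i])
        else:
            # Distance to the nearest point visited earlier (denser).
            delta[i] = min(dist[i][j] for j in order[:rank])
    gamma = [rho[i] * delta[i] for i in range(n)]
    ranked = sorted(range(n), key=lambda i: -gamma[i])
    if k is None:
        # Assumed heuristic (not from the paper): cut at the largest gap.
        gaps = [gamma[ranked[i]] - gamma[ranked[i + 1]] for i in range(n - 1)]
        k = gaps.index(max(gaps)) + 1
    return [points[i] for i in ranked[:k]]


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iterations starting from the given centers."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(x) / len(members) for x in zip(*members)]
    return labels, centers


# Toy data: two well-separated groups of five points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
seeds = density_peak_seeds(points, d_c=1.2)
labels, centers = kmeans(points, seeds)
```

Because the seeds come from the density scores rather than a random draw, repeated runs on the same data give the same clustering, and the number of clusters follows from the data instead of being supplied by hand.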