Quantcast
Viewing all articles
Browse latest Browse all 34

Hierarchical Density-Based Clustering of Categorical Data and a Simplification

Abstract

A challenge involved in applying density-based clustering to categorical datasets is that the ‘cube’ of attribute values has no ordering defined. We propose the HIERDENC algorithm for hierarchical density-based clustering of categorical data. HIERDENC offers a basis for designing simpler clustering algorithms that balance the tradeoff of accuracy and speed. The characteristics of HIERDENC include: (i) it builds a hierarchy representing the underlying cluster structure of the categorical dataset, (ii) it minimizes the user-specified input parameters, (iii) it is insensitive to the order of object input, (iv) it can handle outliers. We evaluate HIERDENC on small-dimensional standard categorical datasets, on which it produces more accurate results than other algorithms. We present a faster simplification of HIERDENC called the MULIC algorithm. MULIC performs better than subspace clustering algorithms in terms of finding the multi-layered structure of special datasets.


Viewing all articles
Browse latest Browse all 34

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>