Information Theoretic Clustering via Divergence Maximization among Clusters

Sahil Garg, Mina Dalirrooyfard, Anderson Schneider, Yeshaya Adler, Yuriy Nevmyvaka, Yu Chen, Fengpei Li, Guillermo Cecchi

Research output: Contribution to journalConference articlepeer-review


Information-theoretic clustering is one of the most promising and principled approaches to finding clusters with minimal apriori assumptions. The key criterion therein is to maximize the mutual information between the data points and their cluster labels. Such an approach, however, does not explicitly promote any type of inter-cluster behavior. We instead propose to maximize the Kullback-Leibler divergence between the underlying data distributions associated to clusters (referred to as cluster distributions). We show it to entail the mutual information criterion along with maximizing cross entropy between the cluster distributions. For practical efficiency, we propose to empirically estimate the objective of KL-D between clusters in its dual form leveraging deep neural nets as a dual function approximator. Remarkably, our theoretical analysis establishes that estimating the divergence measure in its dual form simplifies the problem of clustering to one of optimally finding k − 1 cut points for k clusters in the 1-D dual functional space. Overall, our approach enables linear-time clustering algorithms with theoretical guarantees of near-optimality, owing to the submodularity of the objective. We show the empirical superiority of our approach w.r.t. current state-of-the-art methods on the challenging task of clustering noisy timeseries as observed in domains such as neuroscience, healthcare, financial markets, spatio-temporal environmental dynamics, etc.

Original languageEnglish
Pages (from-to)624-634
Number of pages11
JournalProceedings of Machine Learning Research
StatePublished - 2023
Externally publishedYes
Event39th Conference on Uncertainty in Artificial Intelligence, UAI 2023 - Pittsburgh, United States
Duration: 31 Jul 20234 Aug 2023


Dive into the research topics of 'Information Theoretic Clustering via Divergence Maximization among Clusters'. Together they form a unique fingerprint.

Cite this