Abstract
Machine learning has seen tremendous growth in recent years thanks to two key advances in technology: massive data generation and highly-parallel accelerator architectures. The rate that data is being generated is exploding across multiple domains, including medical research, environmental science, web-search, and e-commerce. Many of these advances have benefited from emergent web-based applications, and improvements in data storage and sensing technologies. Innovations in parallel accelerator hardware, such as GPUs, has made it possible to process massive amounts of data in a timely fashion. Given these advanced data acquisition technology and hardware, machine learning researchers are equipped to generate and sift through much larger and complex datasets quickly. In this work, we focus on accelerating Kernel Dimension Alternative Clustering algorithms using GPUs. We conduct a thorough performance analysis by using both synthetic and real-world datasets, while also modifying both the structure of the data, and the size of the datasets. Our GPU implementation reduces execution time from minutes to seconds, which enables us to develop a web-based application for users to, interactively, view alternative clustering solutions.