Abstract
Frequent subgraph mining (FSM) has important applications in areas such as bioinformatics, social networks and others. In this paper, we present a highly scalable approach called ParGraph that can efficiently mine from a single graph in both distributed as well as shared-memory based systems. In a distributed environment, we can leverage the local memory of multiple compute nodes for storing a large number of intermediate states for enumerating patterns. To address the skewness in the pattern generation tree, our approach uses a novel hybrid load balancing scheme to efficiently distribute workload across both processes and threads. Our experiments demonstrate good speedups using message passing interface (MPI) and OpenMP threads.