Frequent Subgraph Mining

Application Description

This benchmark finds frequent subgraphs in an undirected graph. It takes two arguments: k, the maximum subgraph size in edges, and minsup, the minimum support a pattern must reach to count as frequent.

Algorithm

The algorithm is called extend-reduce. An initial worklist contains the initial embeddings (single edges). Each round extends every embedding by one edge and inserts the new embeddings into the next worklist. Extension repeats until the embeddings reach size k (edges). Each embedding is checked to identify its pattern, and the domain support (MNI, minimum image-based support) is accumulated into the support map. At the end of each extension round, patterns whose support is below the threshold are removed, along with all of their embeddings.
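The extend-reduce loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the Galois/Pangolin implementation: the function names (canonical, extend, reduce_step, mine), the vertex labels dictionary, and the brute-force canonical form are all assumptions made for the example, and the sketch also filters single-edge patterns before the first extension, which the description does not state explicitly.

```python
from collections import defaultdict
from itertools import permutations


def canonical(emb, labels):
    """Return (pattern_key, orders) for an embedding (a tuple of edges).

    pattern_key is the smallest (vertex labels, edge list) encoding over all
    vertex orderings; orders lists every ordering that produces it, so that
    automorphic positions all receive the embedding's vertices.
    Brute force: only practical for very small patterns (hypothetical helper).
    """
    verts = sorted({v for e in emb for v in e})
    best, orders = None, []
    for perm in permutations(verts):
        pos = {v: i for i, v in enumerate(perm)}
        key = (tuple(labels[v] for v in perm),
               tuple(sorted(tuple(sorted((pos[u], pos[v]))) for u, v in emb)))
        if best is None or key < best:
            best, orders = key, [perm]
        elif key == best:
            orders.append(perm)
    return best, orders


def extend(emb, adj):
    """Yield every embedding obtained by adding one edge that touches emb."""
    have = set(emb)
    for u in {v for e in emb for v in e}:
        for w in adj[u]:
            edge = (min(u, w), max(u, w))
            if edge not in have:
                yield tuple(sorted(have | {edge}))


def reduce_step(worklist, labels, minsup):
    """Accumulate MNI support per pattern, then drop infrequent patterns."""
    domains = defaultdict(lambda: defaultdict(set))  # pattern -> position -> vertices
    pattern_of = {}
    for emb in worklist:
        key, orders = canonical(emb, labels)
        pattern_of[emb] = key
        for perm in orders:
            for i, v in enumerate(perm):
                domains[key][i].add(v)
    # MNI support: size of the smallest per-position vertex domain.
    support = {p: min(len(s) for s in d.values()) for p, d in domains.items()}
    frequent = {p: s for p, s in support.items() if s >= minsup}
    kept = [emb for emb in worklist if pattern_of[emb] in frequent]
    return kept, frequent


def mine(edges, labels, k, minsup):
    """Return {pattern_key: MNI support} for frequent patterns of up to k edges."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Initial worklist: one embedding per distinct edge.
    worklist = [(e,) for e in sorted({(min(u, v), max(u, v)) for u, v in edges})]
    result, size = {}, 1
    while worklist:
        worklist, frequent = reduce_step(worklist, labels, minsup)
        result.update(frequent)
        if size == k:
            break
        # The set removes duplicates reached through different extension orders.
        worklist = sorted({new for emb in worklist for new in extend(emb, adj)})
        size += 1
    return result


if __name__ == "__main__":
    labels = {0: "A", 1: "B", 2: "A", 3: "B", 4: "A"}
    edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
    for pattern, sup in mine(edges, labels, k=2, minsup=2).items():
        print(pattern, sup)
```

On the five-vertex path labeled A-B-A-B-A used in the example, with k=2 and minsup=2, the sketch keeps the single-edge pattern A-B and the two-edge pattern A-B-A, each with MNI support 2, and drops B-A-B, whose support is 1. Collecting every minimal vertex ordering in canonical is a deliberate choice: it lets symmetric pattern positions share their vertex domains, which is what MNI requires when a pattern has automorphisms.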

Performance

k=2, minsup=300

Graph             Time (s)
mico              4.2
patent_citations  15.9

Please read the following paper for detailed performance evaluation:

(VLDB 2020)
Xuhao Chen, Roshan Dathathri, Gurbinder Gill, Keshav Pingali,
Pangolin: An Efficient and Flexible Graph Mining System on CPU and GPU,
PVLDB, 13(8): 1190-1205, 2020

Machine Description

Performance numbers were collected on a 4-package (14 cores per package) Intel(R) Xeon(R) Gold 5120 CPU machine at 2.20GHz, using from 1 to 56 threads in increments of 7 threads. The machine has 192GB of RAM. The operating system is CentOS Linux release 7.5.1804. All Galois benchmarks were compiled with gcc/g++ 7.2.