Parallel Algorithms for Scalable Graph Mining: Applications on Big Data and Machine Learning

Safrin Sattar, University of New Orleans
Webinar
AI for Science

Complex network analysis is an active area of research with applications in many scientific domains, e.g., sociology, biology, online media, and recommendation systems. Machine learning and deep learning play a significant role in working with big data in the modern era. We discuss a well-known graph problem, community detection (CD). We present parallel algorithms for the Louvain method on static networks, showing around 12-fold speedups. The implementations use both shared-memory and distributed-memory parallelism. For CD in dynamic graphs, we use permanence, a vertex-based metric.
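
As background, the Louvain method greedily moves vertices between communities to increase modularity, a score that compares the fraction of edges inside communities with the fraction expected if edges were placed at random with the same degrees. The following is a minimal serial sketch of that score for an undirected, unweighted graph. It is illustrative only and is not the parallel implementation presented in the talk; the function name `modularity` and its input format are assumptions made for this example.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of an undirected, unweighted graph.

    Assumptions for this sketch: `edges` lists each undirected edge
    once as a (u, v) pair, there are no self-loops, there is at least
    one edge, and `community` maps every vertex to a community label.
    """
    m = 0                              # number of edges
    degree = defaultdict(int)          # vertex -> degree
    internal = defaultdict(int)        # community -> edges fully inside it

    for u, v in edges:
        m += 1
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1

    # Sum of vertex degrees per community.
    total_degree = defaultdict(int)
    for vertex, d in degree.items():
        total_degree[community[vertex]] += d

    # Q = sum over communities of (internal edge fraction - expected fraction).
    return sum(
        internal[c] / m - (total_degree[c] / (2 * m)) ** 2
        for c in total_degree
    )
```

A Louvain-style pass would evaluate the change in this score for each candidate vertex move and keep the moves that increase it; parallel variants differ mainly in how those moves are evaluated and applied concurrently.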

We also develop a scalable method for CD based on a Graph Convolutional Network (GCN) via semi-supervised node classification, using PyTorch with CUDA in a GPU environment (4x performance gain). The model achieves up to 86.9% accuracy and a 0.85 F1 score on real-world datasets from diverse domains. To extend our work on deep learning, we provide a scalable solution to the Sparse Deep Neural Network (DNN) Challenge by designing a data-parallel sparse DNN using TensorFlow on GPU (4.7x speedup). We also include two applications: webspam detection from webgraphs (billions of edges), and sentiment analysis on the social network Twitter (1.2 million tweets) to reveal insights about COVID-19 vaccination awareness among the public. Together, these applications illustrate the importance of graph mining in our daily activities.