We will cover K-Means and Gpairs as examples to demonstrate the implementation of these algorithms with live sample code on the Intel DevCloud and/or JLSE.
K-Means is a clustering algorithm that partitions the observations in a dataset into a requested number of clusters, assigning each point to the cluster whose center of mass (centroid) is closest. Starting from an initial estimate of the centroids, the algorithm iteratively updates their positions until they reach a fixed point. Intel Extension for Scikit-learn* provides an optimized K-Means clustering algorithm.
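As a rough illustration of the iteration described above, here is a minimal plain-Python sketch of the classic assign-then-update loop (Lloyd's algorithm). It is not the optimized implementation from Intel Extension for Scikit-learn*; the function name `kmeans`, the `max_iter` parameter, and the sample points are assumptions made for this example.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Hypothetical reference K-Means: iterate until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda k: math.dist(p, centroids[k]))
            clusters[nearest].append(p)

        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, members in enumerate(clusters):
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[k])

        # Fixed point reached: no centroid changed in this iteration.
        if new_centroids == centroids:
            break
        centroids = new_centroids

    # Final labels: index of the nearest centroid for every point.
    labels = [min(range(len(centroids)),
                  key=lambda k: math.dist(p, centroids[k]))
              for p in points]
    return centroids, labels


# Illustrative data: two well-separated groups in 2D.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
centroids, labels = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

The exact equality test for convergence works here because the toy data settles exactly; real implementations usually stop when the centroid shift falls below a tolerance.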
The Gpairs distance application takes a set of multidimensional points and computes the Euclidean distance between every pair of points. The algorithm naively counts Npairs(<r), the total number of pairs separated by a distance less than r, for each r**2 in the provided input.
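The counting rule can be sketched in a few lines of plain Python. This is a naive reference under stated assumptions, not the talk's own code: it treats pairs as unordered (each pair counted once), compares squared distances directly against the supplied r**2 values, and uses the hypothetical names `gpairs` and `rbins_squared`.

```python
from itertools import combinations


def gpairs(points, rbins_squared):
    """Hypothetical naive Gpairs: count pairs with squared distance < r**2."""
    # Squared Euclidean distance for every unordered pair of points.
    sq_dists = [
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in combinations(points, 2)
    ]
    # Npairs(<r) for each threshold, given as r**2 to avoid square roots.
    return [sum(1 for d in sq_dists if d < r2) for r2 in rbins_squared]


# Illustrative data: three points in 3D with squared distances 25, 1, and 26.
points = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 1.0)]
rbins_squared = [2.0, 26.0, 30.0]
pair_counts = gpairs(points, rbins_squared)
```

Working with r**2 rather than r is why the input is given as squared radii: the comparison needs no square root per pair.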
The talk covers how to implement the above algorithms using Numba's JIT compilation (the @jit decorator) and using the @kernel decorator.
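To give a flavor of the JIT approach, the sketch below compiles an explicit-loop Gpairs pair counter with Numba's `@njit` (the nopython form of `@jit`). The function name `count_pairs`, the array layout, and the sample data are assumptions for illustration, and the fallback decorator exists only so the snippet still runs where Numba is not installed. A `@kernel` version is not shown because it depends on the device toolchain available on the target system.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: without Numba, run the same function as ordinary Python.
    def njit(func):
        return func


@njit
def count_pairs(points, rbins_squared):
    """Hypothetical JIT-compiled Gpairs loop over unordered pairs of points."""
    n_points, n_dims = points.shape
    n_bins = rbins_squared.shape[0]
    counts = np.zeros(n_bins, dtype=np.int64)
    for i in range(n_points):
        for j in range(i + 1, n_points):
            # Squared Euclidean distance between points i and j.
            dist_sq = 0.0
            for d in range(n_dims):
                diff = points[i, d] - points[j, d]
                dist_sq += diff * diff
            # Increment every bin whose r**2 exceeds this squared distance.
            for k in range(n_bins):
                if dist_sq < rbins_squared[k]:
                    counts[k] += 1
    return counts


# Illustrative data: squared distances between the three points are 1, 4, and 5.
points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
rbins_squared = np.array([1.5, 4.5, 6.0])
pair_counts = count_pairs(points, rbins_squared)
```

Explicit nested loops are slow in interpreted Python but are the form a JIT compiler handles well, which is why the function is written this way rather than with array broadcasting.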