Data Analysis Methods and Applications: Hyperspectral Band Selection and Data Classification on Embedded Grassmannians

Sofya Chepushtanova
Seminar

Abstract:
This talk considers two topics united by the areas of geometric data analysis and classification. I first consider the hyperspectral band selection problem, solved using sparse support vector machines (SSVMs). Band selection is a frequently used dimensionality reduction technique for hyperspectral imagery: it identifies the bands (features) that contain the most discriminatory information and uses them for further analysis. We propose a supervised embedded approach that exploits a property of SSVMs: the trained model exhibits a clearly identifiable gap between zero and non-zero band weights, which permits important bands to be selected definitively as part of solving the classification problem. The second project is an approach to set-to-set pattern recognition via classification of data on embedded Grassmannians. A set of points from a given class characterizes the variability of the class information, so we propose organizing sets of data as points on a Grassmann manifold. This is done by sampling data from each class and encoding each sample set as a subspace. We use a natural choice of metric for computing distances between these points, namely the geodesic (or arc-length) distance defined on Grassmannians. We use it to construct a distance matrix and then perform classical multidimensional scaling (cMDS) to find a Euclidean embedding of the points on the manifold. In the new space we apply SSVMs for classification and for identifying the optimal dimensions of the embedded subspaces. In conclusion, I will talk about other directions of interest in the areas of machine learning and topological data analysis.
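
The Grassmannian part of the pipeline described in the abstract (encode sample sets as subspaces, compute geodesic distances, embed with cMDS) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the talk: the function names, the synthetic random data, the subspace dimension k = 3, and the use of a truncated SVD to obtain an orthonormal basis for each sample set. The geodesic distance is computed as the 2-norm of the vector of principal angles, and the final SSVM classification step is omitted.

```python
import numpy as np


def orthonormal_basis(samples, k):
    """Orthonormal basis (n x k) for the dominant k-dim subspace of the columns of `samples`.

    Assumption: a truncated SVD is used to encode a set of samples as a subspace.
    """
    u, _, _ = np.linalg.svd(samples, full_matrices=False)
    return u[:, :k]


def geodesic_distance(a, b):
    """Arc-length distance between the subspaces spanned by orthonormal bases a and b."""
    # Singular values of a^T b are the cosines of the principal angles.
    cosines = np.linalg.svd(a.T @ b, compute_uv=False)
    angles = np.arccos(np.clip(cosines, -1.0, 1.0))
    return np.linalg.norm(angles)


def classical_mds(dist, dim):
    """Classical MDS: embed points with pairwise distances `dist` into R^dim."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centered matrix of squared distances.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    evals, evecs = np.linalg.eigh(gram)
    order = np.argsort(evals)[::-1][:dim]
    # Negative eigenvalues (non-Euclidean part) are clipped to zero.
    scales = np.sqrt(np.clip(evals[order], 0.0, None))
    return evecs[:, order] * scales


# Hypothetical data: 6 sets, each with 10 samples in R^20, encoded as 3-dim subspaces.
rng = np.random.default_rng(0)
subspaces = [orthonormal_basis(rng.standard_normal((20, 10)), 3) for _ in range(6)]

m = len(subspaces)
dist = np.zeros((m, m))
for i in range(m):
    for j in range(i + 1, m):
        dist[i, j] = dist[j, i] = geodesic_distance(subspaces[i], subspaces[j])

# Each row is the Euclidean coordinate vector of one subspace.
embedding = classical_mds(dist, dim=2)
```

In a full workflow, the rows of the embedding would serve as feature vectors for a classifier such as an SSVM, with the embedding dimension treated as a quantity to be selected.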