Title: Fast algorithms for dimensionality reduction and data visualization using t-distributed Stochastic Neighborhood Embedding (t-SNE)
In this talk, I will present a class of algorithms for the rapid construction of low dimensional embeddings; the algorithms use t-SNE and are based on a combination of polynomial interpolation and Fast Fourier Transforms (FFTs). Over the last decade, t-SNE has become an increasingly popular technique for the visualization and analysis of single cell RNA sequencing (scRNA-seq) data. Unfortunately, existing methods for t-SNE are extremely computationally expensive whenever the datasets contain hundreds of thousands to millions of points, as often encountered in scRNA-seq datasets. The most time-consuming step of t-SNE is the repeated evaluation of N-body interactions with smooth kernels. We exploit the smoothness of these kernels to construct efficient low-rank representations using polynomial interpolation which can be rapidly applied using FFTs. The resulting algorithm scales linearly with the number of points and is roughly 15 times faster than existing state-of-the-art methods; thereby reducing the computation time to less than an hour on a laptop for most datasets encountered in practice. We illustrate the performance of the method with several numerical examples.