News
- (12/25) I will be attending NeurIPS in San Diego with posters (on variational inference, cosmological data analysis, and neuronal graph-matching) that neatly convey the mix of methodological and applied work in CCM. For those interested in this type of work, we are hiring at all levels.
- (11/25) I will be speaking at the Workshop on the Physics of John Hopfield: Learning and Intelligence, to be held at Princeton.
- (10/25) A new paper of ours, on variational inference for uncertainty quantification, has been accepted to JMLR.
- (09/25) New job ads in CCM have been posted for postdoctoral fellows and research scientists at all levels of seniority. There is also an opening for a joint position as an associate research scientist in CCM and a tenure-track faculty member in the Computer Science Department at Cooper Union.
- (09/25) Two new papers of ours, on variational inference with Feynman parameterizations and benchmarks for cosmological data analysis, have been accepted to NeurIPS.
- (09/25) I will be speaking at the Workshop on Low-Rank Models and Applications, to be held at the University of Mons.
- (07/25) I will be speaking at the Workshop on Geometric and Combinatorial Methods in the Foundations of CS and AI, to be held at Oxford.
- (05/25) Charles Margossian and I have received the best paper award at this year's AISTATS conference in Phuket for our work on variational inference in location-scale families.
- (03/25) Daniel Lee and I have won the Flywire VNC Matching Challenge.
- (03/25) Congratulations to my colleague and co-author Charles Margossian, who will start next fall as an assistant professor in the Department of Statistics at the University of British Columbia.
Recent projects
High-dimensional data analysis
Sparse matrices are not generally low rank, and low-rank matrices are not generally sparse. But can one find more subtle connections between these different properties of matrices by looking beyond the canonical decompositions of linear algebra? This paper in SIMODS describes a nonlinear matrix decomposition that can be used to express a sparse nonnegative matrix in terms of a real-valued matrix of significantly lower rank. Arguably the most popular matrix decompositions in machine learning are those, such as principal component analysis or nonnegative matrix factorization, that have a simple geometric interpretation. This paper in TMLR gives such an interpretation for these nonlinear decompositions, one that arises naturally in the problem of manifold learning.
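To make the connection concrete, here is a minimal numerical sketch of the phenomenon, not the algorithm from either paper: applying an elementwise ReLU to a low-rank matrix produces a matrix that is sparse and nonnegative yet far from low rank. The dimensions, rank, and shift are arbitrary illustrative choices.

```python
import numpy as np

# Minimal illustration (not the estimation algorithm from the SIMODS paper):
# a sparse nonnegative matrix X obtained by an elementwise ReLU of a
# low-rank real-valued matrix Theta. All sizes and constants are arbitrary.
rng = np.random.default_rng(0)
m, n, r = 200, 200, 3            # matrix dimensions and target rank
U = rng.standard_normal((m, r))
V = rng.standard_normal((n, r))
c = 1.0                          # shift controlling how many zeros appear
Theta = U @ V.T - c              # low rank (rank <= r + 1 after the shift)
X = np.maximum(Theta, 0.0)       # sparse and nonnegative, but not low rank

print("rank of Theta:", np.linalg.matrix_rank(Theta))
print("rank of X:    ", np.linalg.matrix_rank(X))
print("fraction of zeros in X: %.2f" % np.mean(X == 0))
```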
Graph matching
The problem of graph matching is to find a permutation that maps the nodes of one graph onto the nodes of another. Such a mapping can be discovered by a combinatorial optimization that searches over the space of permutation matrices, by a continuous relaxation that searches over the space of doubly stochastic matrices, or by an interleaving of these two approaches. In March 2025, it was announced that Daniel Lee and I won the Flywire VNC Matching Challenge hosted by the Princeton Neuroscience Institute; the name of our team was Old School. The challenge asked participants to construct a method for aligning the connectomes of a male and female fruit fly. The connectomes were represented as large graphs, and by matching these graphs, we determined a correspondence between neurons in the male and female ventral nerve cords.
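A generic off-the-shelf baseline, shown here only to illustrate the relaxation approach described above: SciPy's quadratic assignment solver with the "faq" method searches over doubly stochastic matrices and then projects back to a permutation. The sketch below matches two small graphs that differ only by a hidden relabeling of their nodes; the graph size and density are arbitrary, and the solver is a heuristic, so exact recovery is not guaranteed.

```python
import numpy as np
from scipy.optimize import quadratic_assignment

# Generic graph-matching baseline: align two graphs whose adjacency
# matrices differ by a hidden relabeling of the nodes.
rng = np.random.default_rng(0)
n = 30
A = (rng.random((n, n)) < 0.3).astype(float)
A = np.triu(A, 1)
A = A + A.T                                # symmetric 0/1 adjacency matrix

perm = rng.permutation(n)                  # hidden relabeling of the nodes
B = A[np.ix_(perm, perm)]                  # second graph = relabeled first

# The "faq" method relaxes the search over permutation matrices to the
# space of doubly stochastic matrices, then rounds back to a permutation.
res = quadratic_assignment(A, B, method="faq", options={"maximize": True})

B_aligned = B[np.ix_(res.col_ind, res.col_ind)]
print("fraction of edges matched: %.2f" % ((A * B_aligned).sum() / A.sum()))
print("adjacency recovered exactly:", np.array_equal(A, B_aligned))
```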
Variational inference
Given an intractable distribution p, the problem of variational inference (VI) is to find the best approximation q from some more tractable family. Typically, q is found by minimizing the (reverse) Kullback-Leibler divergence, but in recent papers at ICML and NeurIPS, we have shown how to approximate p by minimizing certain score-based divergences. The first of these papers derives the Batch and Match algorithm for VI with multivariate Gaussian approximations. The second describes an eigenvalue problem (EigenVI) for approximations based on orthogonal function expansions. Finally, the third uses a Feynman identity from quantum field theory to approximate the target density by a product of experts. In related work, this paper in JMLR analyzes the inherent trade-offs in VI with a factorized approximation, and this paper at AISTATS provides some positive guarantees for VI with location-scale families.
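For context, here is a minimal sketch of the "typical" recipe mentioned in the first sentence above: reverse-KL minimization with a reparameterized Gaussian approximation. It is not an implementation of the score-based methods in these papers; the one-dimensional target, step size, and sample sizes are arbitrary illustrative choices.

```python
import numpy as np

# Fit a Gaussian q = N(m, exp(2s)) to an unnormalized 1-d target p by
# stochastic gradient ascent on the ELBO, using the reparameterization
# z = m + exp(s) * eps. Illustrative only; not the score-based approach
# developed in the papers above.
rng = np.random.default_rng(0)

def score_p(z):
    # d/dz log p(z) for the unnormalized target log p(z) = -(z - 3)**4 / 4
    return -(z - 3.0) ** 3

m, s = 0.0, 0.0                   # variational parameters: mean and log-std
lr, n_samples = 0.01, 64
for step in range(2000):
    eps = rng.standard_normal(n_samples)
    z = m + np.exp(s) * eps       # reparameterized samples from q
    g = score_p(z)
    grad_m = g.mean()                            # d(ELBO)/dm
    grad_s = (g * np.exp(s) * eps).mean() + 1.0  # d(ELBO)/ds, incl. entropy
    m += lr * grad_m
    s += lr * grad_s

print("fitted mean %.3f, fitted std %.3f" % (m, np.exp(s)))
```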
Learning with symmetries: weight-balancing flows
Gradient descent is based on discretizing a continuous-time flow, typically one that descends in a regularized loss function. But what if, for all but the simplest types of regularizers, we have been discretizing the wrong flow? This paper in TMLR makes two contributions to our understanding of deep learning in feedforward networks with homogeneous activation functions (e.g., ReLU) and rescaling symmetries. The first is to describe a simple procedure for balancing the weights in these networks without changing the end-to-end functions that they compute. The second is to derive a continuous-time dynamics that preserves this balance while descending in the network's loss function. These dynamics reduce to an ordinary gradient flow for l2-norm regularization, but not otherwise. Put another way, this analysis suggests a canonical pairing of alternative flows and regularizers.
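The rescaling symmetry at the heart of this analysis is easy to verify numerically. The sketch below rebalances a two-layer ReLU network by scaling each hidden unit's incoming and outgoing weights in opposite directions, leaving the network's outputs unchanged; the particular choice of scaling factors (equalizing per-unit norms) is an illustrative balancing rule, not necessarily the exact procedure from the paper.

```python
import numpy as np

# Rescaling symmetry of ReLU networks: for each hidden unit, multiply its
# incoming weights (and bias) by alpha > 0 and its outgoing weights by
# 1/alpha; since relu(alpha * x) = alpha * relu(x), the outputs are
# unchanged. Here alpha is chosen to equalize per-unit incoming and
# outgoing norms -- an illustrative balancing rule.
rng = np.random.default_rng(0)
d_in, d_hid, d_out = 5, 8, 3
W1, b1 = rng.standard_normal((d_hid, d_in)), rng.standard_normal(d_hid)
W2, b2 = rng.standard_normal((d_out, d_hid)), rng.standard_normal(d_out)

def forward(x, W1, b1, W2, b2):
    return np.maximum(W1 @ x + b1, 0.0) @ W2.T + b2

x = rng.standard_normal(d_in)
y_before = forward(x, W1, b1, W2, b2)

# Per-unit norms: incoming (row of W1 plus bias) and outgoing (column of W2).
in_norm = np.sqrt((W1 ** 2).sum(axis=1) + b1 ** 2)
out_norm = np.linalg.norm(W2, axis=0)
alpha = np.sqrt(out_norm / in_norm)        # one rescaling factor per unit

W1_bal, b1_bal = alpha[:, None] * W1, alpha * b1
W2_bal = W2 / alpha[None, :]

y_after = forward(x, W1_bal, b1_bal, W2_bal, b2)
print("outputs unchanged:", np.allclose(y_before, y_after))
print("per-unit norms balanced:",
      np.allclose(np.sqrt((W1_bal ** 2).sum(axis=1) + b1_bal ** 2),
                  np.linalg.norm(W2_bal, axis=0)))
```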
Recent papers
- D. Cai, R. M. Gower, D. M. Blei, and L. K. Saul (2025). Fisher meets Feynman: score-based variational inference with a product of experts. To appear in Advances in Neural Information Processing Systems (NeurIPS) 38. (Spotlight presentation)
- N. Huang, R. Stiskalek, J.-Y. Lee, A. E. Bayer, C. C. Margossian, C. K. Jespersen, L. A. Perez, L. K. Saul, and F. Villaescusa-Navarro (2025). CosmoBench: a multiscale, multiview, multitask cosmology benchmark for geometric deep learning. To appear in Advances in Neural Information Processing Systems (NeurIPS) 38: Datasets and Benchmarks Track.
- C. C. Margossian, L. Pillaud-Vivien, and L. K. Saul (2025). Variational inference for uncertainty quantification: an analysis of trade-offs. To appear in the Journal of Machine Learning Research (JMLR).
- C. C. Margossian and L. K. Saul (2025). Variational inference in location-scale families: exact recovery of the mean and correlation matrix. Proceedings of the 28th International Conference on Artificial Intelligence and Statistics (AISTATS). PMLR 258:3466-3474. (Best paper award)
- C. Modi, D. Cai, and L. K. Saul (2025). Batch, match, and patch: low-rank approximations for score-based variational inference. Proceedings of the 28th International Conference on Artificial Intelligence and Statistics (AISTATS). PMLR 258:4510-4518.
- D. Cai, C. Modi, C. C. Margossian, R. M. Gower, D. M. Blei, and L. K. Saul (2024). EigenVI: score-based variational inference with orthogonal function expansions. In Advances in Neural Information Processing Systems (NeurIPS) 37, pages 132691-132721. (Spotlight presentation)
- D. Cai, C. Modi, L. Pillaud-Vivien, C. C. Margossian, R. M. Gower, D. M. Blei, and L. K. Saul (2024). Batch and match: black-box variational inference with a score-based divergence. In Proceedings of the 41st International Conference on Machine Learning (ICML). PMLR 235:5258-5297. (Spotlight presentation)
- C. Modi, C. C. Margossian, Y. Yao, R. M. Gower, D. M. Blei, and L. K. Saul (2023). Variational inference with Gaussian score matching. In Advances in Neural Information Processing Systems (NeurIPS) 36, pages 29935-29950.
- L. K. Saul (2023). Weight-balancing fixes and flows for deep learning. Transactions on Machine Learning Research (TMLR) (09/2023).
- C. C. Margossian and L. K. Saul (2023). The shrinkage-delinkage tradeoff: an analysis of factorized Gaussian approximations for variational inference. In Proceedings of the 39th Conference on Uncertainty in Artificial Intelligence (UAI). PMLR 216:1358-1367. (Oral presentation)
- L. K. Saul (2022). A geometrical connection between sparse and low-rank matrices and its application to manifold learning. Transactions on Machine Learning Research (TMLR) (12/2022).
- L. K. Saul (2022). A nonlinear matrix decomposition for mining the zeros of sparse data. SIAM Journal on Mathematics of Data Science (SIMODS) 4(2):431-463.