Hardware Accelerator Integration Tradeoffs for High-Performance Computing: A Case Study of GEMM Acceleration in N-Body Methods, Mochamad Asri, Dhairya Malhotra, Jiajun Wang, George Biros, Lizy K. John, Andreas Gerstlauer. IEEE Transactions on Parallel and Distributed Systems, 2021
A scalable computational platform for particulate Stokes suspensions, Wen Yan, Eduardo Corona, Dhairya Malhotra, Shravan Veerapaneni, Michael Shelley. Journal of Computational Physics, 2020
Efficient High-Order Singular Quadrature Schemes in Magnetic Fusion, Dhairya Malhotra, Antoine Cerfon, Michael O’Neil, Evan Toler. Plasma Physics and Controlled Fusion, 2019
Taylor States in Stellarators: A Fast High-Order Boundary Integral Solver, Dhairya Malhotra, Antoine Cerfon, Lise-Marie Imbert-Gérard, Michael O’Neil. Journal of Computational Physics, 2019
Algorithm 967: A Distributed-Memory Fast Multipole Method for Volume Potentials, Dhairya Malhotra, George Biros. ACM Transactions on Mathematical Software, 2016
FFT, FMM, or Multigrid? A comparative Study of State-Of-the-Art Poisson Solvers for Uniform and Nonuniform Grids in the Unit Cube, Amir Gholami, Dhairya Malhotra, Hari Sundar, George Biros. SIAM Journal on Scientific Computing, 2016
PVFMM: A Parallel Kernel Independent FMM for Particle and Volume Potentials, Dhairya Malhotra, George Biros. Communications in Computational Physics, 2015
A Parallel Arbitrary-Order Accurate AMR Algorithm for the Scalar Advection-Diffusion Equation, Arash Bakhtiari, Dhairya Malhotra, Amir Raoofy, Miriam Mehl, Hans-Joachim Bungartz, George Biros. Proc. ACM/IEEE Supercomputing, Salt Lake City, UT. 2016
Performance Analysis of HPC Applications with Irregular Tree Data Structures, Ahmed Khawaja, Jiajun Wang, Dhairya Malhotra, Andreas Gerstlauer, George Biros, Lizy John. Proc. IEEE International Conference on Parallel and Distributed Systems, Hsinchu, Taiwan, 2014
A volume integral equation Stokes solver for problems with variable coefficients, Dhairya Malhotra, Amir Gholami, George Biros. Proc. ACM/IEEE Supercomputing, New Orleans, LA. 2014. Finalist, Best Student Paper
Algorithms for High-Throughput Disk-to-Disk Sorting, Hari Sundar, Dhairya Malhotra, Karl W. Schulz. Proc. ACM/IEEE Supercomputing, Denver, CO. 2013
HykSort: a new variant of hypercube quicksort on distributed memory architectures, Hari Sundar, Dhairya Malhotra, George Biros. Proc. 27th International Conference on Supercomputing, Eugene, OR. 2013
Petascale Direct Numerical Simulation of Blood Flow on 200K Cores and Heterogeneous Architectures, Abtin Rahimian, Ilya Lashuk, Shravan Veerapaneni, Aparna Chandramowlishwaran, Dhairya Malhotra, Logan Moon, Rahul Sampath, Aashay Shringarpure, Jeffrey Vetter, Richard Vuduc, Denis Zorin, George Biros. Proc. ACM/IEEE Supercomputing, New Orleans, LA. 2010. Winner, Gordon Bell Prize
A parallel algorithm for long-timescale simulation of concentrated vesicle suspensions in three dimensions, Dhairya Malhotra, Abtin Rahimian, Denis Zorin, George Biros.
AccFFT: A library for distributed-memory FFT on CPU and GPU architectures, Amir Gholami, Judith Hill, Dhairya Malhotra, George Biros. arXiv:1506.07933, 2015