Research Intern - May 2022 - Aug 2022
- Optimized GPGPU kernel scheduling to accelerate graph-based applications.
- Developed and implemented a CPD-ALS (canonical polyadic decomposition via alternating least squares) framework for tensor decomposition, improving computational throughput.
- Conducted in-depth analysis to identify performance bottlenecks in GPU execution for graph applications.
- Discovered optimization strategies for matrix multiplications involving tall and wide matrices, significantly boosting overall performance.
Junior Research Fellow - Jan 2019 - Aug 2021
- Conducted design space exploration for NB-LDPC codes on FPGAs.
- Developed accelerators for sparse matrix multiplication.
- Appointed as a Visiting Research Fellow at the Instituto de Telecomunicações, University of Coimbra, from March 2021 to June 30, 2021.