Mandar Deshpande

Model Optimization Engineer at Roblox. Previously at Meta, specializing in GPU kernel optimization, Triton compiler integrations with PyTorch, and large-scale deep learning infrastructure.

Current Work

Model Optimization Engineer @ Roblox (2025 – Present)

Past Work & Research

Systems ML Engineer @ Meta (2022 – 2025)

  • Contributed to Triton GPU compiler and PyTorch integrations, optimizing GPU kernels (FlashAttention v2, FlexAttention) for H100 and B200 GPUs
  • Developed pure PyTorch preprocessing transforms for internal customers (Ads, IG, MRS, Feed) and handled model serialization, model splitting, and export via TorchScript and FX tracing
  • Authored and optimized PyTorch C++ operators and CUDA kernels for accelerated GPU computing

Software Engineer @ AWS Inferentia/Trainium (2021)

Built distributed training infrastructure for AWS Inferentia and Trainium. Developed Neuron SDK graph capture and optimizations for FX graphs and TorchScript in PyTorch, and for TVM on TensorFlow.

Research Engineer @ AWS Rekognition (2020)

Designed and deployed PyTorch anomaly detection models for manufacturing defects in Amazon Lookout for Vision. Built a fault visualization tool to validate model performance and feed back corrections.

Machine Learning Engineer @ Citi (2017–2019)

Delivered NLP and computer vision solutions using deep learning and traditional vision techniques. Applied NLP to parse financial documents and extract relevant information.

Open Source

Google Summer of Code – Mentor (2018–2019)

  • TensorFlow (2019): Mentored Ryan Lee in developing a curiosity module for TF-Agents with Oscar Ramirez; the module implements Random Network Distillation (RND)
  • Scilab (2018): Mentored Soumitra Agrawal on a machine learning toolbox for Scilab
  • Gensim (2018): Co-mentored Aneesh Joshi on neural network similarity learning for Gensim under NumFOCUS

Google Summer of Code – Student (2017)

Developed a modular ML toolbox for Scilab that runs Python model training remotely over the network and performs inference locally. Project page


Persistence alone fuels my learning engines, and the dream is to always be on the journey towards excellence!

Contact