The Art of Translation
What if your model could just look where it needed to? Exploring attention mechanisms in sequence-to-sequence learning and how they revolutionize machine translation.
Computer Science and Mathematics at Georgia Tech. Focused on Machine Learning and Mathematical Modeling.
Most recently, I led the software stack at Swish Robotics. Before that, I worked on fraud detection models and infrastructure at Credit Karma and on point-of-sale (PoS) systems at NCR. Former competitive swimmer, Indian National Math Olympiad finalist, and Top 300 finisher in the Putnam Math Contest.
Benchmarking Flash Attention v1 and v2 in Triton against a naive PyTorch implementation of Scaled Dot-Product Attention and Multi-Head Attention.
The Exploration vs. Exploitation Dilemma and a brief introduction to Reinforcement Learning.
Worked with Prof. Clio Andris on geographic visualizations and spatial information theory.
A Design Space of Node Placement Methods for Geospatial Network Visualizations.
Implemented LoRA adapters on a Mistral 7B model and distilled it into a smaller student model.
Trained a sequence-to-sequence model with and without the attention mechanism to translate natural language to Python snippets.