Greater Seattle Area
My research career started with a question I have been chasing for almost a decade: what can machines genuinely understand from video, images, and language, and how do we make that understanding work in high-stakes, real-world situations? Early work on action recognition, intent detection from videos, and Bengali image captioning (the first such system in that language) gave me a foundation in computer vision and large language modeling research. A collaboration with Carnegie Mellon's Robotics Institute on aerial perception and scene augmentation introduced me to autonomous systems. A full-time research role at the AGenCy Lab allowed me to work on video captioning, Bengali speech synthesis, and activity recognition from sensor data.

At the University of New Hampshire, I explored imitation learning: I designed a language-conditioned visual controller that teaches a robot arm to locate and reach target objects using kinesthetic demonstration and language-modulated visual attention, validated through a Human-Robot Interaction study with seven participants. That work bridged my multimodal AI research and the autonomous systems focus of my doctorate.

My PhD at Northwestern addressed a hard problem: how do you defend a city from weaponized drones, legally, in real time, with almost no historical threat data? Over four years I built four interconnected solutions. DEWS predicts whether a drone trajectory is threatening, validated on real Dutch Police trajectories over eight months in The Hague. STATE uses conditional generative networks to synthesize realistic drone trajectories for regions without threat data. POSS provides a deontic logic framework for legally compliant multi-objective decision-making. GUARDIAN coordinates defensive drone swarms via multi-agent reinforcement learning while enforcing legal constraints at every step. A key finding: legal constraints can improve tactical performance when defenders are outnumbered.
At Zillow I built a multimodal generative AI solution that translates indoor panorama images into natural language and enables language-based home search (patent pending). At Microsoft I now work on machine learning and large language models at the scale of Windows operational data.

I am open to advising startups, companies, and teams working on hard AI problems in multimodal perception, reasoning, language, and autonomous systems. For early-stage companies, I am also open to formal advisory board roles.

Let's chat: https://cal.com/tonmoay/coffee-chat
Email: [email protected]
Member of the Windows Update Platform, working on natural language processing (NLP) and machine learning for large-scale operational data. I develop models and systems that extract signal from petabyte-scale Windows telemetry, using NLP pipelines, large language models, and predictive ML.
Serving as advisor and board member to an early-stage stealth AI startup. Contributing to product strategy, AI technical direction, and company governance at the board level.
Second of two consecutive summer internships on the Windows Update Platform team. Applied NLP and ML research on large-scale operational data. Continued and extended work from the previous summer internship. Led to a full-time offer as Data and Applied Scientist II, starting January 2026.
COMP_SCI 348: Intro to Artificial Intelligence | Spring 2025 | ~150 students
https://www.mccormick.northwestern.edu/computer-science/academics/courses/descriptions/348.html
Served as Graduate Teaching Assistant for the Intro to Artificial Intelligence course at Northwestern University. Responsibilities included leading discussion sections, holding office hours, grading assignments and exams, and supporting student learning in topics such as search, knowledge representation, planning, and machine learning fundamentals.
Lab: Northwestern Security & AI Lab (NSAIL)
Website: https://sites.northwestern.edu/nsail/about
Advisor: Dr. V.S. Subrahmanian (https://vssubrah.github.io)
Projects:
1. DUCK: https://sites.northwestern.edu/nsail/projects/duck/
2. Video Deception: https://sites.northwestern.edu/nsail/projects/video-deception/
COMP_SCI 349: Machine Learning | Fall 2024 | ~160 students
https://www.mccormick.northwestern.edu/computer-science/academics/courses/descriptions/349.html
Served as Graduate Teaching Assistant for the Machine Learning course at Northwestern University. Responsibilities included leading discussion sections, holding office hours, grading assignments and exams, and supporting student learning in topics such as supervised learning, neural networks, and model evaluation.