Anand Hegde

ML @Arm | ML @Samsung | Patents and Papers

Mountain View, California, United States

About

I'm a Machine Learning Engineer specializing in model optimization, on-device inference, and edge deployment, with a focus on model compression and hardware-efficient ML. I have hands-on experience taking models from research to production on real devices: implementing Vector Quantization for sub-4-bit LLM inference at Arm, building on-device training pipelines for privacy-preserving anomaly detection at Samsung Research America, and deploying optimized CV models across 5 million Samsung devices globally. My work spans quantization and parameter-efficient fine-tuning (PTQ, VQ, LoRA), mobile runtimes (ExecuTorch, TFLite, ONNX), and multi-chipset optimization (Exynos, Snapdragon, MediaTek). I've also contributed to the research community through publications at CVPR and BMVC, and I hold multiple patents.

Experience

Machine Learning Engineer at Arm
Jun 2026 - Present · 1 mo
Teaching Assistant at Northeastern University
Jan 2026 - Apr 2026 · 4 mos
Machine Learning Research Intern at Samsung Research America (SRA)
Sep 2025 - Dec 2025 · 4 mos
💡Technical Contributions: 1) Designed and implemented an on-device training pipeline for a privacy-preserving behavioral anomaly detection system, enabling continuous model personalization directly on edge devices without transmitting sensitive user data to the cloud. 2) Implemented and benchmarked three mobile ML runtimes (ExecuTorch, TFLite, and ONNX) for on-device LoRA fine-tuning, profiling training speed, inference latency, memory footprint, and battery consumption across Exynos, Snapdragon, and MediaTek chipsets. 3) Designed a runtime abstraction layer supporting multi-chipset NPU deployment, evaluating CPU, GPU, and NPU backends to identify optimal execution strategies for production deployment on Samsung devices. ➡️ Skills/Technologies: On-Device Training · ExecuTorch · TFLite · ONNX · LoRA Fine-tuning · Edge Inference · Anomaly Detection · Mobile ML · Android · Chipset Benchmarking · Privacy-Preserving ML · Exynos · Snapdragon · MediaTek
Machine Learning Intern at Arm
Jun 2025 - Aug 2025 · 3 mos
💡Technical Contributions: 1) Researched and benchmarked Vector Quantization schemes for group-wise quantized LLMs, evaluating 1D through 4D codebook approaches (GPTVQ, QuIP#, QTIP) on Llama 2-7B and Llama 3-8B to achieve sub-4-bit compression with minimal accuracy degradation. 2) Developed hardware-efficient VQ techniques for LLM inference optimization on Arm's AI acceleration hardware, investigating the tradeoffs between codebook dimensionality, quantization granularity, and model accuracy. 3) Implemented kernel fusion for quantized LLM inference on Arm accelerators, reducing memory overhead during token generation. ➡️ Skills/Technologies: Vector Quantization · Post-Training Quantization · LLM Inference Optimization · Kernel Fusion · Model Compression · Arm AI Hardware · PyTorch · C++ · Perplexity Benchmarking
Samsung India (2 yrs 8 mos)
- Senior Machine Learning Engineer
  Feb 2024 - Aug 2024 · 7 mos
  𝗩𝗶𝘀𝘂𝗮𝗹 𝗜𝗻𝘁𝗲𝗹𝗹𝗶𝗴𝗲𝗻𝗰𝗲 𝗧𝗲𝗮𝗺 💡Technical Contributions: 1) Worked with Samsung's AI camera division on research and innovation. Responsible for developing an end-to-end solution pipeline, from data management to deploying AI models on devices. Also contributed to patent work and exploratory research. 2) Onsite project at Samsung HQ (Seoul, South Korea): Developed and enhanced machine learning solutions to analyze and understand photo and video content within the Single-Take Photo application on Galaxy S24 flagship mobile devices. Responsible for data collection, model architecture design, model conversion, quantization, and porting onto mobile devices using SNAP. 3) Daily responsibilities included maintaining and enhancing AI models across multiple Samsung device generations, ensuring consistent, optimal performance. ➡️ Skills/Technologies the role demands: Deep Neural Networks, 2D/3D Computer Vision, Object-Oriented Programming, Mobile Platform Optimization, Android Studio, Camera System Integration, Neural Network Architecture Design, Software Development Life Cycle (SDLC), Performance Tuning, Model Deployment and Inference
- Machine Learning Engineer
  Jul 2022 - Mar 2024 · 1 yr 9 mos
- Research And Development Intern
  Jan 2022 - Jul 2022 · 7 mos
  𝗩𝗶𝘀𝘂𝗮𝗹 𝗜𝗻𝘁𝗲𝗹𝗹𝗶𝗴𝗲𝗻𝗰𝗲 𝗧𝗲𝗮𝗺 💡Technical Contributions: 1) Engineered a model-invariant edge-cloud collaborative inference technique that reduces DNN inference delay by dynamically adjusting the split point between server and mobile device, factoring in server load, network bandwidth, and hardware configuration. 2) Contributed to advancing mobile-side DNN inference by addressing the low memory and limited battery capacity of mobile devices, facilitating efficient collaborative intelligence between the cloud and Samsung mobile platforms. ➡️ Skills/Technologies the role demanded: Deep Neural Networks, Computer Vision, Socket Programming, Benchmarking, Algorithm Analysis, System Resource Profiling