Islāmābād, Pakistan
Hi, I'm Ahmad Mustafa Anis, a Machine Learning Engineer at Roll.ai, where I develop a scalable computer vision and multimodal AI video editing platform. My professional experience combines deep technical expertise with active research collaboration, notably as a Research Collaborator at the Data Provenance Initiative, which started at the MIT Media Lab, and as an AI Research Fellow at the Fatima Fellowship, advised by Dr. Wei Peng from Stanford University. My research interests focus on Vision-Language Models, Self-Supervised Learning, and Large Language Models (LLMs). I have publications at prestigious AI conferences such as NeurIPS and ICLR. I am passionate about mentoring and fostering vibrant AI communities: I currently lead community efforts at Cohere Labs Community (formerly Cohere.for.ai) and mentor emerging AI talent through the MESA Initiative at Foothill College.
Check out my insights and writings here:
* KDnuggets: https://www.kdnuggets.com/author/ahmad-anis
* cnvrg.io: https://cnvrg.io/author/ahmad-anis
* Medium: https://medium.com/@ahmadanis5050
I'm always eager to connect. Feel free to reach out if you're interested in Machine Learning or looking for mentorship!
Worked on Retrieval-Augmented Test-Time Inference for Medical Vision-Language Models, advised by Dr. Wei Peng (https://xiaoiker.github.io) from Stanford University.
Projects I have worked on or am currently working on:
* Blurring sensitive information in real time for streaming applications using OCR.
* Illegal bowling angle detection in cricket using deep learning techniques such as keypoint estimation.
* House blueprint prediction from plot dimensions.
* Modern dashboards using Plotly and Dash.
* Prompt engineering for aiphotos.ai.
* Trained a model to predict startup success rate based on multiple features. Random success rate: 5%; our model's success rate: 40%.
Other responsibilities include:
* Writing hands-on technical tutorials for Red Buffer's Medium channel.
* Communicating progress on different projects to clients.
- Selected as one of 10 fellows out of 350+ applicants worldwide for the 2-month immersive School of AI program, on a 100% scholarship.
- Used long-context LLMs (MistralLite, Mixtral 8x7B) to extract essential information from long transcripts.
- Experimented with advanced RAG tools such as DSPy with local LLMs for information retrieval.
Projects I worked on during my tenure at WortelAI:
* Safety detection system for construction sites, using object detection to determine whether a person is wearing a safety hat and safety vest.
* Neural search engine using Qdrant, CLIP by OpenAI, and Elasticsearch.
* Multi-label, multi-class classifier on a large, diverse dataset scraped from Reddit.
* Data visualization of panoramic image segmentation using QGIS and PostGIS.
* Celebrity detection and classification using 3DDFAv2, SORT, and ResNet50 on a 172GB dataset (https://github.com/prateekmehta59/Celebrity-Face-Recognition-Dataset).
Using state-of-the-art computer vision technologies to build scalable real-world applications.