Aleksa Gordić

pretraining LLMs | x-Google DeepMind | Angel

San Francisco, California, United States

About

Cooking.

Social:
📄 Website - https://www.aleksagordic.com/ (old one: https://gordicaleksa.com/)
📺 YouTube - https://www.youtube.com/c/TheAiEpiphany
🐦 Twitter - https://twitter.com/gordic_aleksa
👨‍👩‍👧‍👦 Discord - https://discord.gg/peBrCpheKE
📚 Medium - https://gordicaleksa.medium.com/
💻 GitHub - https://github.com/gordicaleksa
📢 AI Newsletter - https://aiepiphany.substack.com/
💰 Patreon - https://www.patreon.com/theaiepiphany

NOTE: I get a lot of DMs, so if I don't answer it's because I, unfortunately, can't possibly answer every DM I receive.

--- about me in O(1) ---

Ex-software/ML engineer at Microsoft & DeepMind with a broad background: electronics & embedded programming on uCs (bachelor), software engineering, algorithms, deep learning (computer vision, natural language processing (NLP), geometric DL, reinforcement learning (RL)...), web, mobile, Chrome extensions, etc. (I had a lot of personal projects :)).

I try to spend every waking hour learning, improving, creating, pushing myself out of my comfort zone, and sharing the stuff I learn in public. I'm constantly revising and improving my productivity, health & learning processes (at the same time making sure it's not procrastination in disguise).

Before I started working for Microsoft I got my bachelor's in EE and did 2 internships:
1) In a German startup - Telocate - where I worked as an Android dev (autodidact) developing precise localization using ultrasound.
2) In Brazil (UFOP university), where I researched various heuristics for the multi-pile vehicle routing problem. It ended up being more of a life experience: I lived with 11 Brazilians and learned Portuguese.

At Microsoft, I got to work directly with talented engineers & researchers from Microsoft Research Cambridge and from Seattle on the HoloLens family of devices. While I was working at MSFT I was also part of the organization and a lecturer at PSI:ML (ML camp).
I worked as a research engineer at Google DeepMind on visual language models (VLMs) and multimodal learning in general, where I led a bulk inference project that brought VLMs (Flamingo) to various Google/YouTube products. Since I left DeepMind I've tried a few things and failed many times. Still cooking though.

Some non-tech hobbies of mine:
* Learning (human) languages. I learned 5 in my free time (mostly in my high-school days; it gets really easy once you learn the first two).
* Powerlifting/calisthenics (my max was 120 kg bench press, 5 sets 3 reps ;)).

Experience

  • Research Scientist / Tech Lead at Essential AI
    Sep 2025 - Present · 10 mos

    Tech lead on the STEM {pre,mid,post}-training team, working directly with Ashish Vaswani (1st author of the Transformer paper) and a small group of cracked technical people on shipping SOTA open-source LLMs with a focus on code, math, and STEM. We released the best USA open-source LLM (dense transformer) in the 8B category, fully {pre,mid,post}-trained from scratch on zettaflops of AMD and TPU compute: Rnj-1 (https://essential.ai/research/rnj-1). We then released a follow-up long-context (32k -> 160k) model, Rnj 1.5 (https://huggingface.co/EssentialAI/rnj-1.5-instruct). I led and co-led end-to-end efforts across data (including building an internal PDF OCR pipeline and data mixing), long context, training experiments, evaluations, and infra/compute efficiency.

  • Investor / Angel at AI startups
    Sep 2023 - Present · 2 yrs 10 mos

    If you are looking for an angel and you are building a consequential company, DM me. :) I care about AI -> model layer, (ML) infra, AI accelerators, energy + robotics / physical AI. Bonus: you are based in SF / Bay Area.

  • Co-Founder and Head of AI at P-1 AI
    Sep 2024 - Jul 2025 · 11 mos

    I co-founded the company with an ex-CTO of Airbus. The goal was to build an AI system that would help us design physical systems in the world. We raised $23M from Radical Ventures and a few great angels (Google's Jeff Dean and Zak Stone, OpenAI's Peter Welinder, etc.).

  • Ran some startups, built some projects like llm.c at Startups
    Apr 2023 - Aug 2024 · 1 yr 5 mos

    Built Ortus AI (a YouTube assistant), replicated Meta's NLLB in the open (for low-resource Balkan languages), founded Runa AI (trained SOTA large language models (LLMs) for low-resource languages and offered them to enterprise & gov in the Balkan region), built Jarvis for Images and Cracked Engineers, and built llm.c together with Andrej Karpathy, Arun, and Erik. Learned a lot from some failures. :)

  • Research Engineer at DeepMind
    Dec 2021 - Mar 2023 · 1 yr 4 mos

    Led the integration of vision language models (VLMs, Flamingo 🦩) into Google's products (YouTube ads, YouTube Shorts, internal demos, etc.) through the development of a bulk inference tool. This tool facilitated the deployment of Flamingo across a massive number of TPUs. Contributed to the Flamingo project and paper by assisting with data integration, addressing memory issues, and providing critical feedback (unfortunately I joined the project at a later stage, so I only got an acknowledgement in the paper as opposed to a co-author role). Through my external education work in the AI space I was responsible for hiring 15-20 DeepMinders (who explicitly reached out to me when they joined to thank me). I was never internally awarded for this, but it is something I am very proud of. :) Flamingo: https://arxiv.org/abs/2204.14198