Post by Ritikesh Choube

Senior AI Engineer | LangGraph · CrewAI · Google ADK · MCP | Multi-Agent Systems & RAG in Production | Tech Lead

Ever wondered how to fine-tune LLaMA 2 (70B parameters) or any other model on just 15 GB of memory in Google Colab? My latest blog post breaks it down with a focus on quantization techniques, diving into the math behind the exponent, mantissa, and bit representation of floating-point numbers. How does quantization reduce model size while maintaining performance? Check it out. #LLM #Fine_tuning #Ai
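
For a quick taste of the ideas mentioned above, here is a minimal sketch in plain Python. It is not code from the blog: the function names (float32_fields, absmax_quantize, dequantize) are hypothetical, and absmax int8 rounding is just one simple quantization scheme, chosen here for illustration. It splits a float32 into its sign, exponent, and mantissa bits, then maps a few weights to 8-bit integers and back.

```python
import struct

def float32_fields(x):
    """Return the (sign, exponent, mantissa) bit fields of x as an IEEE 754 float32."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, stored with a bias of 127
    mantissa = bits & 0x7FFFFF        # 23 bits of fraction
    return sign, exponent, mantissa

def absmax_quantize(values, num_bits=8):
    """Scale values so the largest magnitude maps to the largest signed integer."""
    qmax = 2 ** (num_bits - 1) - 1                      # 127 for 8 bits
    scale = max(abs(v) for v in values) / qmax or 1.0   # avoid dividing by zero
    return [round(v / scale) for v in values], scale

def dequantize(quantized, scale):
    """Approximate the original values from the integers and the scale."""
    return [q * scale for q in quantized]

if __name__ == "__main__":
    print(float32_fields(1.0))        # sign 0, biased exponent 127, mantissa 0
    weights = [0.5, -1.0, 0.25]       # made-up example weights
    q, scale = absmax_quantize(weights)
    print(q, dequantize(q, scale))
```

Each weight now takes 8 bits instead of 32, at the cost of a rounding error of at most half the scale per value.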