Mant, Uttar Pradesh, India
I build production-grade AI systems focused on Agentic AI, RAG pipelines, LLM infrastructure, and scalable GenAI applications.

My work revolves around designing intelligent systems that combine:
• Large Language Models (LLMs)
• Retrieval-Augmented Generation (RAG)
• Vector Search & Embeddings
• AI Agents & Tool Calling
• Rust + Python backend systems
• AI inference pipelines
• Semantic search architectures

I actively work with technologies and ecosystems including:
• Rust
• Python
• LangChain
• Qdrant / pgvector
• FastAPI / Axum
• ONNX / Ollama
• WrenAI
• MindsDB
• Hugging Face
• LLM APIs & local inference

Currently exploring:
→ Agentic workflows
→ Multi-agent systems
→ AI search infrastructure
→ Low-latency inference systems
→ Streaming AI architectures
→ Scalable RAG pipelines

I'm especially interested in building AI systems that are:
• Fast
• Reliable
• Production-ready
• Memory efficient
• Scalable at high concurrency

My focus is not just on using AI APIs but on engineering the infrastructure and systems behind modern AI products.

Open to collaborations, AI engineering opportunities, and building impactful GenAI products.