Greater Istanbul
Safe by Default
Learn by Interact
MIS-203 Software Development
MIS-403 Fundamental of Artificial Intelligence
- Developed a language-agnostic auto-evaluation package to assess capabilities during GPT-based large language model (LLM) training.
- Enhanced the fine-tuning of large language models by integrating model and data parallelism, utilizing mixed precision techniques to optimize performance and memory usage.
- Increased fine-tuning efficiency through advanced context expansion methods and distributed parameter-efficient training strategies, employing mixed precision for improved computational efficiency.
- Implemented the DPO (Direct Preference Optimization) technique for aligning LLMs within a torch.distributed training environment, utilizing a multi-node GPU setup to accelerate the training process across multiple devices.
- Designed a custom GPT architecture compatible with multiple ML libraries, including Hugging Face and vLLM, optimized for various accelerators, and enhanced using mixed precision techniques.
- Executed model merging techniques to combine and optimize LLMs, improving performance and resource efficiency while maintaining mixed precision capabilities.
- Developed solutions for spell correction, information retrieval, query category prediction, and app category prediction for Huawei AppGallery NLP.