Mountain View, California, United States
Ads Attribution
• Developed a document analysis system by fine-tuning LLMs and using RAG, achieving 95% accuracy
• Researched the use of diffusion models to guide a Gaussian Splatting renderer for novel view synthesis
• Published the paper “ERUPT: Efficient Rendering with Unposed Patch Transformer” at the Conference on Computer Vision and Pattern Recognition (CVPR 2025)
• Implemented a Transformer-based model with MLflow to render large scenes for few-shot view synthesis
• Collected a large-scale dataset of 8 million images using MongoDB and PyTorch
• Optimized the data collection pipeline, running in Docker containers across multiple servers
• Published the paper “Learning to Rank Visual Stories with Human Ranking Data” at the Association for Computational Linguistics (ACL 2022)
• Published the paper “Multi-VQG: Generating Engaging Questions for Multiple Images” at the Conference on Empirical Methods in Natural Language Processing (EMNLP 2022)
• Created a novel, generalizable SOTA metric for visual storytelling with PyTorch
• Designed a UI with HTML, CSS, and JavaScript to collect a multi-image question dataset through Amazon Mechanical Turk
NTHU AHG Lab:
- Conducted research in Automatic Speech Recognition (ASR)
- Collected a Taiwanese-accented Chinese corpus for adaptation
- Created an automatic subtitle system for online courseware