Post by Fatih Şahin ISTQB

Senior QA Automation Engineer | SDET | Java · Selenium · JavaScript · Playwright · REST Assured API · CI/CD | Manual Testing | Scalable Test Automation Frameworks

DEFECT 14 Filters Crashed, Result: 504 Gateway Timeout!

Getir

Imagine you are starving, you’ve successfully selected your delivery address, and all you want is to filter the cuisines to order your food... Instead, you get hit by a frustrating "Try again later" popup, followed by a cold 504 Gateway Timeout wall. 📉

I experienced this exact UX crisis on getir.com/yemek. As a QA Engineer & Bug Hunter, I put my hunger aside and analyzed the failure. Since the address selection worked fine, the frontend wasn't the issue. A 504 means the gateway never got a timely answer from an upstream service, so the backend microservices had bottlenecked.

🔍 Potential Root Causes (RCA)

Cascading Failure in Filtering Service: Once the location is set, fetching and filtering specific restaurants for that zone under peak-hour traffic likely overwhelmed the filtering microservice, causing request queues to back up.

Database Connection Pool Exhaustion: Non-optimized geospatial filtering queries can drain the database connection pool within seconds, forcing the API Gateway to time out.

Cache Invalidation Storm: If the Redis/cache layer was dropped or invalidated during peak hours, the full filtering load would have hit the primary database directly, paralyzing the system.

🛠️ The Solution: Circuit Breaker & Fallback UX

Instead of breaking the entire user journey when the filtering service fails, the architecture should implement the Circuit Breaker Pattern. Rather than throwing a 504 error, the system could gracefully degrade and serve a cached, static list of popular restaurants for that address without the filtering functionality.

QA Takeaway: QA isn't just about verifying that code works; it’s about designing fallback scenarios to prevent user churn during a crisis. Because for a hungry user, switching to a competitor takes exactly 3 seconds. ⏰

Check out the screen recording attached to see the breakdown. Have you ever faced a high-stakes system crash while starving? Let’s discuss in the comments!
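To make the Circuit Breaker & Fallback idea above concrete, here is a minimal sketch in Python. It is not Getir's implementation. The class, the thresholds, the `filter_restaurants` and `cached_popular_restaurants` functions, and the restaurant names are all hypothetical and exist only for illustration. After a few consecutive failures the breaker stops calling the filtering service and serves the cached list right away. After a cooldown it lets one trial request through.

```python
import time
from typing import Any, Callable, Optional


class CircuitBreaker:
    """Minimal breaker: CLOSED -> OPEN after repeated failures, HALF_OPEN after a cooldown."""

    def __init__(self, failure_threshold: int = 3, reset_timeout_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold  # assumed value, tune per service
        self.reset_timeout_s = reset_timeout_s      # assumed cooldown before a trial call
        self._clock = clock                         # injectable so tests can fake time
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "CLOSED"
        if self._clock() - self._opened_at >= self.reset_timeout_s:
            return "HALF_OPEN"
        return "OPEN"

    def call(self, primary: Callable[[], Any], fallback: Callable[[], Any]) -> Any:
        # While OPEN, skip the struggling service and degrade immediately.
        if self.state == "OPEN":
            return fallback()
        try:
            result = primary()
        except Exception:
            self._failures += 1
            # Trip on reaching the threshold, or re-trip if a HALF_OPEN trial failed.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            return fallback()
        # Success: close the circuit and forget earlier failures.
        self._failures = 0
        self._opened_at = None
        return result


def filter_restaurants(cuisine: str) -> dict:
    """Hypothetical stand-in for the filtering microservice; here it always times out."""
    raise TimeoutError("filtering service did not respond in time")


def cached_popular_restaurants() -> dict:
    """Hypothetical fallback: a static, cached list served with filtering disabled."""
    return {"restaurants": ["Popular Kebab House", "Popular Pizza Place"],
            "filters_enabled": False}


if __name__ == "__main__":
    breaker = CircuitBreaker(failure_threshold=2, reset_timeout_s=30.0)
    for _ in range(3):
        response = breaker.call(lambda: filter_restaurants("kebab"),
                                cached_popular_restaurants)
        print(breaker.state, response["filters_enabled"])
```

The `filters_enabled` flag in the fallback response is one possible way to let the frontend hide the filter controls instead of showing an error popup. A production system would more likely use an existing resilience library than a hand-rolled class like this one.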
#SoftwareQA #BugHunting #QualityEngineering #UserExperience #Microservices #BackendTesting

🍽️ Delivery Point Selected, Filters Crashed, Result: 504 Gateway Timeout!

You are ravenously hungry, you have selected your delivery address, and all you want at that moment is to tap the cuisine filter and get to your food... But what greets you is a "Try again later" warning, immediately followed by a 504 Gateway Timeout wall. 📉

Last night I experienced exactly this UX crisis on getir.com/yemek. As a QA Engineer & Bug Hunter, I forgot my hunger and put the system under a technical magnifying glass. Since the address selection succeeded, the problem was not in the frontend but entirely in the backend microservice architecture.

🔍 Potential Root Causes (RCA)

Cascading Failure in Filtering Service: After the address was selected, the filtering microservice that lists the restaurants in that zone by cuisine or rating could not handle the peak-hour traffic and choked on its request queue.

Continued in the comments 👇

