Post by Vinicius Castro de Souza

Spanish (Advanced) | ITIL 4 | Site Reliability Engineering (SRE) | Incident & Alert Management | IT Infrastructure Monitoring | Front-end Development (HTML, CSS, JavaScript, React) | Git | GitHub | MySQL | Python

Leveraging my day off to build a Zabbix lab! 🚀

I took some time today to dive deep into the Zabbix ecosystem. The goal was to build an environment from scratch and understand the end-to-end flow:

1️⃣ CLI Configuration: Fine-tuning the Zabbix Agent directly on Linux. It’s rewarding to understand what happens under the hood to ensure efficient data collection.

2️⃣ Alert Intelligence: Configuring "Disaster" severity triggers. In the lab, I detected an Apache service failure in real time, catching the issue before any end-user impact.

3️⃣ Notification Channels: Structured alerts via Telegram for immediate response and Email for official logging. Getting the right information through the right channel is what actually reduces MTTR.

4️⃣ Resolution Cycle: Following the flow from the initial problem to the "Resolved" status helps validate how a solid monitoring strategy saves the team time.

Next steps: Integrating this lab with Grafana dashboards and exploring automated ticket creation in GLPI.

What about you? Which technology have you been exploring in your free time?

#Zabbix #Observability #NOC #Monitoring #Linux #ITInfrastructure #DevOps #LifelongLearning #IT #TechLead
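
For anyone curious what steps 3️⃣ and 4️⃣ boil down to outside of Zabbix, here is a rough Python sketch. It is not how the lab itself is wired (Zabbix handles notifications through its own media types). It only illustrates the idea: format a problem message, build a request for the Telegram Bot API `sendMessage` method, and compute the time between a problem and its resolution. The host name, trigger text, token, chat ID, and timestamps are all made-up placeholders, and the helper names are hypothetical.

```python
import json
import urllib.request
from datetime import datetime, timedelta

TELEGRAM_API = "https://api.telegram.org"


def format_alert(host, trigger, severity, status, when):
    """Build a one-line alert text (the layout is an assumption, not Zabbix's default)."""
    return f"[{severity}] {status}: {trigger} on {host} at {when:%Y-%m-%d %H:%M:%S}"


def build_telegram_request(token, chat_id, text):
    """Prepare (but do not send) a POST to the Bot API sendMessage method."""
    url = f"{TELEGRAM_API}/bot{token}/sendMessage"
    body = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def mean_time_to_resolve(incidents):
    """Average duration over (problem_time, resolved_time) pairs."""
    durations = [resolved - started for started, resolved in incidents]
    return sum(durations, timedelta()) / len(durations)


# Made-up example data: one Apache outage that lasted four and a half minutes.
started = datetime(2024, 1, 1, 10, 0, 0)
resolved = datetime(2024, 1, 1, 10, 4, 30)

message = format_alert("web-01", "Apache is down", "Disaster", "PROBLEM", started)
request = build_telegram_request("PLACEHOLDER_TOKEN", "PLACEHOLDER_CHAT_ID", message)
mttr = mean_time_to_resolve([(started, resolved)])

print(message)
print(request.full_url)
print(mttr)
# To actually deliver the message, a real bot token and chat ID would be needed,
# followed by urllib.request.urlopen(request).
```

Keeping the request building separate from the sending makes the message format easy to check without touching the network.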