
Aleksandr Ilin
Aleksandr leads chaos engineering and is part of the reliability team at Yandex Go, where he has shaped the practices that keep large-scale, business-critical services dependable. With over a decade of experience in IT, he believes that reliability is not just about systems but also about building trust in technology and across the teams and people who create these systems.
How To Use Criticality Tiers for Great Good
Every outage feels like a crisis. Managers see KPIs at risk, DevOps teams get alerts at 3 a.m., and developers watch bug reports flood in, even when the part of the system causing the outage has little impact on the business. Yet in most companies, nobody can clearly say how critical a given service really is to the business. Aleksandr will show how adopting criticality tiers provides a simple, structured way to prioritize, set the right SLAs, and put effort where it matters most.