hands-on experience with infrastructure-as-code tools and cloud automation
proven ability to troubleshoot complex infrastructure issues, perform root cause analysis, and implement system improvements
experience with monitoring and alerting systems such as Prometheus, Grafana, Datadog, or an equivalent
excellent communication and collaboration skills, with the ability to work cross-functionally and explain technical concepts to non-technical stakeholders
strong understanding of cloud platforms and modern infrastructure practices
7+ years of experience in Site Reliability Engineering, DevOps, or a similar role, including 5+ years of experience leading a small team or mentoring junior engineers