Principal Site Reliability Engineer at UiPath
Job Description
📋 Description
- Lead Incident Command for high-stakes incidents.
- Lead live site troubleshooting and diagnose complex issues.
- Deliver executive updates to leadership during incidents.
- Drive post-incident retrospectives and RCAs.
- Define and improve SLIs/SLOs; promote proactive monitoring.
- Design automation to reduce toil and improve reliability.
🎯 Requirements
- 7+ years in SRE/Cloud Ops with 3+ years leading incidents.
- Strong command presence under pressure.
- Forensic and investigative skills for root cause analysis.
- Proficiency in Python or Go; Kubernetes; Azure (preferred).
- Observability: Prometheus, Grafana, OpenTelemetry.
- On-call availability as Incident Commander.
🎁 Benefits
- Flexible work options: hybrid, remote, or on-site.
- Inclusive, diverse workplace with equal opportunities.
- Reasonable accommodations on request.
- Rolling application process with no fixed deadline.
