Staff Site Reliability & DevOps Engineer at Brandwatch
Job Description
📋 Description
- Design, operate, and evolve observability platforms using Grafana and Prometheus
- Define dashboards, alerts, metrics standards, and SLOs
- Reduce alert noise; tune thresholds and runbooks
- Support incident response with actionable telemetry and post-incident analysis
- Instrument services and integrate metrics, logs, and traces across distributed systems
- Automate observability configuration with infrastructure as code
🎯 Requirements
- Strong experience with Prometheus (scraping, federation, recording rules, alerting)
- Strong experience with Grafana (dashboards, alerting, templating, RBAC)
- Linux and networking fundamentals
- Experience running observability stacks in Kubernetes environments
- Infrastructure as code experience (Terraform preferred)
- Familiarity with incident management and on-call practices
🎁 Benefits
- Inclusive, diverse workplace with a focus on belonging
- Global team collaboration across regions
- Growth-focused culture and innovation
- Work with award-winning PR and analytics tech
- Commitment to accessibility and accommodations
- Equal opportunity employer
