Senior DevOps / Platform Reliability Engineer at Zingtree
Job Description
📋 Description
- Own and evolve CI/CD pipelines with GitHub Actions and OIDC for microservices and agent workloads.
- Automate infrastructure provisioning using Terraform and CloudFormation.
- Operate and scale Kubernetes (EKS + Argo CD): autoscaling, ingress, backups.
- Manage the edge/network perimeter with Cloudflare, CloudFront, Route 53, API Gateway, ALB/NLB.
- Operate the data/event tier: Aurora MySQL, ElastiCache/Redis, S3, MSK, including backups and PITR.
- Build observability with Prometheus, Grafana, OpenTelemetry; track LLM/agent telemetry.
🎯 Requirements
- 5+ years of experience in DevOps, SRE, or Platform Engineering on AWS.
- Strong CI/CD experience with GitHub Actions, GitLab CI, Jenkins, or CircleCI.
- Hands-on experience operating production EKS environments, including autoscaling, ingress, secrets management, and cluster upgrades.
- Strong AWS networking experience, including multi-account VPC design, subnets, routing, security groups, NACLs, Route 53, ACM, and load balancers.
- Deep experience with Terraform and GitHub Actions, ideally using OIDC-based cloud authentication.
- Experience with Aurora/RDS MySQL, Redis (ElastiCache), and S3, including backups, PITR, migrations, and lifecycle management.
🎁 Benefits
- 100% of employee premiums covered
- 75%–80% of dependent premiums covered for health/dental/vision
- 401(k) plans (currently no employer match)
- Paid parental leave
- Unlimited PTO
- Flexible remote work; coworking reimbursement and home office funds
