Data Center Operations System Engineer at Lambda
Job Description
📋 Description
- Rack, label, cable, and configure new server/storage/network infra.
- Troubleshoot hardware/software in GPU and networking systems.
- Document/update data center layout and topology in DCIM.
- Coordinate with supply chain/manufacturing for deployments.
- Manage parts depot inventory across data centers.
- Collaborate with HW Support to resolve hardware incidents and share fixes.
🎯 Requirements
- Experience with critical data-center infra: power, airflow, env monitoring, DCIM.
- Experience with DIA circuits, fiber testing, and troubleshooting.
- Linux administration expertise.
- NVIDIA NVL72 HPC GPU experience.
- Familiar with InfiniBand networks (400G).
- JIRA/Zendesk ticketing experience.
🎁 Benefits
- Generous cash & equity compensation.
- Health, dental, and vision coverage for you and dependents.
- Wellness and commuter stipends for select roles.
- 401(k) plan with 2% company match (USA employees).
- Flexible paid time off plan that we all actually use.
