DevOps / QE Engineer
Location: Remote (Worldwide)
Employment Type: Full-Time
Company: The Lustre Collective
About The Lustre Collective
The Lustre Collective is dedicated to advancing the Lustre file system, the leading open-source, parallel distributed file system designed for large-scale cluster computing. With deep expertise in scalable storage solutions, our team supports cutting-edge AI and HPC environments, on-premises and in the cloud. We are committed to vendor-neutral development, drawing inspiration from communities like OpenSFS and EOFS to drive innovation in Lustre technology.
As a tight-knit company with a fun, collaborative culture, we offer the unique opportunity to work alongside some of the smartest and most qualified file system and kernel engineers in the world. Our team thrives on innovation, open-source passion, and a supportive environment where ideas flow freely and work-life balance is prioritized. If you're excited about contributing to groundbreaking storage solutions in a dynamic, global team, we'd love to hear from you!
Job Overview
We are seeking a skilled DevOps / QE Engineer to join our remote team. This role focuses on building and maintaining team infrastructure, automation, and CI/CD pipelines to support the development and deployment of the Lustre file system in HPC and AI environments.
Key Responsibilities
- Design, implement, and maintain CI/CD pipelines using tools like Jenkins, Git, and automation scripts.
- Automate infrastructure provisioning and configuration management with tools like Terraform and Ansible.
- Manage containerization and orchestration with Docker and Kubernetes for scalable environments.
- Troubleshoot infrastructure issues, ensuring high availability and performance for team workflows.
- Collaborate on integrating DevOps practices into Lustre development and open-source contributions.
- Monitor and optimize cloud and on-premises resources for efficiency.
- Participate in code reviews and team discussions to promote best practices.
Required Qualifications and Skills
- Strong proficiency in Python; experience with C and Rust is a plus.
- Expertise in DevOps tools and practices, including CI/CD, IaC, and containerization.
- Familiarity with cloud platforms (e.g., AWS, Azure, GCP) and HPC infrastructure.
- Proficiency in Linux/Unix administration, networking, and security.
- Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
- Excellent problem-solving and collaboration skills for a remote team.
Preferred Qualifications
- Experience with storage systems infrastructure, such as Lustre or similar.
- Contributions to open-source DevOps projects.
- Knowledge of monitoring tools like Prometheus and Grafana.
- Background in performance optimization for distributed systems.
What We Offer
- A fully remote position with flexible hours to accommodate global time zones.
- A competitive salary and benefits package.
- The chance to work in a fun, tight-knit culture where creativity and humor are encouraged—think virtual team-building events, hackathons, and a supportive community of experts.
- Opportunities for growth, including conferences, training, and leadership in open-source initiatives.
- A commitment to diversity, equity, and inclusion, welcoming candidates from all backgrounds worldwide.
If you're a passionate engineer ready to advance the future of Lustre and collaborate with world-class talent, apply today!
Contact Us to Apply