Site Reliability Engineer job at IBM (International Business Machines Corporation)
Remote, OR, USA
Full Time
Are you looking for remote software engineering jobs in 2025? Then you might be interested in the Site Reliability Engineer job at IBM (International Business Machines Corporation).
About the Organisation
IBM is a globally renowned technology and consulting company, established in 1911. With a focus on hybrid cloud and AI, IBM offers cutting-edge solutions in software, infrastructure, and services. It is recognized as one of the largest and most innovative tech employers, serving Fortune 50 companies around the globe. IBM values diversity, continuous learning, and impactful innovation.
Job Title
Site Reliability Engineer job at IBM (International Business Machines Corporation)
IBM (International Business Machines Corporation)
Job Description
The Site Reliability Engineer II will work within the Infrastructure Services team to support IBM's cloud offerings powered by HashiCorp. The role includes automating processes, reducing manual toil, improving observability, and supporting production infrastructure for IBM cloud services. Candidates will work with tools such as Nomad, Consul, Vault, Terraform, and AWS. The position is remote and ideal for engineers looking to grow into senior roles in site reliability engineering.
Key responsibilities include:
Developing and maintaining infrastructure services to ensure high availability and security.
Implementing automation and improving deployment processes.
Debugging infrastructure issues with guidance from senior engineers.
Participating in on-call rotations post-onboarding.
Creating and maintaining documentation.
Collaborating across teams and engaging in hiring activities.
Duties, Roles and Responsibilities
Build, maintain, and improve core infrastructure systems.
Ensure system reliability, scalability, and security.
Automate operations to minimize manual tasks.
Improve monitoring, alerting, and logging.
Resolve infrastructure issues and support incident response.
Collaborate with product and engineering teams.
Write and maintain technical documentation.
Support interviews and hiring evaluations.
Qualifications, Education and Competencies
Required:
High School Diploma/GED (Bachelor's Degree preferred).
Experience in site reliability engineering or systems administration.
Familiarity with AWS and Terraform.
Exposure to observability tools (Datadog, Prometheus, Grafana).
Basic scripting skills (Python, Go, Bash).
Strong problem-solving and collaboration skills.
Preferred:
Growth mindset with eagerness to learn and take on increasing responsibilities.
Familiarity with IBM and HashiCorp products.
Interest in progressing into a senior SRE role.
How to Apply
ONLINE APPLICATION ONLY!
Applications for this position must be submitted online. To apply, please click the “Apply” button below.