Introduction

Working in IBM Cloud gives you the platform to learn, develop, and use your skills every day by working on the latest cloud-related technology products and services. You'll be working in an environment where we understand that we thrive best when we play to our strengths. That's why developing our people is key to our success, and the door is always open for those ready to advance their career.

Curiosity and courageous thinking are both vital when working in IBM Cloud, as we continue our dedication to staying at the forefront of cloud technology. Our renowned legacy means we are leading the way in everything from analytics and security through to unmatched hardware and software designs. We provide our clients with full end-to-end transformation as we build IBM's next-generation cloud platform, which is focused on delivering performance and predictability at a global scale.

IBM's product and technology landscape includes Research, Software, and Infrastructure. Entering this domain positions you at the heart of IBM, where growth and innovation thrive.

At IBM, we are driven to shift our technology to an as-a-service model and to help our clients transform themselves to take full advantage of the cloud. With industry leadership in AI, analytics, security, commerce, and quantum computing, and with unmatched hardware and software design and industrial research capabilities, no other company is as well positioned to address the full opportunity of enterprise cloud computing.

We are looking for a Site Reliability Engineer to join our IBM Cloud VPC Observability team. This team is dedicated to ensuring that IBM Cloud is at the forefront of reliable enterprise cloud technology. We are building Observability platforms to deliver performance, reliability, and predictability for our customers' most demanding workloads, at global scale and with leading efficiency, resiliency, and security. If you want to know what it takes to build a scalable, secure, and reliable service, want to grow your technical depth in all aspects of SRE (including security, monitoring, automation, development, infrastructure, self-healing, and troubleshooting), and are a go-getter with an ownership mindset, we may be the right team for you.

Your Role and Responsibilities

As a Site Reliability Engineer, you will play a crucial role in supporting, maintaining, and operationally improving the cloud infrastructure. Working closely with various teams, your focus will be on ensuring the health and reliability of production and test systems. Your proactive approach will be essential in responding promptly to issues and alerts, contributing to the development of new capabilities, and collaborating with other SRE teams and program managers to deliver mission-critical services to the market.

Key Duties:

  • 24x7 System Monitoring:

 Monitor the health of production and test systems around the clock, ensuring continuous reliability.

  • Rapid Issue Response:

 Respond promptly to production issues and alerts, providing swift resolution and maintaining system availability.

  • Capability Development:

 Support the development of new and existing capabilities for compute, storage, and network services.

  • Collaborative Partnership:

 Partner with other SRE teams and program managers, contributing to the seamless delivery of mission-critical services to the market.

  • Automation Execution:

 Execute changes in the production environment through automation, ensuring efficiency and minimizing downtime.

  • Cross-Functional Troubleshooting:

 Collaborate with engineering teams to provide initial assessments and possible workarounds for production issues. Troubleshoot and resolve production issues effectively.

  • Integration Planning:

 Work with support and development teams to identify and resolve issues. Discuss and plan integration tasks to enhance overall system performance.

  • Implement and administer infrastructure and solutions that support our Observability team
  • Work in a Kubernetes-based microservices environment to support our leading-edge cloud services. This will include custom solutions as well as open source DevOps tools (build and deploy automation, monitoring, and data gathering for our software delivery pipeline)
  • Contribute to our continuous improvement and continuous delivery while increasing the maturity of our DevOps and agile adoption practices
  • Support the compliance and security integrity of the environment through your work
  • Partner with other teams, functional managers and program managers to deliver mission-critical services
  • Support development of new and enhanced pipeline capabilities
  • Adopt and build on automation solutions governed by SRE principles, including CI/CD pipelines, configuration management, immutable infrastructure deployment, and auto-healing systems
  • Provide technical escalation support
  • Conceptualize, design, implement, and manage reliable, highly performant, scalable automation solutions that build consistency across our infrastructure
  • Work with and adopt open source technologies as well as participate in new IBM innovations across IaaS
  • Bring a self-driven attitude to proposing, testing, and implementing solutions and improvements for review and consideration by your peers
Required Education
Bachelor's Degree
Required Professional and Technical Expertise
  • Over 5 years of hands-on experience with programming and scripting languages such as Go, Python, and Bash
  • Familiarity with using Jenkins / Tekton for CI and ArgoCD / Jenkins for CD
  • Knowledge of security tools, including static and dynamic code analysis, vulnerability scanners, and intrusion detection/prevention systems
  • Experience with containers and container orchestration tools such as Docker and Kubernetes
  • Experience delivering reliable microservices at scale, for example with horizontal pod scaling
  • Familiarity with cloud platforms and their orchestration using IaC tools like Terraform and Ansible, and automated deployment using Helm
  • Container performance and security
  • Familiarity with automation using scripting languages like Python
  • 5+ years of experience designing, developing, and deploying software with cloud technologies such as AWS, Azure, IBM Cloud, or GCP
  • Understanding of security principles
  • Understanding of version control systems like Git and artifact management tools such as JFrog Artifactory.
  • 5+ years of experience with monitoring technologies such as Sysdig, Grafana, ELK, Mimir, and Zabbix
  • Release Engineering (Git Branching, versioning, tagging)
  • Experience with Agile software development
Preferred Professional and Technical Expertise
  • Familiarity with OpenTelemetry concepts, tracing, metrics, events, and other Observability principles
  • Familiarity with the Grafana stack, such as Mimir and the Grafana Alloy agent
  • CI/CD implementation experience using Tekton and ArgoCD
  • Expertise with end-to-end infrastructure automation using Python, Terraform, and Ansible
  • Familiar with adopting secure practices and processes
  • Familiar with Linux systems and troubleshooting on them

About the Business Unit

IBM Systems helps IT leaders take a different look at their infrastructure. IBM servers and storage are no longer inanimate objects: they can understand, reason, and learn, allowing our clients to innovate while avoiding IT problems. Our systems power the world's most important industries, and our clients are the architects of the future. Join us to help build our leading-edge technology portfolio, designed for the cognitive enterprise and optimized for cloud computing.

YOUR LIFE AT IBM

In a world where technology never stands still, we understand that dedication to our clients' success, innovation that matters, and trust and personal responsibility in all our relationships live in what we do as IBMers as we strive to be the catalyst that makes the world work better.

Being an IBMer means you’ll be able to learn and develop yourself and your career. You’ll be encouraged to be courageous and experiment every day, all whilst having continuous trust and support in an environment where everyone can thrive, whatever their personal or professional background.

Our IBMers are growth-minded, always staying curious, open to feedback, and learning new information and skills to constantly transform themselves and our company. They are trusted to provide ongoing feedback to help other IBMers grow, and to collaborate with colleagues using a team-focused approach that includes different perspectives to drive exceptional outcomes for our customers. The courage our IBMers have to make critical decisions every day is essential to IBM becoming the catalyst for progress. They embrace challenges with the resources they have to hand and a can-do attitude, and always strive for an outcome-focused approach in everything they do.

Are you ready to be an IBMer?

About IBM

IBM’s greatest invention is the IBMer. We believe that through the application of intelligence, reason and science, we can improve business, society and the human condition, bringing the power of an open hybrid cloud and AI strategy to life for our clients and partners around the world.

Restlessly reinventing since 1911, we are not only one of the largest corporate organizations in the world, we’re also one of the biggest technology and consulting employers, with many of the Fortune 50 companies relying on the IBM Cloud to run their business. 

At IBM, we pride ourselves on being an early adopter of artificial intelligence, quantum computing and blockchain. Now it’s time for you to join us on our journey to being a responsible technology innovator and a force for good in the world.

Other Relevant Job Details

When applying to jobs that interest you, we recommend choosing those that match your experience and expertise. Our recruiters advise that you apply to no more than 3 roles in a year for the best candidate experience. For additional information about location requirements, please discuss with the recruiter following submission of your application.