Introduction

A career in IBM Consulting is rooted in long-term relationships and close collaboration with clients across the globe. You'll work with visionaries across multiple industries to improve the hybrid cloud, data and AI journey for the most innovative and valuable companies in the world. Your ability to accelerate impact and make meaningful change for your clients is enabled by our strategic partner ecosystem and our robust technology platforms across the IBM portfolio. Curiosity and a constant quest for knowledge serve as the foundation for success in IBM Consulting. In your role, you'll be encouraged to challenge the norm, investigate ideas outside of your role, and come up with creative solutions resulting in groundbreaking impact for a wide network of clients. Our culture of evolution and empathy centers on long-term career growth and development opportunities in an environment that embraces your unique skills and experience.

Your role and responsibilities

We are seeking a highly skilled and motivated Site Reliability Architect with foundational AI knowledge to join our growing Application Operations team. In this role, you will focus on ensuring system reliability, automating operational processes, and leveraging emerging AI tools to enhance operational efficiency. You will be responsible for implementing SRE best practices, building robust automation solutions, and troubleshooting complex system issues while collaborating with cross-functional teams to maintain highly available and scalable systems. You will work at the intersection of traditional operations engineering, automation, and modern AI-enhanced tooling to build resilient systems that deliver exceptional reliability and performance.

Key Responsibilities:

• Implement and maintain SRE practices including SLIs, SLOs, error budgets, and reliability monitoring

• Design, build, and maintain automation solutions for deployment, monitoring, incident response, and system maintenance

• Troubleshoot complex system issues across distributed environments and implement sustainable solutions

• Develop and maintain observability solutions including monitoring, alerting, logging, and tracing systems

• Automate toil reduction through scripting, infrastructure as code, and process improvements

• Collaborate with development teams to improve system reliability through design reviews and reliability engineering practices

• Participate in on-call rotations and lead incident response efforts, including post-incident reviews and improvement implementation

• Build and maintain CI/CD pipelines and deployment automation tools

• Leverage AI-enhanced tools and basic machine learning concepts to improve operational insights and automate routine tasks

• Implement capacity planning and performance optimization strategies

• Maintain and improve system security, compliance, and operational governance

• Analyze system performance data and operational metrics to identify trends and improvement opportunities

• Stay current with emerging trends in SRE practices, automation tools, and AI-enhanced operational capabilities

• Ensure systems are designed for scalability, reliability, and maintainability

• Collaborate with operations teams to integrate reliability practices into existing operational workflows

Required education
Bachelor's degree
Preferred education
Bachelor's degree
Required technical and professional expertise

• 3+ years of experience as a Site Reliability Architect, in DevOps, or in a similar operational role

• Strong problem-solving and analytical troubleshooting skills across complex distributed systems

• Experience with observability platforms (Splunk, Dynatrace, New Relic, Datadog)

• Hands-on experience building automation solutions using scripting languages (Python, Bash, Go, or similar)

• Experience with SRE principles including observability, monitoring, incident management, and reliability practices

• Proficiency with infrastructure as code and configuration management tools (Terraform, Ansible, CloudFormation)

• Experience with containerization technologies (Docker, Kubernetes)

• Knowledge of CI/CD pipelines and deployment automation

• Familiarity with Agile development methodologies

• Basic understanding of AI/ML concepts and interest in leveraging AI tools for operational improvements

• Experience with monitoring and observability platforms (Prometheus, Grafana, ELK stack, or similar)

• Strong understanding of Linux/Unix systems administration

• Experience with cloud platforms (AWS, Azure, GCP) and cloud-native technologies

• Knowledge of networking, security, and system performance optimization

• Experience working with large-scale, distributed systems



Preferred technical and professional experience

• Experience with Ansible, Red Hat OpenShift, and Kubernetes orchestration and management

• Knowledge of incident management platforms and ITSM tools (ServiceNow, PagerDuty, Jira Service Management)

• Experience with database administration and performance tuning

• Familiarity with GitOps practices and tools (ArgoCD, Flux)

• Experience with chaos engineering and reliability testing practices

• Understanding of microservices architecture and service mesh technologies

• Experience with performance testing and capacity planning tools

• Excellent problem-solving and communication skills

• Desire to grow skills and work in a continuous learning environment

• Interest in exploring AI/ML applications for operational use cases

About the Business Unit

IBM Consulting is IBM's global consulting and professional services business unit, with market-leading capabilities in business and technology transformation. With deep expertise across many industries, we offer strategy, experience, technology, and operations services to many of the most innovative and valuable companies in the world. IBMers in Consulting focus on accelerating our clients' businesses through the power of collaboration. We believe in the power of technology used responsibly to help people, partners, and the planet.

YOUR LIFE @ IBM

In a world where technology never stands still, we understand that dedication to our clients' success, innovation that matters, and trust and personal responsibility in all our relationships live in what we do as IBMers as we strive to be the catalyst that makes the world work better.

Being an IBMer means you’ll be able to learn and develop yourself and your career. You’ll be encouraged to be courageous and experiment every day, all while having continuous trust and support in an environment where everyone can thrive, whatever their personal or professional background.

Our IBMers are growth-minded, always staying curious and open to feedback, and learning new information and skills to constantly transform themselves and our company. They are trusted to provide ongoing feedback to help other IBMers grow, and to collaborate with colleagues, keeping in mind a team-focused approach that includes different perspectives to drive exceptional outcomes for our customers. The courage our IBMers have to make critical decisions every day is essential to IBM becoming the catalyst for progress, always embracing challenges with the resources they have to hand, a can-do attitude, and an outcome-focused approach in everything they do.

Are you ready to be an IBMer?

About IBM

IBM’s greatest invention is the IBMer. We believe that through the application of intelligence, reason and science, we can improve business, society and the human condition, bringing the power of an open hybrid cloud and AI strategy to life for our clients and partners around the world.

Restlessly reinventing since 1911, we are not only one of the largest corporate organizations in the world, we’re also one of the biggest technology and consulting employers, with many of the Fortune 500 companies relying on the IBM Cloud to run their business.

At IBM, we pride ourselves on being an early adopter of artificial intelligence, quantum computing and blockchain. Now it’s time for you to join us on our journey to being a responsible technology innovator and a force for good in the world.

IBM is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, genetics, pregnancy, disability, neurodivergence, age, or other characteristics protected by the applicable law. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.

Other relevant job details

Must have the ability to work in Canada without sponsorship. For additional information about location requirements, please discuss with the recruiter following submission of your application.