A career in IBM Software means you’ll be part of a team that transforms our customers’ challenges into solutions.
Seeking new possibilities and always staying curious, we are a team dedicated to creating the world’s leading AI-powered, cloud-native software solutions for our customers. Our renowned legacy creates endless global opportunities for our IBMers, so the door is always open for those who want to grow their career.
IBM’s product and technology landscape includes Research, Software, and Infrastructure. Entering this domain positions you at the heart of IBM, where growth and innovation thrive.
Your primary responsibilities include:
Infrastructure & Cloud Management:
• Design, build, and manage scalable cloud infrastructure using IBM Cloud, AWS, GCP, and Azure.
• Implement Infrastructure as Code using Terraform.
• Deploy and configure applications using container orchestration platforms like Kubernetes/OpenShift.
Automation & CI/CD:
System Monitoring & Reliability:
• Monitor health and performance of production systems (24x7 observability).
• Use tools like Instana, Grafana/Prometheus, and New Relic to build alerts and dashboards.
• Troubleshoot and resolve production issues in collaboration with engineering and support teams.
Security & Compliance:
• Perform regular patching and upgrades, and collaborate with product support to resolve issues.
Database & Middleware:
• Manage open-source middleware and databases such as PostgreSQL, CouchDB, Redis, Kafka, and Spark.
• Participate in incident response and on-call rotations.
Required Technical and Professional Expertise:
- Strong working knowledge of Kubernetes and cloud infrastructures, with a preference for AWS.
- Proven experience providing on-call support for critical production systems, with a focus on root cause analysis (RCA).
- Proficiency in scripting languages like Python and related tools.
- Strong problem-solving skills and attention to detail.
- Expertise in automation platforms such as AWX.
- Familiarity with Salesforce infrastructure and case management processes.
- Experience with monitoring tools and incident management platforms.
- Ability to work efficiently in a global, distributed team environment.