At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most challenging problems? If so, let's talk!
The IBM Cloud Databases (ICD) team is responsible for developing, running, and operating the Software as a Service (SaaS) offerings that provide Database as a Service (DBaaS) in IBM Cloud.
ICD is seeking a Database Expert who is knowledgeable in database internals, such as clustering, replication, and performance. The expert will work within our site reliability engineering organization to debug, troubleshoot, and resolve complex database clustering, performance, and replication issues. The expert will also work with clients to understand application usage patterns and share their expertise in application design and data model, query, and connection optimization to get the maximum reliability and performance from their database. The expert will also work with our development team to design new capabilities for our service and provide ongoing feedback as we build them. The expert will contribute to runbooks, which serve as guidelines for on-call engineers, as well as to external documentation for customers and the Cloud support team.
Coding and programming skills are also required for this role. This is a good role for someone who has foundational programming skills and would like to hone them by contributing to the core service and to the tooling our team uses to diagnose and respond to operational alerts, as well as to monitoring, metrics, and test cases.
Candidates should have a strong desire to work within a CI/CD environment and a passion for embracing new cloud technologies. You need to be collaborative, able to handle responsibility, and love learning new techniques and tools. You must be an expert in database technology and in other cloud technologies such as Kubernetes.
We have a "You build it, you run it" culture. As a database expert, you will join our follow-the-sun rotation as the primary responder for automated system alerts. You will follow runbooks to resolve these alerts and use your troubleshooting and analytical skills to diagnose platform or Data Service issues.
Demonstrated focused, hands-on experience with at least one of the following: PostgreSQL or MongoDB
Proven experience designing, building, and maintaining complex, mission-critical production database systems
Expertise in:
Configuring and operating highly available database clusters
Debugging and resolving issues related to replication, clustering, and performance
Application performance tuning and database integration
Troubleshooting application-side database issues
Understanding of relational database internals (e.g., locking, consistency, serialization, recovery)
Experience with Kubernetes in production environments
Strong Linux systems engineering background: performance tuning, memory/I/O optimization, networking, and security
Proficient in scripting and procedural coding (e.g., SQL, Python, Bash)
Demonstrated ability to automate operational tasks
Demonstrated experience across the core skills listed above
2+ years of programming in Python, Go, or a similar language
Experience operating large-scale database services
Hands-on familiarity with any of these databases: PostgreSQL, MongoDB, Redis, Elasticsearch, or RabbitMQ
Awareness of emerging technologies and modern architectural approaches in IT