As a database production engineer, you will build, operate, and maintain distributed data systems to help leading enterprises manage their complex data needs. You will work on automation, monitoring, alerting, enhancements, and bug fixes to ensure an amazing experience for our developers and enterprises.
What you will do:
- Ensure reliability, scalability, security and maintainability of the systems you own
- Respond to customer escalations and automated alerts, from initial triage through to resolution
- Participate in blameless post-mortem analyses to make sure we learn from our mistakes
- Perform manual operational tasks (toil)
- Develop automations to reduce toil
- Improve monitoring and alerting to reduce the time to detection of incidents
- Work with the following technologies:
  - Kubernetes, Helm, ArgoCD, Terraform
  - NoSQL databases (Cassandra)
  - Java, Python, Go
  - AWS, GCP, Azure
  - Prometheus, Grafana, and the Splunk ecosystem
Your experience should include:
- 7 to 9 years of relevant industry experience in software engineering
- Practical experience in at least one programming language (e.g. Java, Python)
- Strong analytical thinking, especially when triaging unfamiliar issues
- Ability to express your thoughts in an easy-to-understand written form
- Ability to learn and adapt quickly
- Familiarity with software engineering practices (version control, refactoring, automated testing, CI/CD, observability)
- Familiarity with distributed systems design fundamentals and software architecture
- Familiarity with computer science and operating systems fundamentals (e.g. program execution, memory management, networking)
- Bonus points for database fundamentals (more bonus points for Cassandra)
- Bonus points for experience with Linux containers and container orchestration (e.g. Kubernetes)