The Platform and Reliability Engineering (PRE) team at Apptio is responsible for enhancing and maintaining our Internal Developer Platform (IDP) and driving the adoption of platform best practices across our engineering teams. We are a distributed team working across three locations: the United States, Poland, and Australia.
In this role, you will be part of the team that builds and supports the Internal Developer Platform (IDP) where all Apptio applications are deployed. On a typical day you will interact with GitHub, Linux, Kubernetes, ArgoCD, Docker, Confluence, Jira, Slack, and AWS.
You Are
Passionate about problem solving, with experience developing platform features designed to improve the developer experience here at Apptio + IBM. Your team can count on you to solve challenging problems across the entire Apptio portfolio. You collaborate with other Platform Engineers, developers, and support teams to deliver value to the broader organization. You take responsibility for fixing problems in an automated, code-first way and are happy to step outside your comfort zone to develop your skill set.
You Aren’t
A Kubernetes or cloud expert with many years of experience. This is an intermediate position; we want you to help us, and we also want to help you grow.
Responsibilities
- Develop self-service features and services specifically designed to improve developer velocity
- Manage deployments of Apptio services via ArgoCD
- Streamline the CI process via GitHub Actions and create reusable templates for our developers to consume
- Improve observability of the services within your purview by reviewing KPI dashboards and alerting
- Author and maintain documentation of deployment and monitoring processes
- Use runbooks to troubleshoot and triage production issues
- Detect issues and handle Tier 1-2 troubleshooting
- Participate in online “swarm” collaboration sessions
- Collaborate with Apptio product developers
- Participate in on-call rotation
- Perform maintenance of the platform (patching, resets, upgrades, etc.)
Required Qualifications
- 1-2 years’ experience in a Platform Development, DevOps, SRE, or adjacent role
- Foundational understanding of source control and at least one programming language (preferably Golang)
- Experience with distributed application deployment and management
- Experience with container technologies (e.g., Kubernetes, Docker)
- Experience with Infrastructure-as-code (IaC) concepts
- Experience with cloud provider services such as AWS, Azure, or Google Cloud Platform
- Familiarity with RESTful systems and their APIs
- Desire to work with a remote team
- Fluent English language skills
Preferred Qualifications
- Experience with an Internal Developer Platform
- Experience with CNCF products such as Cilium and Karpenter, and a good knowledge of the CNCF landscape
- Experience with the HashiCorp product suite (Vault, Terraform, Consul, etc.)