Are you passionate about building scalable data solutions and working with cutting-edge technologies? We're looking for a Data Engineer with deep expertise in Databricks to help us design and optimize modern data processing frameworks.
In this role, you'll work in one of our IBM Consulting Client Innovation Centers (Delivery Centers), where we deliver deep technical and industry expertise to a wide range of public and private sector clients around the world. Our delivery centers offer our clients locally based skills and technical expertise to drive innovation and adoption of new technology.
- Design and build streaming & batch data pipelines using Apache Spark on Databricks (see the sketch after this list)
- Optimize performance with Delta Lake, Z-ordering, and liquid clustering
- Work hands-on with PySpark and SQL, and help develop our modern Data Lakehouse architecture
- Leverage current Databricks capabilities: Structured Streaming, Lakeflow, Unity Catalog, and Databricks SQL (DBSQL)
- Support AI/ML workflows and implement CI/CD pipelines with tools like GitHub Actions or Azure DevOps
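
To give a flavor of the pipeline and optimization work described above, here is a minimal sketch of a Structured Streaming ingestion job that writes to a Delta table and then Z-orders it. All paths, table names, and the schema are illustrative assumptions rather than details of our codebase, and the sketch assumes a Databricks runtime with Auto Loader and Delta Lake available.

```python
# Minimal, hypothetical sketch: ingest JSON events from cloud storage into a
# Delta table with Auto Loader, then Z-order it. Paths, table names, and the
# schema below are illustrative placeholders only.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.getOrCreate()  # on Databricks, `spark` is already provided

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_type", StringType()),
    StructField("event_ts", TimestampType()),
])

# Incrementally pick up newly arrived files and stamp them with an ingest time.
events = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .schema(event_schema)
    .load("/mnt/raw/events/")                      # hypothetical landing path
    .withColumn("ingest_ts", F.current_timestamp())
)

# Write the stream to a Delta table as an incremental batch, then stop.
query = (
    events.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/events/")  # hypothetical
    .trigger(availableNow=True)
    .toTable("bronze.events")                      # hypothetical table name
)
query.awaitTermination()

# Compact small files and co-locate data on a commonly filtered column.
spark.sql("OPTIMIZE bronze.events ZORDER BY (event_type)")
```

The `availableNow` trigger runs the stream as an incremental batch and then stops, a common pattern for scheduled ingestion jobs.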
- 3+ years in data/software engineering with large-scale distributed systems
- Strong Python skills (OOP, pytest/unittest) and hands-on experience with PySpark, SQLGlot, and Pydantic
- Solid understanding of the medallion architecture, data ingestion patterns, and big-data performance optimization (see the sketch after this list)
- Familiarity with Git workflows and platforms like GitHub, GitLab, or Azure DevOps
- Experience with cloud platforms (Azure preferred) and Databricks
- Certifications in Databricks or Azure Data Engineering
- Experience with real-time data processing (e.g., Kafka, Spark Structured Streaming)
- Exposure to AI/ML workflows and tools like MLflow
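
For the medallion-architecture point above, a minimal bronze-to-silver refinement step might look like the sketch below. Table, column, and layer names are hypothetical and used purely for illustration; it assumes Delta tables on a Databricks runtime.

```python
# Hypothetical bronze -> silver step in a medallion architecture: deduplicate,
# apply basic quality rules, derive a partition column, and publish as Delta.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # on Databricks, `spark` is already provided

bronze = spark.read.table("bronze.events")       # hypothetical bronze table

silver = (
    bronze.dropDuplicates(["event_id"])
    .filter(F.col("event_ts").isNotNull())
    .withColumn("event_date", F.to_date("event_ts"))
)

(
    silver.write.format("delta")
    .mode("overwrite")
    .partitionBy("event_date")
    .saveAsTable("silver.events")                # hypothetical silver table
)
```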