At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most challenging problems? If so, let's talk.
In this role, you'll work in one of our IBM Consulting Client Innovation Centers (Delivery Centers), where we deliver deep technical and industry expertise to a wide range of public and private sector clients around the world. Our delivery centers offer our clients locally based skills and technical expertise to drive innovation and adoption of new technology.
- Developing and implementing data processing workflows using Apache Spark and Scala
- Collaborating with cross-functional teams to identify and prioritize data requirements
- Designing and implementing data transformations, data quality checks, and data validation
- Ensuring data integrity, security, and compliance with industry standards
- Troubleshooting and optimizing data processing systems
* Advanced proficiency in the Scala programming language and the Apache Spark ecosystem (including Spark Core, Spark SQL, and Spark Streaming)
* Proficiency in Python (version 3.9 or higher) and Airflow
* Strong proficiency in data storage solutions (e.g., HDFS, S3, Cassandra)
* Experience with data processing frameworks (e.g., Apache Beam, Apache Flink)
* Strong analytical and problem-solving skills with at least 3 years of experience in troubleshooting and optimizing data processing systems
* Experience with containerization (e.g., Docker) and orchestration (e.g., Kubernetes)
- Experience with cloud-based big data platforms (e.g., AWS EMR, Google Cloud Dataproc)
- Proficiency in data governance and data quality tooling (e.g., for data validation and data cleansing)
- Familiarity with agile development methodologies and version control systems (e.g., Git)
- Advanced experience with data security and compliance frameworks (e.g., GDPR, HIPAA) and data encryption techniques (e.g., SSL/TLS)