A career in IBM Consulting is rooted in long-term relationships and close collaboration with clients across the globe.
You'll work with visionaries across multiple industries to improve the hybrid cloud and AI journey for the most innovative and valuable companies in the world. Your ability to accelerate impact and make meaningful change for your clients is enabled by our strategic partner ecosystem and our robust technology platforms across the IBM portfolio, including Software and Red Hat.
Curiosity and a constant quest for knowledge serve as the foundation of success in IBM Consulting. In your role, you'll be encouraged to challenge the norm, investigate ideas outside of your role, and come up with creative solutions resulting in groundbreaking impact for a wide network of clients. Our culture of evolution and empathy centers on long-term career growth and development opportunities in an environment that embraces your unique skills and experience.
Who you are: A seasoned Senior Data Engineer with deep expertise in modernizing data platforms, ETL migration, and large-scale data processing. You have experience working with IBM DataStage, PySpark, Cloudera Data Platform, and related big-data technologies. You are also proficient in architecting data platform migrations and transformations, and in designing, implementing, and optimizing data pipelines and analytical solutions while ensuring high performance, scalability, and security.
You possess strong problem-solving skills, stakeholder management abilities, and a keen understanding of enterprise data architecture, with solid experience in banking data models, along with the ability to collaborate with cross-functional teams and lead complex migration initiatives.
What you'll do: As a Data Engineer – Data Platform Services, you will be responsible for:
Data Migration & Modernization
• Leading the migration of ETL workflows from IBM DataStage to PySpark, ensuring performance optimization and cost efficiency.
• Designing and implementing data ingestion frameworks using Kafka and PySpark, replacing legacy DataStage ETL pipelines.
• Migrating the analytical platform from IBM Integrated Analytics System (IIAS) to Cloudera Data Lake on CDP.
Data Engineering & Pipeline Development
• Developing and maintaining scalable, fault-tolerant, and optimized data pipelines on Cloudera Data Platform.
• Implementing data transformations, enrichment, and quality checks to ensure accuracy and reliability.
• Leveraging Denodo for data virtualization and enabling seamless access to distributed datasets.
Performance Tuning & Optimization
• Optimizing PySpark jobs for efficiency, scalability, and reduced cost on Cloudera.
• Fine-tuning query performance on Iceberg tables and ensuring efficient data storage and retrieval.
• Collaborating with Cloudera ML engineers to integrate machine learning workloads into data pipelines.
Security & Compliance
• Implementing Thales CipherTrust encryption and tokenization mechanisms for secure data processing.
• Ensuring compliance with Bank/regulatory body security guidelines, data governance policies, and best practices.
Collaboration & Leadership
• Working closely with business stakeholders, architects, and data scientists to align solutions with business goals.
• Leading and mentoring junior data engineers, conducting code reviews, and promoting best practices.
• Collaborating with DevOps teams to streamline CI/CD pipelines, using GitLab and Nexus Repository for efficient deployments.
Required Technical and Professional Expertise
• 12+ years of experience in Data Engineering, ETL, and Data Platform Modernization.
• Hands-on experience in IBM DataStage and PySpark, with a track record of migrating legacy ETL workloads.
• Expertise in Apache Iceberg, Cloudera Data Platform, and Big-data processing frameworks.
• Strong knowledge of Kafka, Airflow, and cloud-native data processing solutions.
• Experience with Denodo for data virtualization and Talend DQ for data quality.
• Proficiency in SQL, NoSQL, and graph databases (Dgraph Enterprise).
• Strong understanding of data security, encryption, and compliance standards (Thales CipherTrust).
• Experience with DevOps, CI/CD pipelines, GitLab, and Sonatype Nexus Repository.
• Excellent problem-solving, analytical, and communication skills.
Preferred Technical and Professional Expertise
• Experience with Cloudera migration projects in banking or financial domains.
• Experience working with banking data models.
• Knowledge of Cloudera ML, Qlik Sense/Tableau reporting, and integration with data lakes.
• Hands-on experience with QuerySurge for automated data testing.
• Understanding of code quality and security best practices using CheckMarx.
• IBM, Cloudera, or AWS/GCP certifications in Data Engineering, Cloud, or Security.
• “Meghdoot” Cloud platform knowledge.
• Architectural design experience, with the ability to recommend the best possible solutions.