A career in IBM Consulting is rooted in long-term relationships and close collaboration with clients across the globe.
You'll work with visionaries across multiple industries to improve the hybrid cloud and AI journey for the most innovative and valuable companies in the world. Your ability to accelerate impact and make meaningful change for your clients is enabled by our strategic partner ecosystem and our robust technology platforms across the IBM portfolio, including Software and Red Hat.
Curiosity and a constant quest for knowledge serve as the foundation for success in IBM Consulting. In your role, you'll be encouraged to challenge the norm, investigate ideas outside of your role, and come up with creative solutions resulting in groundbreaking impact for a wide network of clients. Our culture of evolution and empathy centers on long-term career growth and development opportunities in an environment that embraces your unique skills and experience.
Role Overview:
We are hiring a Talend Data Quality Developer to design and implement robust data quality (DQ) frameworks in a Cloudera-based data lakehouse environment. The role focuses on building rule-driven validation and monitoring processes for migrated data pipelines, ensuring high levels of data trust and regulatory compliance across critical banking domains.
Key Responsibilities:
- Design and implement data quality rules using Talend DQ Studio, tailored to validate customer, account, transaction, and KYC datasets within the Cloudera Lakehouse.
- Create reusable templates for profiling, validation, standardization, and exception handling.
- Integrate DQ checks within PySpark-based ingestion and transformation pipelines targeting Apache Iceberg tables.
- Ensure compatibility with Cloudera components (HDFS, Hive, Iceberg, Ranger, Atlas) and job orchestration frameworks (Airflow/Oozie).
- Perform initial and ongoing data profiling on source and target systems to detect data anomalies and drive rule definitions.
- Monitor and report DQ metrics through dashboards and exception reports.
- Work closely with data governance, architecture, and business teams to align DQ rules with enterprise definitions and regulatory requirements.
- Support lineage and metadata integration with tools like Apache Atlas or external catalogs.
Qualifications:
· Experience: 5–10 years in data management, with 3+ years in Talend Data Quality tools.
· Platforms: Experience with Cloudera Data Platform (CDP), with an understanding of the Iceberg, Hive, HDFS, and Spark ecosystems.
· Languages/Tools: Talend Studio (DQ module), SQL, Python (preferred), Bash scripting.
· Data Concepts: Strong grasp of data quality dimensions: completeness, consistency, accuracy, timeliness, and uniqueness.
· Banking Exposure: Experience with financial services data (CIF, AML, KYC, product masters) is highly preferred.