A career in IBM Consulting is rooted in long-term relationships and close collaboration with clients across the globe.
You'll work with visionaries across multiple industries to improve the hybrid cloud and AI journey for the most innovative and valuable companies in the world. Your ability to accelerate impact and make meaningful change for your clients is enabled by our strategic partner ecosystem and our robust technology platforms across the IBM portfolio, including Software and Red Hat.
Curiosity and a constant quest for knowledge serve as the foundation of success in IBM Consulting. In your role, you'll be encouraged to challenge the norm, investigate ideas outside of your role, and come up with creative solutions that deliver groundbreaking impact for a wide network of clients. Our culture of evolution and empathy centers on long-term career growth and development opportunities in an environment that embraces your unique skills and experience.
Who you are: A highly skilled Data Engineer specializing in data modeling, with experience designing, implementing, and optimizing data structures that support the storage, retrieval, and processing of data in large-scale enterprise environments. You have expertise in conceptual, logical, and physical data modeling, along with a deep understanding of ETL processes, data lake architectures, and modern data platforms.
You are proficient in ERwin, PostgreSQL, Apache Iceberg, Cloudera Data Platform, and Denodo. Your ability to work with cross-functional teams, data architects, and business stakeholders ensures that data models align with enterprise data strategies and effectively support analytical use cases.
What you’ll do: As a Data Engineer – Data Modeling, you will be responsible for:
Data Modeling & Architecture
• Designing and developing conceptual, logical, and physical data models to support data migration from IIAS to Cloudera Data Lake.
• Creating and optimizing data models for structured, semi-structured, and unstructured data stored in Apache Iceberg tables on Cloudera.
• Establishing data lineage and metadata management for the new data platform.
• Implementing Denodo-based data virtualization models to ensure seamless data access across multiple sources.
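As a hedged illustration of the physical-modeling work above, the sketch below derives PostgreSQL-style DDL from a small logical model. The entity and column names are hypothetical examples, not taken from this role description.

```python
# Illustrative sketch only: generating physical PostgreSQL DDL from a
# simple logical model. Entity and attribute names are hypothetical.
from dataclasses import dataclass


@dataclass
class Attribute:
    name: str
    pg_type: str
    nullable: bool = True


@dataclass
class Entity:
    name: str
    attributes: list
    primary_key: str = "id"

    def to_ddl(self) -> str:
        # Surrogate key first, then each attribute with its nullability.
        cols = [f"    {self.primary_key} BIGSERIAL PRIMARY KEY"]
        for a in self.attributes:
            null_sql = "" if a.nullable else " NOT NULL"
            cols.append(f"    {a.name} {a.pg_type}{null_sql}")
        body = ",\n".join(cols)
        return f"CREATE TABLE {self.name} (\n{body}\n);"


# Hypothetical banking entity used purely for illustration.
account = Entity(
    name="account",
    attributes=[
        Attribute("account_number", "VARCHAR(34)", nullable=False),
        Attribute("opened_on", "DATE"),
    ],
)
print(account.to_ddl())
```

In practice a tool such as ERwin would forward-engineer this DDL from the logical model; the point here is only the conceptual-to-physical mapping.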
Data Governance & Quality
• Ensuring data integrity, consistency, and compliance with regulatory standards, including banking industry guidelines.
• Implementing Talend Data Quality (DQ) solutions to maintain high data accuracy.
• Defining and enforcing naming conventions, data definitions, and business rules for structured and semi-structured data.
ETL & Data Pipeline Optimization
• Supporting the migration of ETL workflows from IBM DataStage to PySpark, ensuring models align with the new ingestion framework.
• Collaborating with data engineers to define schema evolution strategies for Iceberg tables.
• Ensuring performance optimization for large-scale data processing on Cloudera.
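One concrete piece of the schema-evolution work mentioned above can be sketched as plain Spark SQL DDL against Iceberg tables (Iceberg supports adding and renaming columns in place, without rewriting existing data files). The table and column names below are hypothetical, and in a real pipeline these statements would be issued via `spark.sql()`.

```python
# Hypothetical sketch: composing Iceberg schema-evolution DDL of the kind
# run through spark.sql() on Cloudera. All names are illustrative.

def add_column(table: str, column: str, col_type: str) -> str:
    # Iceberg adds columns as a metadata-only change.
    return f"ALTER TABLE {table} ADD COLUMN {column} {col_type}"


def rename_column(table: str, old: str, new: str) -> str:
    # Renames are safe because Iceberg tracks columns by field ID.
    return f"ALTER TABLE {table} RENAME COLUMN {old} TO {new}"


# Example: evolving a hypothetical transactions table after migration.
stmts = [
    add_column("lake.transactions", "risk_score", "DOUBLE"),
    rename_column("lake.transactions", "txn_dt", "transaction_date"),
]
for s in stmts:
    print(s)
```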
Collaboration & Documentation
• Working closely with business analysts, architects, and developers to translate business requirements into scalable data models.
• Documenting data dictionaries, entity relationships, and mapping specifications for data migration.
• Supporting reporting and analytics teams (Qlik Sense/Tableau) by providing well-structured data models.
Required technical and professional expertise:
• 4-7 years of experience in data modeling, database design, and data engineering.
• Hands-on experience with ERwin Data Modeler for creating and managing data models.
• Strong knowledge of relational databases (PostgreSQL) and big data platforms (Cloudera, Apache Iceberg).
• Proficiency in SQL and NoSQL database concepts.
• Understanding of data governance, metadata management, and data security principles.
• Familiarity with ETL processes and data pipeline optimization.
• Strong analytical, problem-solving, and documentation skills.
Preferred technical and professional experience:
• Experience working on Cloudera migration projects.
• Exposure to Denodo for data virtualization and Talend DQ for data quality management.
• Knowledge of Kafka, Airflow, and PySpark for data processing.
• Familiarity with GitLab, Sonatype Nexus, and CheckMarx for CI/CD and security compliance.
• Certifications in Data Modeling, Cloudera Data Engineering, or IBM Data Solutions.