Cloudera Hadoop Engineer (Big Data Engineer)

Location: Delhi NCR, India

Experience: 4–7 years

Job Type: Full-Time / Contract

Education:

  • UG: B.Tech/B.E. in Computer Science, Engineering, Data Engineering, or a related field
  • PG: Any Postgraduate (Preferred)

Job Description

Project Role Description: We are looking for a Cloudera Hadoop Engineer with strong experience in the Hadoop ecosystem and Cloudera Data Platform (CDP) to build and manage large-scale data processing pipelines. The candidate will work closely with data architects, administrators, and analytics teams to design, develop, and optimize big data workflows and data ingestion pipelines on distributed clusters. The role requires expertise in Hadoop ecosystem technologies such as Hive, Impala, Spark, Airflow, and HDFS, along with experience working on Cloudera-based big data environments.

Key Responsibilities:

Big Data Development

  • Design and develop data pipelines and ETL workflows using Hadoop ecosystem tools.
  • Build and optimize large-scale data processing jobs on distributed clusters.
  • Process structured and unstructured datasets using Hive, Spark, and Impala.

Cloudera Platform Development

  • Develop and manage workloads on Cloudera Data Platform (CDP).
  • Work with services such as Hive, Impala, HDFS, Airflow, and Hue.
  • Optimize queries and workloads running on the Cloudera ecosystem.

Data Ingestion & Processing

  • Design pipelines for data ingestion from multiple sources including databases, APIs, and files.
  • Implement batch and scheduled data workflows using Airflow or similar orchestration tools.
  • Ensure data quality, transformation, and efficient storage in distributed systems.

Performance Optimization

  • Tune Hive and Impala queries for performance and scalability.
  • Optimize data partitioning, indexing, and storage formats for big data processing.
  • Monitor and improve performance of data pipelines.

Collaboration & Integration

  • Work with data scientists, analysts, and business teams to understand data requirements.
  • Integrate big data solutions with analytics platforms and reporting systems.
  • Collaborate with Cloudera administrators to ensure cluster efficiency.

Documentation & Best Practices

  • Maintain documentation for data pipelines, architecture, and workflows.
  • Follow best practices for data governance, security, and performance optimization.

Qualifications:

  • Bachelor's degree in Computer Science, Engineering, Data Engineering, or related field.
  • 4–7 years of experience in Big Data / Hadoop development.

Required Skills:

  • Strong experience with Hadoop ecosystem tools.
  • Hands-on experience with Hive, Impala, HDFS, and Spark.
  • Experience working with Cloudera Data Platform (CDP).
  • Experience building ETL pipelines and data workflows.
  • Knowledge of SQL and big data query optimization.
  • Experience with workflow orchestration tools such as Airflow.
  • Good understanding of distributed data processing concepts.

Preferred Skills:

  • Experience with Python, Scala, or Java for big data development.
  • Knowledge of data lake architecture and big data design patterns.
  • Familiarity with data governance tools like Ranger and Atlas.
  • Experience integrating big data platforms with analytics and BI tools.

Why Choose Us

We're the Best in the Data Industry, with 10 Years of Experience

We’re leaders in the data industry with over 10 years of experience, delivering innovative data solutions that drive business transformation. Our expertise in data pipeline creation has empowered various clients across industries to harness the full potential of their data. For a global fintech firm, we built real-time data pipelines enabling instant fraud detection and risk monitoring. For a leading retail company, we developed scalable pipelines for real-time sales and inventory tracking. Additionally, for a healthcare provider, we created pipelines for secure, real-time patient data processing, improving care and compliance.

  • Real-Time Data Ingestion
  • Batch Data Ingestion
  • Event Handling on Moving Data

21 Happy Clients

84 Projects Completed