America's Job Portal
KEY RESPONSIBILITES AND SKILL SET
- Design and develop highly scalable, Real time systems using Hadoop ecosystem components(Iceberg, Spark, Ozone, Trino, Hive, Ranger, Kafka, Flink and Nifi
- Build robust data ingestion and transformation frameworks using Java, Spark, Python, and shell scripting for ingesting multi model data(image, audio, video, unstructured documents) with both batch and real-time
- Develop full stack applications and internal engineering tools using Python, shell scripting, and modern web frameworks (e.g., Flask, React)
- learning models using Cloudera Machine Learning (CML).
β’ Perform performance tuning and optimization of data applications on Hadoop to ensure optimal resource utilization.
β’ Experience working with ML platforms such as CML, Spark MLlib, and Python ML libraries (scikit learn, XBoost), including model deployment.
- Design and develop highly scalable, Real time systems using Hado...