Design, develop, and maintain complex data flows within Cloudera DataFlow (Apache NiFi), ensuring scalable, reliable, and high-performance data movement across systems.
Develop and optimize real-time and near-real-time data pipelines leveraging NiFi, Kafka, and CDC technologies (e.g., Debezium, SQL-based connectors).
Implement integrations with internal and external systems using REST APIs, JDBC, Kafka, and other communication protocols, ensuring secure and resilient data exchange.
Design and manage data schemas (Avro), metadata, and lineage using Apache Atlas, ensuring full traceability and governance of data flows.
Define and enforce data security and access control policies using Apache Ranger in alignment with enterprise governance frameworks.
Monitor, troubleshoot, and optimize data pipelines for performance, reliability, and scalability, including proactive alerting and issue resolution.
Collaborate ...