· 5+ years of experience in data engineering
· 2+ years of experience running Spark Streaming in production
· Deep understanding of Apache Iceberg internals, lakehouse patterns, and large-scale analytics
· Proven experience implementing slowly changing dimensions (SCDs) in distributed systems
· Strong Python and SQL skills; comfortable reading JVM-level Spark behavior
· Experience running Spark on Kubernetes
· Operational mindset: monitoring, alerting, and incident response
· Zero tolerance for fragile pipelines or undocumented logic
Nice to have
· Change data capture (CDC) systems (Debezium, Kafka-based ingestion)
· Performance tuning under cost and latency constraints
· On-prem or hybrid data platforms
This role is not for dashboard builders or pure analysts.
You will own core data flows in production.