10 Jun
13 May
As Senior Data Engineer, you will own one of our most business-critical data assets: the system that links customer identities across our businesses and powers better decisions in marketing, CRM, reporting, and analytics. You will join our Business Intelligence & Data Engineering team and work closely with Data Engineers and Business Analysts to build reliable, scalable, and trustworthy customer identity data.
- Own the end-to-end pipeline that creates the unified customer_uuid across Books & Media and Fashion
- Maintain and evolve our customer identity master data with a strong focus on accuracy, reliability, and production quality
- Improve our probabilistic identity resolution model and make matching decisions measurable, transparent, and explainable
- Build scalable and cost-efficient data pipelines across BigQuery, GCS, and Cloud Run Jobs
- Introduce diagnostics, monitoring, and structured validation for every relevant model change
- Identify and resolve edge cases in customer matching logic before they become production issues
- Work closely with business and technical stakeholders to turn complex matching challenges into robust data solutions
Our Tech Stack
- BigQuery
- SQL
- Python
- Airflow
- Splink
- Google Cloud Storage
- Cloud Run Jobs
- Pub/Sub
Must-Have:
- 5+ years of experience in production data engineering
- Strong experience with BigQuery and advanced SQL in large-scale analytical environments
- Strong Python skills for production-grade data engineering
- Solid Airflow experience and a strong understanding of reliable orchestration patterns
- Hands-on experience with incremental pipelines and idempotent data processing
- Experience with probabilistic record linkage or entity resolution in production
- Strong understanding of data quality, matching logic, and precision/recall trade-offs
- A careful, structured, and ownership-driven way of working
- Strong communication skills and the ability to explain technical decisions clearly
Nice-To-Have:
- Experience with Splink or comparable probabilistic record linkage tools
- Experience with Cloud Run Jobs, GCS, and event-driven patterns in GCP
- Experience with Pub/Sub as a source in data pipelines
- Familiarity with data format trade-offs, such as Parquet versus Avro
- Experience with dbt
- Exposure to downstream BI use cases
- Experience in e-commerce or marketplace environments
- German language skills