momox

Senior Data Engineer (m/f/d) – Customer Identity Resolution

Data Engineer, Berlin
Full-time

As Senior Data Engineer, you will own one of our most business-critical data assets: the system that links customer identities across our businesses and powers better decisions in marketing, CRM, reporting, and analytics. You will join our Business Intelligence & Data Engineering team and work closely with Data Engineers and Business Analysts to build reliable, scalable, and trustworthy customer identity data.

  • Own the end-to-end pipeline that creates the unified customer_uuid across Books & Media and Fashion
  • Maintain and evolve our customer identity master data with a strong focus on accuracy, reliability, and production quality
  • Improve our probabilistic identity resolution model and make matching decisions measurable, transparent, and explainable
  • Build scalable and cost-efficient data pipelines across BigQuery, GCS, and Cloud Run Jobs
  • Introduce diagnostics, monitoring, and structured validation for every relevant model change
  • Identify and resolve edge cases in customer matching logic before they become production issues
  • Work closely with business and technical stakeholders to turn complex matching challenges into robust data solutions

Our Tech Stack

  • BigQuery
  • SQL
  • Python
  • Airflow
  • Splink
  • Google Cloud Storage
  • Cloud Run Jobs
  • Pub/Sub

Must-Have:

  • 5+ years of experience in production data engineering
  • Strong experience with BigQuery and advanced SQL in large-scale analytical environments
  • Strong Python skills for production-grade data engineering
  • Solid Airflow experience and a strong understanding of reliable orchestration patterns
  • Hands-on experience with incremental pipelines and idempotent data processing
  • Experience with probabilistic record linkage or entity resolution in production
  • Strong understanding of data quality, matching logic, and precision/recall trade-offs
  • A careful, structured, and ownership-driven way of working
  • Strong communication skills and the ability to explain technical decisions clearly

Nice-To-Have:

  • Experience with Splink or similar probabilistic record linkage tools
  • Experience with Cloud Run Jobs, GCS, and event-driven patterns in GCP
  • Experience with Pub/Sub as a source in data pipelines
  • Familiarity with the trade-offs between data formats such as Parquet and Avro
  • Experience with dbt
  • Exposure to downstream BI use cases
  • Experience in e-commerce or marketplace environments
  • German language skills
