Webellian

Senior Data Engineer

No salary information

Senior · Full-time · Umowa o pracę (employment contract) · B2B

#381865 · Added a month ago

Source: justjoin.it

Tech Stack / Keywords

AI Data Pipelines · PostgreSQL · Databricks

Company and position

Webellian is a well-established Digital Transformation and IT consulting company committed to creating a positive impact for clients across sectors such as insurance, banking, healthcare, retail, and manufacturing. The company focuses on cutting-edge and disruptive technologies and is a community of engineers and senior advisors accelerating client vision and strategy.

Requirements

6+ years professional data engineering experience delivering production data pipelines.
Expert-level SQL and PostgreSQL skills: advanced query optimization, schema design, indexing, partitioning, MVCC, connection management.
Strong Databricks experience: Delta Lake, PySpark, Workflows, Unity Catalog, Spark job tuning.
Proficient in Python for data pipelines: pandas, PySpark, data validation libraries (e.g., Great Expectations).
Experience with data orchestration frameworks: Apache Airflow, Databricks Workflows, or equivalent.
Understanding of data integration patterns: CDC with Debezium or equivalent, Kafka-based streaming, batch ingestion.
Hands-on experience with data lakehouse architecture: medallion architecture, Delta Lake ACID transactions, table optimization.
Experience implementing data quality frameworks and data contracts.
Familiarity with Azure data services: Azure Data Factory, Azure Event Hubs, Azure Data Lake Storage or equivalents.
Proficiency with Claude Code for development, SQL authoring, data exploration, documentation.
Strong communication skills to collaborate with ML Engineers, analysts, and product teams.

Responsibilities

Design and build scalable data pipelines for batch and real-time AI workloads.
Develop and maintain Databricks workflows: Delta Lake table management, PySpark transformations, notebook orchestration, and Unity Catalog governance.
Architect and optimize PostgreSQL data models including schema design, indexing, partitioning, and query tuning.
Build and maintain data orchestration workflows using Apache Airflow, Databricks Workflows, or equivalent.
Implement data quality frameworks including validation rules, anomaly detection, and automated alerting.
Design and manage feature engineering pipelines integrating with feature stores and versioning.
Own data integration patterns between operational PostgreSQL databases and the Databricks lakehouse, including CDC and Kafka ingestion.
Implement data governance standards: lineage tracking, cataloguing, access control, PII handling, retention policies, and audit logging.
Collaborate with ML Engineers on data pipelines for model training, inference, and real-time feature serving.
Monitor and operate data infrastructure including observability dashboards and incident response.
Use Claude Code daily for pipeline development, SQL generation, data exploration, and documentation.

Benefits

Contract under Polish law: B2B or Umowa o Pracę (employment contract).
Private medical care.
Group insurance.
Multisport card.
English classes.
Hybrid work with at least 1 day/week on-site in Warsaw (Mokotów).
Opportunity to work with excellent professionals and an international team.
Focus on high code quality and new technologies.
Continuous learning and growth opportunities.
On-site amenities including pinball and PlayStation.
