Senior Data Engineer
Salary not disclosed
Senior · Full-time · Employment contract (Umowa o pracę) · B2B
#381865 · Added today
Source: justjoin.it
Tech Stack / Keywords
AI Data Pipelines · PostgreSQL · Databricks
Company and position
Webellian is a well-established Digital Transformation and IT consulting company committed to creating a positive impact for clients across sectors such as insurance, banking, healthcare, retail, and manufacturing. The company focuses on cutting-edge and disruptive technologies and is a community of engineers and senior advisors accelerating client vision and strategy.
Requirements
- 6+ years of professional data engineering experience delivering production data pipelines.
- Expert-level SQL and PostgreSQL skills: advanced query optimization, schema design, indexing, partitioning, MVCC, connection management.
- Strong Databricks experience: Delta Lake, PySpark, Workflows, Unity Catalog, Spark job tuning.
- Proficient in Python for data pipelines: pandas, PySpark, data validation libraries (e.g., Great Expectations).
- Experience with data orchestration frameworks: Apache Airflow, Databricks Workflows, or equivalent.
- Understanding of data integration patterns: CDC with Debezium or equivalent, Kafka-based streaming, batch ingestion.
- Hands-on experience with data lakehouse architecture: medallion architecture, Delta Lake ACID transactions, table optimization.
- Experience implementing data quality frameworks and data contracts.
- Familiarity with Azure data services: Azure Data Factory, Azure Event Hubs, Azure Data Lake Storage or equivalents.
- Proficiency with Claude Code for development, SQL authoring, data exploration, documentation.
- Strong communication skills to collaborate with ML Engineers, analysts, and product teams.
Responsibilities
- Design and build scalable data pipelines for batch and real-time AI workloads.
- Develop and maintain Databricks workflows: Delta Lake table management, PySpark transformations, notebook orchestration, and Unity Catalog governance.
- Architect and optimize PostgreSQL data models including schema design, indexing, partitioning, and query tuning.
- Build and maintain data orchestration workflows using Apache Airflow, Databricks Workflows, or equivalent.
- Implement data quality frameworks including validation rules, anomaly detection, and automated alerting.
- Design and manage feature engineering pipelines integrating with feature stores and versioning.
- Own data integration patterns between operational PostgreSQL databases and the Databricks lakehouse, including CDC and Kafka ingestion.
- Implement data governance standards: lineage tracking, cataloguing, access control, PII handling, retention policies, and audit logging.
- Collaborate with ML Engineers on data pipelines for model training, inference, and real-time feature serving.
- Monitor and operate data infrastructure including observability dashboards and incident response.
- Use Claude Code daily for pipeline development, SQL generation, data exploration, and documentation.
Benefits
- Contract under Polish law: B2B or Umowa o Pracę (employment contract).
- Private medical care.
- Group insurance.
- Multisport card.
- English classes.
- Hybrid work with at least 1 day/week on-site in Warsaw (Mokotów).
- Opportunity to work with excellent professionals in an international team.
- Focus on high code quality and new technologies.
- Continuous learning and growth opportunities.
- On-site amenities including Pinball and PlayStation.