Tech Stack / Keywords
Python, ETL, SQL, Snowflake, Apache Spark, PySpark, Spark, Cloud
Company and position
At Pretius, we are looking for a Data Engineer to join a project for a global-scale platform in the field of gaming and lotteries.
Requirements
- 8+ years of experience in data engineering, analytics engineering, or similar data-focused roles
- Expert-level proficiency in Python for data processing, pipeline development, and automation
- Advanced SQL skills, including query optimization and complex analytical transformations
- Strong experience with relational and analytical databases (e.g., PostgreSQL, Snowflake, BigQuery, Redshift, Synapse)
- Hands-on experience designing and implementing data warehouse architectures (ETL/ELT, batch, near-real-time)
- Proven experience with big data processing frameworks such as Apache Spark (PySpark, Spark SQL)
- Strong cloud experience across AWS, Azure, and/or GCP, including core data services
- Experience building and operating scalable data pipelines using orchestration tools (Airflow, ADF, Prefect, Dagster)
- Understanding of distributed systems principles and large-scale data processing challenges
- Strong knowledge of data quality, governance, security, and compliance best practices
- Experience with DevOps practices, including CI/CD, Git, and Infrastructure as Code (Terraform or equivalent)
- Ability to design scalable, production-grade data solutions in complex enterprise environments
Nice to have:
- Familiarity with streaming technologies (Kafka, Kinesis, Pub/Sub)
- Experience with dbt and BI tools (Power BI, Tableau, Looker)
Responsibilities
- Design, build, and maintain scalable, production-grade data pipelines using Python (ETL/ELT) and orchestration tools
- Write and optimize advanced SQL queries for efficient data extraction, transformation, and performance tuning
- Design and implement scalable data models (star/snowflake schema) for analytics and reporting
- Build and maintain end-to-end data warehouse solutions, including batch and near-real-time ingestion, data marts, and semantic layers
- Work with Apache Spark (PySpark, Spark SQL) for large-scale data processing and analytics
- Develop and operate cloud data solutions across AWS, Azure, and/or GCP (e.g., S3, Glue, EMR, Redshift, ADLS, Data Factory, Synapse, BigQuery)
- Design scalable, secure, and cost-efficient data architectures with FinOps awareness
- Build and maintain reliable data pipelines using orchestration tools (Airflow, ADF, Prefect, Dagster) with proper scheduling, retries, and monitoring
- Ensure data reliability through validation, monitoring, idempotent design, and failure recovery mechanisms
- Develop streaming and real-time data pipelines using Kafka, Kinesis, Pub/Sub, or Event Hubs where required
- Implement data quality, governance, and security standards (PII protection, encryption, RBAC, data lineage)
- Apply DevOps practices including Git, CI/CD, Infrastructure as Code, and production monitoring
- Integrate external APIs and SaaS data sources into data platforms
Offer
- Co-financing of the Multisport card and Medicover private healthcare
- Modern office available
- Team bonding activities, internal courses, conferences, certifications