Data Engineer (m/f/d)
No salary information
Senior · Full-time
#368931 · Added yesterday
Source: InPost
Tech Stack / Keywords
Apache Spark, SparkSQL, Databricks, PySpark, Scala, Apache Kafka, Kafka
Company and Position
InPost is a leading European e-commerce enablement platform specializing in parcel delivery through a network of Automated Parcel Machines (APMs) and pick-up drop-off points across nine countries. Founded in 1999, it offers delivery and fulfilment services to e-commerce merchants, focusing on flexible, convenient, and environmentally friendly delivery options.
Requirements
- Minimum 3 years of experience in Data Engineering or a similar role.
- Hands-on experience with Apache Spark (Streaming, Spark SQL, MLlib) and Databricks (PySpark, Scala).
- Practical experience with Apache Kafka including Kafka Streams and Kafka Connect.
- Proficiency in Python; working knowledge of Scala or Java.
- Experience designing and operating SQL databases (PostgreSQL, BigQuery, Spark SQL) and NoSQL databases (MongoDB, Cassandra, or similar).
- Experience building and maintaining data lake environments (Delta Lake, Parquet, or equivalent).
- Familiarity with cloud platforms (GCP, Azure, AWS) and managed data services.
- Experience integrating data via REST and/or SOAP APIs.
- Working knowledge of CI/CD tools (GitLab CI, Jenkins, or equivalent) and software engineering practices.
- Experience building and running Docker containers.
- Willingness to share knowledge and contribute to engineering best practices.
- Professional working proficiency in English and Polish.
Nice to Have:
- Experience in international, multi-market environments.
- Exposure to ML pipeline engineering or feature store design.
- Familiarity with data orchestration tools (Apache Airflow, Prefect, Databricks Workflows).
- Experience with Infrastructure as Code (Terraform, Ansible).
- Contributions to open-source data engineering projects.
Responsibilities
Data Platform & Lake Engineering:
- Design, build, and maintain scalable data lake solutions and processing pipelines for structured and semi-structured data.
- Work with batch and streaming architectures, balancing latency, cost, and reliability.
Streaming Solutions:
- Build and operate real-time data streaming pipelines using Apache Kafka and its ecosystem.
- Design event-driven architectures for operational monitoring and near-real-time ML feature generation.
ETL/ELT Design and Maintenance:
- Architect and maintain ETL/ELT pipelines focusing on data quality, idempotency, and observability.
- Collaborate with data consumers to translate requirements into durable pipeline designs.
Spark and Databricks Development:
- Develop distributed data processing applications using Apache Spark (PySpark, Scala) on Databricks.
- Apply Spark best practices to ensure efficient job execution.
Database Engineering:
- Design and manage SQL and NoSQL databases including schema design and query optimization.
Cloud-Native Solutions:
- Build data solutions on cloud platforms (GCP, Azure, AWS) using managed services.
- Contribute to cloud architecture decisions within the squad.
CI/CD and Engineering Excellence:
- Apply software engineering best practices to data pipelines including version control, automated testing, code review, and CI/CD.
Performance Monitoring and Optimisation:
- Own the operational health of data infrastructure and ETL processes: set up monitoring, respond to incidents, and optimize performance.
API and System Integration:
- Integrate data from internal and external sources via REST and SOAP APIs with reliable ingestion and error handling.
Knowledge Sharing and Community:
- Contribute to the data engineering community through code reviews, documentation, tech talks, and mentoring.
Offer
- Option to work from the office or 100% remotely.
- Opportunity to work in a diverse, international, and cross-functional environment with leading experts.
- Career development with training opportunities.
- Involvement in technology monitoring and decision-making.
- Immediate visible impact on users' lives.
- Cooperation on a B2B contract basis.