Data Engineer

42 - 45 EUR/hour, B2B (net)
Senior · Full-time · B2B
#327670 · Added 21 days ago · 21
Source: nofluffjobs.com

Tech Stack / Keywords

ETL, NLP, Databricks, Spark, SQL, Python, Cloud platform

Company and position

For our client, a global pharmaceutical company, we are recruiting for a Data Engineer role.


Requirements

  • At least 5 years of experience as a Data Engineer.
  • Strong data engineering skills: Databricks, Spark, Delta Lake, SQL, ETL design and orchestration.
  • Familiarity with clinical trial concepts (inclusion/exclusion criteria, endpoints, demographics) and biomedical terminologies.
  • Practical experience with data modeling and working with end users to define requirements.
  • Experience with CI/CD, testing frameworks, and monitoring for data pipelines and ML models.
  • Experience with NLP for information extraction from scientific text (publications, registries).
  • Fluency in English, both written and spoken.

Nice to have:

  • Experience with the pharmaceutical sector or clinical research data environments.

Responsibilities

  • Combine clinical data expertise with strong data engineering and technical skills to build well-documented pipelines from source to curated data sets in common data models such as CDISC SDTM.
  • Collaborate closely with clinical SMEs, data scientists, infrastructure, and other skilled data engineers.
  • Include external benchmarking data as a FounData product and help automate the extraction and harmonisation of competitor clinical trial data from public registries and publications into structured, analysis-ready formats.
  • Productionise and monitor pipelines and models; collaborate on CI/CD, testing, and user feedback.
  • Implement ETL patterns (medallion architecture), ensuring data provenance, validation, and versioning.
  • Take part in the continuous improvement and validation of existing pipelines.
  • Ensure clinical concepts are correctly represented and harmonized across data models (CDISC SDTM/ADaM, OMOP, HL7); contribute to mapping and transformation logic.
  • Develop NLP models for entity and relation extraction (e.g., inclusion/exclusion criteria, demographics, endpoints, study design).
  • Build automated pipelines to ingest registry and publication data and convert it to tabular, queryable datasets.
  • Co-design the benchmarking data model with end users and map extracted information to standardized terminologies.
  • Integrate human-in-the-loop review, confidence scoring, and vocabulary/units normalization.

Offer

  • Sports subscription
  • Private healthcare
  • International environment
  • Life insurance
Ework Group