Data Engineer

Salary not disclosed
Senior · Full-time
#359286 · Added yesterday · 3
Source: Ruby Labs

Tech Stack / Keywords

Ruby, Google Cloud, Airflow, Backend, Data modeling, Kafka, BigQuery, Cloud

Company and role

Ruby Labs is a leading tech company that creates and operates innovative consumer products across the health, education, and entertainment industries.


Requirements

  • Production experience with ClickHouse: data modeling (MergeTree family, projections, materialized views), query tuning, partitioning/sharding, and operational awareness.
  • Strong experience designing event-driven, real-time analytics pipelines (Kafka / Pub/Sub / Kinesis or equivalent), including schema design, backfills, and replay.
  • Hands-on experience with Google Cloud data stack (Pub/Sub, GCS, BigQuery, Cloud Run / GKE, IAM) for ingestion, storage, and orchestration.
  • Production experience with Apache Airflow (DAG design, sensors, retries, SLAs, idempotent tasks, incremental loads).
  • Advanced SQL skills (complex joins, window functions, incremental logic, performance-aware query writing).
  • Strong Python skills for data engineering (pipelines, transformations, tests, tooling).
  • Experience with Git workflows (PRs, code review, CI for SQL/pipelines, versioning of data logic).
  • Experience working with financial / payments / subscription data or comparably high-stakes domains where correctness is non-negotiable.
  • Ability to communicate trade-offs and findings clearly to both technical and non-technical stakeholders.

Nice to have:

  • Tinybird experience (pipes, endpoints, materializations, performance tuning).
  • Experience with dbt, Dataflow / Beam, Spark, or other large-scale processing frameworks.
  • Experience with risk/fraud or billing & payments analytics (auth rate, dispute rate, dunning recovery, BIN/issuer analysis).
  • Experience with data infrastructure for experimentation (A/B testing).
  • Experience with Terraform / IaC, Docker, Kubernetes for data infrastructure.

Responsibilities

  • Design and build event-driven, real-time data pipelines on ClickHouse, Google Cloud, and Airflow to power dashboards and self-serve analytics.
  • Optimize ClickHouse schemas, materialized views, and queries for performance, correctness, and cost.
  • Model core financial datasets for payments and subscriptions: authorizations, declines, refunds, chargebacks, disputes, dunning, MRR/ARR, LTV, churn, cohort retention.
  • Own data quality and observability: tests, monitoring, alerting, lineage, and SLAs for freshness and correctness on Tier-1 datasets.
  • Investigate anomalies and data-quality issues in financial data, perform root-cause analysis, and drive fixes end-to-end (instrumentation → pipeline → metric).
  • Partner with Backend/Platform on event schemas and instrumentation to ensure high-quality, well-typed billing and payments events.
  • Document logic (metric definitions, tables, pipeline behavior) and build internal tooling/templates that let analysts ship safely on top of the platform.

What we offer

  • Remote Work Environment: work from anywhere, anytime.
  • Unlimited PTO: unlimited paid time off.
  • Paid National Holidays.
  • Company-provided MacBook.
  • Flexible Independent Contractor Agreement offering flexibility, autonomy, tax advantages, networking opportunities, and freedom to work from anywhere.

Other information

Applicants must be located within approximately ±4 hours of Central European Time (CET) to ensure optimal collaboration and communication during working hours.
