XTB

Senior Site Reliability Engineer

23k - 29.2k PLN/month (UoP, employment contract)
Senior · Full-time · Employment contract
#371680 · Added yesterday
Source: nofluffjobs.com

Tech Stack / Keywords

Python · Kubernetes · Docker · Ansible · Grafana · ELK Stack · Prometheus

Company and position

XTB is a global financial-industry company focused on online trading of financial instruments. It is the largest FinTech in Poland and a leader in Central and Eastern Europe, operating in several countries, including in Asia and South America. XTB supports employee development through various training and development programs.

Requirements

  • At least 5 years of professional experience in SRE, Infrastructure, or DevOps roles managing high-scale, distributed environments.
  • Advanced programming skills in Python focused on scalable automation, internal tooling, and robust scripts.
  • Hands-on expertise managing production-grade Kubernetes environments and configuration management tools like Ansible.
  • Experience designing resilient infrastructure architectures within Azure Kubernetes Service and on-prem environments.
  • Proficiency in building standardized telemetry ecosystems using self-hosted open-source tools such as Prometheus, Grafana, ELK Stack, Tempo, Thanos, and Jaeger.
  • Ability to drive incident management, conduct post-incident analysis, and foster a culture of reliability and shared ownership.
  • Ability to leverage AI/ML techniques for SRE tasks including AIOps, automated anomaly detection, log analysis, and optimizing reliability workflows.
  • Experience with commercial observability and APM solutions (e.g., Datadog, Splunk, New Relic) or chaos engineering frameworks is highly valued.

Responsibilities

Observability Platform Engineering:

  • Develop a standardized observability ecosystem.
  • Implement a conscious telemetry model focusing on structured events, distributed tracing, and intelligent sampling strategies.

Reliability Enablement:

  • Act as a strategic partner to product engineering teams.
  • Provide the platform, standards, and data that teams need to own the reliability of their services.
  • Use error budgets and alerting to balance feature velocity with stability.

Proactive Resilience & Protection:

  • Enhance detection capabilities to identify issues before customer impact.
  • Leverage early-warning systems and AI/ML for automated anomaly detection and intelligent data analysis.

Operations & Tooling:

  • Build internal automation and tooling to streamline SRE workflows and automate routine operational tasks.

Incident Management & On-Call Rotation:

  • Participate in on-call rotation for incident management.
  • Ensure rapid incident resolution, effective communication, and post-incident analysis for continuous improvement.

Benefits

  • Sport subscription
  • Training budget
  • Private healthcare
  • Lunch card
  • An extra day off on your birthday
  • An extra day off for parents
  • Access to an e-learning platform for learning English
  • Company canteen
  • Paid leave
  • Internal training