DCG

Senior DevOps Engineer (AI & Platform Operations)

120 - 125 PLN/hour (B2B)
Senior · Full-time · B2B
#365137 · Added today
Source: nofluffjobs.com

Tech Stack / Keywords

Kubernetes, ITIL, Splunk, Apica, Sysdig, Prometheus, Grafana, Java, Java EE, Bash, Python

Requirements

  • 5+ years in IT operations, application support (2nd/3rd line), or a similar production-facing role
  • Proven track record of owning incidents end-to-end — from alert to RCA to prevention
  • 2+ years working within an ITIL framework (incident, problem, change management)
  • Experience working in Agile delivery environments alongside development teams
  • Excellent English communication skills — able to explain technical issues clearly to both engineers and non-technical stakeholders
  • Proficiency with log analysis and alerting tools: Splunk, Apica, Sysdig
  • Observability tooling: Prometheus, Grafana — reading dashboards, tuning alerts
  • Comfortable operating services running on Kubernetes (checking pod health, reading logs, triggering restarts — not cluster administration)
  • Familiarity with Jenkins pipelines to execute and troubleshoot deployments
  • Experience with relational databases (Oracle, DB2) — querying, interpreting execution plans, identifying data-related incidents
  • Working knowledge of Spring/Hibernate application behavior, Kafka message flows, XML/JSON payloads — enough to trace an issue through the stack

Nice to have:

  • Java/J2EE development background
  • IBM DataStage operational experience
  • Scripting (Bash, Python) for automation of repetitive operational tasks
  • Ansible for applying configuration changes in controlled operational scenarios

Responsibilities

Incident & Problem Management:

  • Own the RCA process for production incidents — diagnose, resolve, and put preventive measures in place so issues don't recur

Production Monitoring & Support:

  • Continuously monitor service health, detect anomalies early, and act before they become incidents

Deployment Execution:

  • Trigger and oversee release deployments through existing CI/CD pipelines; troubleshoot failed deployments and coordinate rollbacks when needed

Environment Oversight:

  • Keep Pre-Production and Production environments stable and aligned — not building them from scratch, but ensuring they behave as expected day to day

Runbook & Knowledge Management:

  • Document operational procedures, known issues, and resolution steps to build a reliable knowledge base for the team

Cross-team Collaboration:

  • Work shoulder-to-shoulder with development and platform teams to triage issues, clarify operational requirements, and close the feedback loop between prod and dev

Continuous Improvement:

  • Identify recurring pain points and propose automation or tooling to reduce toil
  • Improve observability coverage — dashboards, alerts, log queries — to catch issues faster
  • Contribute to service continuity initiatives and disaster recovery drills

What we offer

  • Private medical care
  • Co-financing of a sports card
  • Ongoing support from a dedicated consultant
  • Employee referral program