SRE

140–180 PLN/hour, B2B (net)
Senior · Full-time · B2B
#308009 · Added two months ago · 58
Source: nofluffjobs.com

Tech Stack / Keywords

DNS, TCP, HTTP, BGP, Proxy, Terraform, Ansible, AWS Lambda, GitLab CI, Unix, Linux, Datadog, Grafana, CDN, OTT

Company and position

Antal is hiring a Site Reliability Engineer (SRE) to support the development and maintenance of international CDN platforms, both on-premises and cloud-based, impacting OTT services globally.


Requirements

  • University degree in a technical field (Computer Science, Networks/Telecommunications).
  • At least 4–5 years of experience in SysOps, DevOps, or SRE roles.
  • Solid networking fundamentals: DNS, TCP, HTTP, routing (BGP), caching, proxy.
  • Knowledge of tools: Terraform, Ansible, AWS Lambda, GitLab CI/CD.
  • Very good knowledge of Unix/Linux systems.
  • Experience with monitoring tools (Datadog, Grafana).
  • Nice to have: knowledge of CDN/OTT and QoS topics.
  • Analytical thinking, independence, and strong organizational skills.
  • Ability to collaborate with technical and non-technical teams.
  • Fluent English; French is a plus.
  • Motivation to work on large-scale, highly available systems.
  • Interest in automation, observability, and performance engineering.

Responsibilities

CDN Reliability & Operations:

  • Ensure availability, resilience, and high performance of CDN platforms (cloud, bare metal, international networks, IXs, ISP-side caches).
  • Regularly analyze CDN capacity, performance, and traffic forecasts.
  • Participate in deployments, production rollouts, and OTT consumption analysis across multiple regions.
  • Monitor key metrics (latency, throughput, cache hit ratio, error rate) and implement optimizations.
  • Participate in incident handling, root cause analysis, and reliability improvement plans.
  • Occasionally support DevOps teams with operational tasks.

Observability & Monitoring:

  • Build and maintain observability layers for all CDN environments using Datadog.
  • Create and maintain standardized dashboards, alerts, SLOs/SLAs, and log pipelines.
  • Design scalable monitoring solutions capable of handling large traffic volumes.
  • Implement automated health checks, anomaly detection, and alert workflows.
  • Improve data collection and visualization processes for technical and business teams.

Development of Tools & Automation:

  • Develop scripts and workflows (Python/Bash/API) for metrics collection, cost analysis, and operational data.
  • Build internal tools for log analysis, audience and traffic visualization, CDN configuration validation, diagnostics, troubleshooting, and cache testing.
  • Support automation based on Terraform, CI/CD, and automatic configuration rollouts.

Collaboration & CDN Governance:

  • Collaborate with OTT engineering, DevOps, Network, Security, Data teams, and international units.
  • Develop and maintain global standards (latency, TTL, caching, observability, security, costs).
  • Share knowledge with teams across Europe, Africa, and Asia.
  • Prepare technical documentation and deployment materials.
  • Work with ISPs, cloud providers, and operations teams to resolve distribution issues.
  • Support major events (sports, live, peak traffic) with preparation, monitoring, and post-event analysis.

Offer

  • Start: April/May 2026
  • Contract type: B2B
  • Long-term cooperation
  • 100% remote work
  • Benefits: Multisport card and Luxmed healthcare
Antal Sp. z o.o.