Platform Observability Analyst
Company and position
We are a provider of gaming and gambling software solutions and an industry leader in the USA, the UK and Europe. Our partnerships with global gaming companies have resulted in the development of cutting-edge technical platforms incorporating sportsbook, lottery, casino, virtual and financial trading.
Our vision is to shape the future of gaming by transforming experience from gaming operations into intelligent solutions that meet customer needs in the digital era and create sustainable value for all stakeholders.
We value teamwork, knowledge sharing and transparency with accountability. We believe quality, ability and determination are vital ingredients in delivering success at Amelco, and that you will be able to leave your mark with us.
Requirements
- Experience with Prometheus, Grafana, and alerting pipelines.
- Operational knowledge of AWS services (EC2, ECS/EKS, RDS, S3, CloudWatch).
- Experience with Pingdom, Cloudflare, or similar uptime/performance monitoring tools.
- Understanding of distributed systems, microservices, and cloud-native architecture.
- Experience with log aggregation or observability tools (ELK, Loki, etc.).
- Strong analytical mindset and problem-solving skills.
- Clear documentation and communication skills.
- 3 years of experience in a similar role.
- Knowledge of Prometheus, Grafana, AWS, Cloudflare, CloudWatch.
- English language proficiency.
Responsibilities
- Own and maintain dashboards for system health, performance, and uptime.
- Manage Prometheus, Grafana, Pingdom, Cloudflare, and AWS CloudWatch monitoring.
- Manage alerts, adjust thresholds, and configure notifications in line with operational SLAs.
- Monitor system metrics and logs proactively.
- Respond to system alerts and operational issues.
- Take immediate action on critical incidents, mitigate medium-impact issues, and escalate major events to Incident Management.
- Document all alerts, actions, and resolutions.
- Identify trends and early warning signs in system performance.
- Recommend improvements for monitoring, alerting, and operational efficiency.
- Support post-incident reviews and maintain operational documentation and runbooks.
- Work closely with L2/L3 Support, Incident Management, and DevOps teams.
- Provide clear technical insights to stakeholders.
- The role involves an evening and overnight shift pattern.
What we offer
- Competitive contractor rates.
- B2B open-ended contract.
- Full-time position with long-term prospects.
- Exposure to modern cloud and observability tooling.
- Opportunity to shape platform monitoring and reliability practices.
- Strong collaboration with the platform, DevOps, and operations teams.
- Clear progression path toward SRE or Platform Engineering roles.
- Knowledge-sharing opportunities.
- Dynamic culture surrounded by industry experts.
- Enthusiastic and energetic working environment.
- Flat structure.
- No dress code.
- Internal training.
- Coffee, tea, and cold beverages.
Amelco
Employer