Are you ready to take your reliability engineering skills to the next level — in the high-stakes world of iGaming?

We are expanding the team responsible for ensuring the reliability and stability of our production systems across multiple platforms. In this role, you will work directly with live environments — monitoring, responding to incidents, and improving observability and operational processes.

This position is suited for engineers with real production exposure who understand that SRE is more than running infrastructure — it’s about reliable services, fast detection, effective response, and continuous improvement. A high-level understanding of SLI/SLO is expected.

You will work in a shift-based setup, including late-evening and night rotations, taking ownership of incidents from detection to resolution and contributing to making systems more stable over time.

All you need is:

Core skills:

Good Linux skills in production environments (debugging basics, system services, logs, performance basics);

Solid understanding of networking fundamentals (TCP/IP, DNS, HTTP, load balancing basics, TLS fundamentals);

Experience with containers and image lifecycle basics (Docker or compatible runtimes);

Ability to troubleshoot across application, network, and infrastructure layers using logs/metrics and basic tools (curl, basic traffic/log analysis; scripting is a plus);

Basic familiarity with observability: metrics and alerting, dashboards, logging.

SRE fundamentals:

You understand the difference between “just running infra” and SRE as a discipline: reliability targets, fast detection, clear escalation, and consistent follow-up;

You’re familiar with SLI/SLO and can explain them in simple terms (a high-level understanding is enough).

Experience:

1+ year in a production-focused role (Ops/Support L2+/DevOps/Junior SRE — what matters is real production exposure);

Participation in production incidents (triage, investigation, escalation, basic follow-ups);

Availability to cover late-evening and night shifts, in rotation.

Also, it will be great if you have:

Familiarity with Kubernetes (we don’t require deep production ownership yet);

Exposure to AWS services such as EC2, ALB/NLB, RDS, S3, and IAM basics;

Exposure to Terraform and/or Ansible (small changes, basic understanding of principles);

Experience working in high-availability environments where downtime actually matters.

Your daily adventures will look like:

Working in shift-based operations: monitoring, alert response, incident handling, escalation when needed;

Participating in incident handling: initial classification, technical investigation, coordination with engineering/development teams, and following up on improvements;

Developing and refining observability across platforms (metrics/alerts, dashboards, logs);

Reducing operational toil: small automation, runbooks, and repeatable processes (the “make it easier next time” mindset);

Working with documentation in the Atlassian ecosystem, including writing and updating KB articles, runbooks, and other technical documentation;

Collaborating with development teams to improve production readiness (basic reliability practices, cleaner incident follow-ups).

So, why Gamingtec?

If you are a person with passion, ideas, and a thirst to advance your career, you will love our corporate culture. We are an international team that treats each other with respect and moves towards the same goals. We believe in freedom and flexibility and trust our employees to do their jobs in a way that works for them. We have an ambitious and rewarding work environment, a flat organisational structure and almost zero bureaucracy. Our employees’ ideas are what move the company forward. Everyone has equal opportunities in every aspect of work, learning and development!

Why you will love working here:

Being a part of an international team, where everyone treats each other with respect and moves towards the same goal;

Freedom and responsibilit

Middle Site Reliability Engineer (SRE)
