Job Description

We’re looking for a talented Site Reliability Engineer (SRE) to keep our systems running smoothly, reliably, and at scale. Through smart automation, deep observability, and a calm head in a crisis, you’ll help us balance speed, compliance, and stability, working alongside DevOps, Cloud, Quality Engineering, and Product teams to drive continuous improvements in performance, security, and resilience.

Experience and Qualifications

• 5+ years of experience in SRE or DevOps roles, building and managing large-scale, high-availability systems across banking, fintech, e-commerce, or other data-intensive digital ecosystems.
• Bachelor’s degree in Computer Science or equivalent technical experience.
• Strong experience with Linux environments and performance troubleshooting.
• Proven expertise in Terraform and Infrastructure as Code (IaC) methodologies.
• Proficiency with Kubernetes and container orchestration in microservices environments.
• Hands-on experience with AWS (preferred); exposure to Azure or GCP is an advantage.
• Deep knowledge of Dynatrace (AIOps, Davis AI), Prometheus, Grafana, and the ELK stack.
• Experience implementing AI/ML-driven reliability or automation solutions (AIOps, anomaly detection, predictive alerting).
• Practical understanding of CI/CD pipelines (GitHub Actions, Jenkins, GitLab CI/CD, or Azure DevOps).
• Experience with Kafka, RabbitMQ, Redis, Aurora, and RDS databases.
• Strong scripting or programming skills in Python, Bash, or Go.