Senior Site Reliability Engineer

Canonical

Posted on : 12-01-2025

1 Vacancy

The job posting is outdated and position may be filled

Job Location

Dubai - UAE

Salary

Not Disclosed

Nationality

Emirati

Gender

Male

Job Description

Roles and responsibilities

Our cloud operations engineers bring Python software-engineering skills and rigour to the operations domain. We practise DevSecOps from bare metal to application. We architect and run OpenStack, Kubernetes and software-defined storage, and we enable DevSecOps for the applications running on that infrastructure too.

To become a member of this team, you need to be a software engineer fluent in Python, with a genuine interest in the full open source infrastructure stack from metal to containers and the ability to work in a high-pressure operations environment running mission-critical services for global brand-name customers.

As a member of the team you will gain experience in a broad range of cloud technologies. We evolve our offerings as the state of the art improves, so you get to stay current with the latest capabilities in open source infrastructure. We drive upgrades to keep our customers on the latest, best solutions.

What we are looking for in you

Degree in Software Engineering or Computer Science
Experience with Linux and familiarity with Linux networking and storage
Python software development expertise
Operational experience
Excellent interpersonal skills, curiosity, flexibility, and accountability
Ability to travel internationally twice a year, for company events up to two weeks long

Nice-to-have skills

Experience with OpenStack or Kubernetes deployment or operations
Familiarity with public or private cloud management

What we offer colleagues

We consider geographical location, experience, and performance in shaping compensation worldwide. We revisit compensation annually (and more often for graduates and associates) to ensure we recognise outstanding performance. In addition to base pay, we offer a performance-driven annual bonus or commission. We provide all team members with additional benefits, which reflect our values and ideals. We balance our programmes to meet local needs and ensure fairness globally.

Distributed work environment with twice-yearly team sprints in person
Annual compensation review
Recognition rewards
Annual holiday leave
Maternity and paternity leave
Employee Assistance Programme
Opportunity to travel to new locations to meet colleagues
Priority Pass and travel upgrades for long-haul company events

Desired candidate profile

1. Reliability Engineering

Availability and Performance: Ensure that the systems, applications, and services are highly available and performant. Monitor uptime, response times, and system health, taking proactive steps to address potential issues.
Service Level Objectives (SLOs) and Service Level Indicators (SLIs): Define, monitor, and maintain SLOs, SLIs, and SLAs for each service, ensuring that reliability goals are met or exceeded.
Incident Management and Resolution: Act as an escalation point for incidents, diagnose and resolve production issues in real time, and lead postmortem analysis to prevent recurrence.
Capacity Planning and Scalability: Ensure that systems can scale to meet increased demand, using techniques such as load balancing, auto-scaling, and horizontal scaling.
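
As an illustration of the SLO and SLI work described above, the following is a minimal Python sketch of how an availability SLI and the remaining error budget might be computed from request counts. The function names, the 99.9% target, and the request figures are assumptions chosen for illustration; they are not taken from the employer's tooling.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """Return the fraction of requests that succeeded (the SLI)."""
    if total_requests == 0:
        # No traffic means nothing failed; treat the SLI as fully met.
        return 1.0
    return good_requests / total_requests


def error_budget_remaining(slo_target: float, good_requests: int,
                           total_requests: int) -> float:
    """Return the unspent fraction of the error budget.

    1.0 means no budget has been used, 0.0 means it is exactly spent,
    and a negative value means the SLO has been breached.
    """
    allowed_failures = (1.0 - slo_target) * total_requests
    actual_failures = total_requests - good_requests
    if allowed_failures == 0:
        # A 100% target (or zero traffic) leaves no room for failures.
        return 1.0 if actual_failures == 0 else 0.0
    return 1.0 - actual_failures / allowed_failures


if __name__ == "__main__":
    # Hypothetical month: 10,000 requests, 5 failures, 99.9% target.
    sli = availability_sli(9_995, 10_000)
    budget = error_budget_remaining(0.999, 9_995, 10_000)
    print(f"SLI: {sli:.4%}, error budget remaining: {budget:.1%}")
```

With these hypothetical numbers, a 99.9% target allows about 10 failed requests out of 10,000, so 5 failures leave roughly half of the budget unspent.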

2. Automation and Infrastructure Management

Infrastructure as Code (IaC): Write and manage infrastructure code (using tools like Terraform, Ansible, or CloudFormation) to automate the provisioning, configuration, and management of cloud resources and infrastructure.
CI/CD Pipelines: Build, improve, and maintain Continuous Integration and Continuous Deployment pipelines, ensuring that software is deployed quickly, safely, and reliably.
Configuration Management: Use automation tools to manage system configurations and deployments, reducing manual intervention and improving consistency.
Monitoring and Alerting: Implement and maintain comprehensive monitoring and alerting systems to detect issues early. Use tools like Prometheus, Grafana, ELK stack, Datadog, or New Relic to ensure the systems' health is constantly tracked.

3. Collaboration with Development Teams

DevOps Practices: Work closely with development teams to bridge the gap between software development and operations. Implement best practices for building and running software in production environments, promoting a culture of DevOps.
Code Review and Guidance: Participate in code reviews, providing feedback on application code, infrastructure code, and architectural decisions to improve reliability and maintainability.
Incident Response: Work alongside development teams to identify root causes of incidents, recommend fixes, and ensure future incidents are prevented through improved practices.

4. Security and Compliance

Security Best Practices: Ensure that the systems are secure by following best practices in securing infrastructure, network, and applications. This includes managing access control, encryption, and vulnerability management.
Compliance: Ensure the systems comply with relevant standards and regulations (e.g., PCI DSS, HIPAA, GDPR) and that security measures are in place to meet compliance requirements.
Disaster Recovery and Business Continuity: Design and implement disaster recovery plans and business continuity procedures, ensuring that critical systems can be restored quickly in the event of a failure.

Employment Type

Full-time

Company Industry

Accounting

Department / Functional Area

Engineering

About Company

Canonical
