Site Reliability Engineer

  • Full Time
  • United Kingdom
  • TBD USD / Year




  • Job applications may no longer be accepted for this opportunity.


Capital on Tap

Running a business is hard; we make it easier. We provide an all-in-one business credit card and spend management platform, built for SMEs. Over 200,000 small businesses have spent more than £8 billion on their Capital on Tap Business Credit Cards.

The Role

Our Site Reliability Engineers work closely with our Platform and Engineering teams to ensure that our applications are designed and built with reliability and speed in mind, and that our application infrastructure is robust and scalable.

As a Site Reliability Engineer at Capital on Tap, you will be responsible for designing, building, and monitoring systems to maximise platform uptime and efficiency for the best possible end-user experience. You will also be tasked with identifying and resolving potential outages and performance issues before they become a problem.

Responsibilities

  • Manage Azure services and resources, Cloudflare edge security, and traffic management in code.
  • Create, manage, and monitor development resources within Kubernetes clusters and serverless services (e.g. Function Apps, Automation Accounts) for Product Engineering teams.
  • Own Terraform / Ansible / Pulumi Infrastructure as Code for each Product Engineering team.
  • Continuously identify opportunities for improvement in systems, processes, and technologies, and implement changes to improve the overall reliability and performance of the platforms.
  • Improve monitoring to provide insights into uptime and availability, and work towards the agreed SLO.
  • Work with the Product team to identify the company SLA and objectives for all core services/applications.
  • Work with Platform Engineers to deliver end-to-end automated solutions and pipelines.
  • Work with software developers and stakeholders to improve the user experience through pipeline management and infrastructure improvements.
  • Proactively support Platform services and tooling (TeamCity, Octopus, Azure DevOps, and more to come).
  • Improve the reliability, quality, and time-to-market of our suite of software solutions through practices such as load testing, chaos engineering, and improved deployment strategies.
  • Own and lead the troubleshooting of incidents that impact the customer experience.

This role can be remote (UK) or hybrid, based in our London offices 1-2 days per week.

About you

  • Experience in managing public cloud processes
  • Experience in Azure DevOps, Octopus, and other CI/CD tools
  • Experience in PowerShell, Bash, or other scripting languages
  • Experience with Terraform
  • Experience working with a cloud monitoring solution (Datadog experience is advantageous)
  • Experience with Kubernetes and Docker (advantageous)

What you’ll get in return

  • Private Healthcare through Vitality
  • Worldwide travel insurance through Vitality
  • Anniversary Rewards (£250, £500, £750, 4-week fully paid sabbatical)
  • Salary Sacrifice Pension Scheme 4-7% match
  • 28 days holiday (plus bank holidays)
  • Annual Learning Budget
  • Enhanced Parental Leave
  • Cycle to Work Scheme
  • Season Ticket Loan
  • 6 free therapy sessions per year
  • Dog Friendly Offices
  • Free drinks and snacks in our Offices

Check out more of our benefits, values, and mission here.

If you want to keep updated on future opportunities at Capital on Tap follow our company page here.

Diversity and Inclusion

We want to be a place where a diverse mix of talented people want to come and do their best work and, most importantly, feel included and able to be their authentic selves. We welcome, consider, and encourage applications from anyone who shares this passion.

How to Apply

If you want to progress your career within a fast-growing, profitable fintech, click apply (all we ask for is your CV and contact details) and we will get back to you within 3 working days.

To apply for this job please visit grnh.se.
