
Infrastructure - Mail - SLO Error Failure
Closed, Resolved (Public)

Description

For December 2022 SLO Reporting - Mail failed the SLO for Errors.

The agreed SLO was: 1%.
The performance achieved was: 1.71%.

Please investigate the reasons behind not meeting the SLO and provide a clear summary on this task identifying whether:

  • the failure was transient due to factors outside of the team's control, or
  • the failure was preventable and clear steps have been taken to investigate and implement controls to minimise the risk of failing in January 2023.

Event Timeline

John triaged this task as Normal priority. Dec 30 2022, 22:33
John created this task.
John claimed this task.

This has been fixed. This was generating around 1440 failures a day (roughly one a minute). To stay within the 1% error threshold with that many failures, we'd need to have sent 144,000 emails a day, or 100 a minute. As we don't operate at these volumes, failing the SLO was always going to be the outcome.
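
For anyone wanting to check the arithmetic above, here is a minimal sketch in Python. The function name and variable names are made up for illustration and are not part of any Miraheze tooling; the only inputs taken from this task are the 1440 failures a day and the 1% SLO.

```python
MINUTES_PER_DAY = 24 * 60  # 1440


def required_daily_volume(failures_per_day, slo_percent):
    """Daily send volume at which the given failures equal the SLO error rate.

    Hypothetical helper for illustration only. Any lower volume means
    the error rate exceeds the SLO.
    """
    return failures_per_day * 100 / slo_percent


failures_per_day = 1440  # approximate figure quoted in this task
slo_percent = 1          # agreed error SLO, in percent

daily = required_daily_volume(failures_per_day, slo_percent)
per_minute = daily / MINUTES_PER_DAY

print(f"Emails needed per day: {daily:.0f}")
print(f"Emails needed per minute: {per_minute:.0f}")
```

With a fixed number of failures a day, the error rate only drops below the SLO if total volume rises above this figure, which is why removing the source of the failures was the practical fix.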

I'll keep an eye on this over the next few weeks but we should start to see the error rate come down and be more manageable for next month.