Grafana dashboard: https://grafana.miraheze.org/d/pfjAbhf7k/mediawiki-slos
- JobQueue
- SLO: Availability to submit/run jobs is at least 99.5%. SLI: Service uptime.
- SLO: Errors: abandoned jobs account for less than 1.5% of jobs over 1 day. SLI: Abandoned jobs.
- Memcached
- SLO: Availability of Memcached to be at least 99.5%. SLI: Service uptime.
- MediaWiki
- SLO: Availability of MediaWiki to be at least 99%. SLI: Nginx 50x responses / total requests.
- SLO: Errors must account for less than 3% of requests. SLI: Errors vs. total hits.
- SLO: Latency: backend response times to be below 3s. SLI: Nginx request time average.
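
As a rough illustration of how the MediaWiki SLIs above map onto their SLO thresholds, here is a minimal Python sketch. It is not the query behind the Grafana dashboard. The `RequestWindow` type, its field names, the function names, and the 30-day window used for the downtime budget are all hypothetical. What counts as an "error" request is also an assumption, since the list only says "Errors vs. total hits".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestWindow:
    """Hypothetical aggregate of Nginx counters over one evaluation window."""
    total_requests: int          # all requests seen in the window
    responses_50x: int           # requests answered with a 50x status
    error_requests: int          # requests counted as errors (definition assumed)
    total_request_time_s: float  # sum of Nginx request times, in seconds


def availability(window: RequestWindow) -> float:
    """Fraction of requests that did not end in a 50x response."""
    if window.total_requests == 0:
        return 1.0  # assumption: no traffic counts as fully available
    return 1.0 - window.responses_50x / window.total_requests


def error_ratio(window: RequestWindow) -> float:
    """Errors divided by total hits."""
    if window.total_requests == 0:
        return 0.0
    return window.error_requests / window.total_requests


def average_latency_s(window: RequestWindow) -> float:
    """Mean Nginx request time in seconds."""
    if window.total_requests == 0:
        return 0.0
    return window.total_request_time_s / window.total_requests


def allowed_downtime_hours(slo_target: float, window_days: int = 30) -> float:
    """Downtime budget implied by an availability target (window length assumed)."""
    return (1.0 - slo_target) * window_days * 24


def check_mediawiki_slos(window: RequestWindow) -> dict:
    """Compare each SLI with the thresholds from the list above."""
    return {
        "availability": availability(window) >= 0.99,  # at least 99%
        "errors": error_ratio(window) < 0.03,          # less than 3% of requests
        "latency": average_latency_s(window) < 3.0,    # below 3s on average
    }


# Made-up numbers, for illustration only.
sample = RequestWindow(
    total_requests=100_000,
    responses_50x=500,
    error_requests=2_000,
    total_request_time_s=120_000.0,
)
results = check_mediawiki_slos(sample)
```

Under the assumed 30-day window, `allowed_downtime_hours(0.995)` gives the downtime budget implied by the 99.5% targets for JobQueue and Memcached, and `allowed_downtime_hours(0.99)` does the same for the 99% MediaWiki target.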