Site Reliability Engineering
Group · Active · Public

Members (7)

Watchers (2)

Details

Description

Project for issues that require action from a Site Reliability Engineer, or for general work affecting servers, networking, new services, or major Puppet/DNS changes.

Recent Activity

Today

R4356th added a comment to T6232: wgRightsIcon for GPLv3 blocked by CSP.

I cannot be sure but I have a feeling that the image was being loaded properly.

Sun, Sep 27, 07:57 · Site Reliability Engineering, Universal Omega, Configuration

Yesterday

RhinosF1 added a comment to T6232: wgRightsIcon for GPLv3 blocked by CSP.

Most browsers had an update quite a long time ago which is causing this issue. An example wiki affected by this: Snap! Wiki.

Nothing to do with a browser update, it's our own bad config.

This behaviour was not there before.

Sat, Sep 26, 21:35 · Site Reliability Engineering, Universal Omega, Configuration
RhinosF1 updated the task description for T6232: wgRightsIcon for GPLv3 blocked by CSP.
Sat, Sep 26, 21:08 · Site Reliability Engineering, Universal Omega, Configuration
R4356th added a comment to T6232: wgRightsIcon for GPLv3 blocked by CSP.

Most browsers had an update quite a long time ago which is causing this issue. An example wiki affected by this: Snap! Wiki.

Nothing to do with a browser update, it's our own bad config.

Sat, Sep 26, 21:08 · Site Reliability Engineering, Universal Omega, Configuration
RhinosF1 added a comment to T6232: wgRightsIcon for GPLv3 blocked by CSP.

Most browsers had an update quite a long time ago which is causing this issue. An example wiki affected by this: Snap! Wiki.

Sat, Sep 26, 21:07 · Site Reliability Engineering, Universal Omega, Configuration
RhinosF1 renamed T6232: wgRightsIcon for GPLv3 blocked by CSP from HTML Bug Violating Content Security Policy to wgRightsIcon for GPLv3 blocked by CSP.
Sat, Sep 26, 21:06 · Site Reliability Engineering, Universal Omega, Configuration
Onmp314 added a comment to T6229: Upgrade to MediaWiki 1.35.

I support this idea

Sat, Sep 26, 07:47 · Site Reliability Engineering, MediaWiki
Reception123 triaged T6229: Upgrade to MediaWiki 1.35 as High priority.
Sat, Sep 26, 06:16 · Site Reliability Engineering, MediaWiki

Thu, Sep 24

Zppix closed T5614: Whitelist imgbox.com and googleusercontent.com so images can be shown as Resolved.

Seems to be resolved now. If not, feel free to reopen.

Thu, Sep 24, 16:13 · Puppet, Site Reliability Engineering
RhinosF1 added a comment to T6146: Varnish/Nginx? returning 429 due to DDoS/SQLi mitigations when rendering Math Images in some cases.

causing icinga to warn because nginx access logs were filled with 429s.

It was doing this with the old system anyway on occasion. I assume you mean more frequently. Are 429s something we can exclude from the check? Would that be better?

Thu, Sep 24, 09:03 · Site Reliability Engineering, Universal Omega, Extensions
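
The question above about excluding 429s from the check can be illustrated with a minimal sketch. It assumes the access log uses the common "combined" format, where the status code follows the quoted request line; the function name `count_error_statuses` and the `ignored` parameter are hypothetical and are not taken from the actual icinga check.

```python
import re
from collections import Counter

# Assumption: combined log format, where the 3-digit status code
# directly follows the closing quote of the request line.
STATUS_RE = re.compile(r'"\s(\d{3})\s')


def count_error_statuses(lines, ignored=frozenset({429})):
    """Count 4xx/5xx responses per status code, skipping ignored codes."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if not match:
            continue  # line does not look like an access log entry
        status = int(match.group(1))
        if status >= 400 and status not in ignored:
            counts[status] += 1
    return counts


sample = [
    '1.2.3.4 - - [24/Sep/2020:09:00:00 +0000] "GET /wiki/Main_Page HTTP/1.1" 200 512 "-" "curl/7.64"',
    '1.2.3.4 - - [24/Sep/2020:09:00:01 +0000] "GET /w/math.svg HTTP/1.1" 429 0 "-" "curl/7.64"',
    '1.2.3.4 - - [24/Sep/2020:09:00:02 +0000] "GET /w/math.svg HTTP/1.1" 429 0 "-" "curl/7.64"',
    '1.2.3.4 - - [24/Sep/2020:09:00:03 +0000] "GET /wiki/Foo HTTP/1.1" 503 0 "-" "curl/7.64"',
]
errors = count_error_statuses(sample)
```

With 429 in the ignored set, a check built this way would only alert on the remaining error codes, so rate-limited requests would not by themselves trigger a warning.
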
RhinosF1 added a comment to T6146: Varnish/Nginx? returning 429 due to DDoS/SQLi mitigations when rendering Math Images in some cases.

It seems this problem will prolong indefinitely or should we expect a solution soon?

@Paladox and @Southparkfan are working to deal with this. The rate limiter will probably exist for the foreseeable future unless we are happy our resources can cope with the traffic that was causing issues (DDoS/SQLis), but we are working to improve the solution so it doesn't affect legitimate traffic.

Thu, Sep 24, 09:02 · Site Reliability Engineering, Universal Omega, Extensions
Nomalias added a comment to T6146: Varnish/Nginx? returning 429 due to DDoS/SQLi mitigations when rendering Math Images in some cases.

It seems this problem will prolong indefinitely or should we expect a solution soon? Thank you. (PS: some days ago I was able to load my page without problems, but then it went back to the same behaviour. Was the rate limiter lifted at some point?)

Thu, Sep 24, 08:59 · Site Reliability Engineering, Universal Omega, Extensions

Wed, Sep 23

Zppix added a comment to T5614: Whitelist imgbox.com and googleusercontent.com so images can be shown.

Done; should take effect in approx. 10 minutes.

Wed, Sep 23, 14:38 · Puppet, Site Reliability Engineering
DeltaRuneFan2001 changed the status of T4778: Change database name and URL of animatedfeet.miraheze.org from Stalled to Open.
Wed, Sep 23, 03:40 · Site Reliability Engineering
DeltaRuneFan2001 lowered the priority of T4778: Change database name and URL of animatedfeet.miraheze.org from Normal to Low.
Wed, Sep 23, 03:40 · Site Reliability Engineering
DeltaRuneFan2001 reopened T4778: Change database name and URL of animatedfeet.miraheze.org as "Stalled".
Wed, Sep 23, 03:40 · Site Reliability Engineering
DeltaRuneFan2001 added a comment to T4778: Change database name and URL of animatedfeet.miraheze.org.

Why did you close it? It hasn't even been completed yet. I'm planning on reopening this failed/rejected task. And yes, you can make the change now.

Wed, Sep 23, 03:39 · Site Reliability Engineering

Tue, Sep 22

Revival added a comment to T5614: Whitelist imgbox.com and googleusercontent.com so images can be shown.

Sorry to interrupt, currently the following domains are whitelisted:
url52: 'googleusercontent.com'
url53: 'imgbox.com'

Tue, Sep 22, 07:22 · Puppet, Site Reliability Engineering
Revival added a comment to T5614: Whitelist imgbox.com and googleusercontent.com so images can be shown.

lh3.googleusercontent.com is now whitelisted.

Tue, Sep 22, 07:07 · Puppet, Site Reliability Engineering

Mon, Sep 21

Reception123 edited projects for T367: Move @NDKilla to Miraheze/operations, added: Site Reliability Engineering; removed Trash.
Mon, Sep 21, 18:21 · Site Reliability Engineering
RhinosF1 added a subtask for T5580: Setup a wiki for LDAP: T6204: Security Review LDAP Stack.
Mon, Sep 21, 17:36 · Site Reliability Engineering
Paladox added a comment to T6146: Varnish/Nginx? returning 429 due to DDoS/SQLi mitigations when rendering Math Images in some cases.

I've had to revert my change as it was causing icinga to warn because nginx access logs were filled with 429s. I did raise the limit to 50r/s, but that was still not enough.

Mon, Sep 21, 15:23 · Site Reliability Engineering, Universal Omega, Extensions
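
To make the "50r/s" limit mentioned above concrete, here is a simplified token-bucket sketch of how a per-second rate limit ends up returning 429. This is an illustration only: it is not nginx's own implementation, and the class name `RateLimiter`, the `burst` parameter, and the explicit `now` argument are all assumptions made for the example.

```python
class RateLimiter:
    """Simplified token bucket: refills at `rate_per_second`, holds at most `burst` tokens."""

    def __init__(self, rate_per_second, burst):
        self.rate = rate_per_second
        self.capacity = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # timestamp of the previous request

    def allow(self, now):
        """Return the HTTP status a request arriving at time `now` (seconds) would get."""
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return 200
        return 429  # Too Many Requests


limiter = RateLimiter(rate_per_second=50, burst=50)
# 60 requests arriving in the same instant, e.g. a page with many math images.
statuses = [limiter.allow(0.0) for _ in range(60)]
later = limiter.allow(1.0)  # one second later the bucket has refilled
```

A page that requests many images at once can exhaust such a budget in a single burst, which is one way legitimate traffic can be caught by a limit sized for abusive traffic.
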

Sun, Sep 20

MacFan4000 updated the task description for T6186: Create noreply-bots@miraheze.org email address.
Sun, Sep 20, 21:17 · Mail, Site Reliability Engineering
Dmehus updated subscribers of T5976: Add the EventStreams service.

Ah, so the RAM usage might be the main issue here? Perhaps a discussion with @Owen about whether we have room in the budget to increase the RAM allocated to one or more of our servers? If the budget is too tight, then I guess this depends on a rather significant donation from another benefactor?

Sun, Sep 20, 18:59 · Site Reliability Engineering
MacFan4000 updated subscribers of T6186: Create noreply-bots@miraheze.org email address.
Sun, Sep 20, 18:40 · Mail, Site Reliability Engineering
Paladox claimed T5580: Setup a wiki for LDAP.
Sun, Sep 20, 16:47 · Site Reliability Engineering
Paladox lowered the priority of T5580: Setup a wiki for LDAP from High to Normal.
Sun, Sep 20, 16:41 · Site Reliability Engineering
Paladox raised the priority of T5580: Setup a wiki for LDAP from Normal to High.
Sun, Sep 20, 16:36 · Site Reliability Engineering
Reception123 added a comment to T5976: Add the EventStreams service.

This does seem to require a lot of RAM, so I'm not sure we can do this at this stage.

Sun, Sep 20, 16:29 · Site Reliability Engineering
MacFan4000 added a comment to T5976: Add the EventStreams service.

This would be useful for certain bots as well. For example: SignBot requires this. (https://phab.bots.miraheze.wiki/T86)

Sun, Sep 20, 16:21 · Site Reliability Engineering
Southparkfan added a comment to T6146: Varnish/Nginx? returning 429 due to DDoS/SQLi mitigations when rendering Math Images in some cases.

Waiting for response from @Paladox regarding nginx changes.

Sun, Sep 20, 13:01 · Site Reliability Engineering, Universal Omega, Extensions
Southparkfan closed T6056: 18/19-08-2020 cp* failures as Resolved.

Seems resolved.

Sun, Sep 20, 13:00 · Amanda Catherine, Site Reliability Engineering
Southparkfan added a comment to T5877: Revise MariaDB backup strategy.

Contacted Owen for a data processing agreement for the free infra offers.

Sun, Sep 20, 13:00 · Site Reliability Engineering, Database, Goal-2020-Jul-Dec
RhinosF1 reopened T4976: CSP whitelist request for www.desmos.com, a subtask of T5092: Create a CSP whitelist policy, as Open.
Sun, Sep 20, 12:37 · Site Reliability Engineering

Thu, Sep 17

BlackWidowMovie0000 closed T6192: Timeless Content Bar as Resolved.
Thu, Sep 17, 19:05 · Universal Omega, Extensions, Upstream
BlackWidowMovie0000 reopened T6192: Timeless Content Bar as "Open".
Thu, Sep 17, 18:55 · Universal Omega, Extensions, Upstream

Wed, Sep 16

Universal_Omega triaged T6186: Create noreply-bots@miraheze.org email address as Normal priority.
Wed, Sep 16, 03:34 · Mail, Site Reliability Engineering
MacFan4000 updated the task description for T6186: Create noreply-bots@miraheze.org email address.
Wed, Sep 16, 00:35 · Mail, Site Reliability Engineering
MacFan4000 created T6186: Create noreply-bots@miraheze.org email address.
Wed, Sep 16, 00:35 · Mail, Site Reliability Engineering

Sun, Sep 13

Reception123 merged T6174: Math Formula Rendering Error Occurs into T6146: Varnish/Nginx? returning 429 due to DDoS/SQLi mitigations when rendering Math Images in some cases.
Sun, Sep 13, 12:48 · Site Reliability Engineering, Universal Omega, Extensions
Dmitryf merged task T6172: 503 Backend fetch failed into T6171: 503 Backend fetch failed (again!).
Sun, Sep 13, 07:56 · Site Reliability Engineering
Dmitryf added a comment to T6172: 503 Backend fetch failed.

Sorry, when I edited, it reopened it.

Sun, Sep 13, 07:55 · Site Reliability Engineering
EoflaOE added a comment to T6172: 503 Backend fetch failed.

It happened to me too.

Sun, Sep 13, 07:47 · Site Reliability Engineering
Reception123 triaged T6172: 503 Backend fetch failed as Unbreak Now! priority.
Sun, Sep 13, 07:46 · Site Reliability Engineering

Wed, Sep 9

Reception123 added a project to T6161: Have dependabot auto suggest submodule updates: Extensions.
Wed, Sep 9, 19:49 · Extensions, Universal Omega, Site Reliability Engineering
Reception123 triaged T6161: Have dependabot auto suggest submodule updates as Low priority.
Wed, Sep 9, 19:49 · Extensions, Universal Omega, Site Reliability Engineering
Paladox closed T6094: gluster servers running out of space as Resolved.

It's rebalancing, which should hopefully reduce gluster1 disk usage (thus resolving that warning).

Wed, Sep 9, 15:26 · Site Reliability Engineering

Tue, Sep 8

Paladox added a comment to T6073: Resolve google detected 5xx errors.

Fixed the phab issue with https://github.com/miraheze/puppet/commit/8ea25cf319be946116166fc2635e61e2e1b6fe64

Tue, Sep 8, 23:22 · MediaWiki, Production Error, Site Reliability Engineering
Paladox closed T6093: Catch dns.resolver.NoAnswer properly inside reverse DNS check as Resolved.

Fixed with https://github.com/miraheze/puppet/pull/1491

Tue, Sep 8, 23:13 · Site Reliability Engineering, Monitoring
Paladox updated the task description for T6012: Update to Debian Buster 10.5.
Tue, Sep 8, 23:06 · Site Reliability Engineering, Security