
Void
Steward/SREAdministrator

User Details

User Since
Aug 8 2016, 22:25 (259 w, 4 d)
Roles
Administrator
Availability
Available
IRC Nickname
Voidwalker
GitHub User
The-Voidwalker
Miraheze User
Void [ Global Accounts ]

Recent Activity

Yesterday

Void committed rPUPC662f615519a8: prep for capacity upgrade (authored by Void).
Thu, Jul 29, 21:22
Void added a comment to T7139: MediaWiki Capacity Proposal.

@Reception123 can you review my plans on T7676, T7677, and T7678? I'm preparing pull requests in case we go ahead with them, but if we decide to take another approach, that's fine as well.

Thu, Jul 29, 21:11 · MediaWiki (SRE)
Void added a comment to T7676: [New] Server Request for mw12 and mw13.

Regarding IPv4 addresses, we will either need to repurpose the IPv4 addresses from jobrunner* (which would leave jobchron1 and task1 with no IPv4 address), or we need to purchase an additional two IPv4 addresses for these servers.

Thu, Jul 29, 21:07 · Infrastructure (SRE)
Void added a comment to T7677: [New] Server Request for jobchron.

In theory we can recycle jobrunner3 to jobchron1. We'd need to remove the mediawiki and jobrunner roles, update hostname and DNS, and then adjust hardware settings. However, a downside to this approach is that we would keep using a 47GB disk instead of a 10GB one. I don't believe we can shrink it.

Thu, Jul 29, 21:06 · Infrastructure (SRE)
Void added a comment to T7678: [New] Server Request for task.

I'm planning we recycle the existing jobrunner4 into this new task1 server. In theory, all we should need to do is update hostname and DNS, then a short restart to adjust hardware settings, and it should be good.

Thu, Jul 29, 21:04 · Infrastructure (SRE)
Void added a comment to T7713: Improve password standards.
In T7713#155094, @Void wrote:

I'm not certain I would reduce the maximum (currently 4096). In theory, the only problem with having a password longer than 128 characters is potential server load.

It comes from 2.1.1 and 2.1.2 of https://github.com/OWASP/ASVS/blob/v4.0.2/4.0/en/0x11-V2-Authentication.md#v21-password-security-requirements

Thu, Jul 29, 16:02 · Configuration, MediaWiki (SRE), Universal Omega, Trust & Safety
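
The two ASVS requirements cited above, as I read them, ask for a minimum of 12 characters (after runs of spaces are combined) and a cap of 128 characters. Below is a minimal sketch of such a length check, assuming those two limits. The function name and constants are hypothetical and are not taken from MediaWiki or the Miraheze configuration.

```python
import re

# Assumed limits, based on a reading of ASVS 2.1.1 and 2.1.2.
MIN_LEN = 12
MAX_LEN = 128


def password_length_ok(password, min_len=MIN_LEN, max_len=MAX_LEN):
    """Return True if the password length falls within the assumed limits.

    Runs of two or more spaces are collapsed to a single space before
    the length is measured.
    """
    combined = re.sub(r" {2,}", " ", password)
    return min_len <= len(combined) <= max_len
```

With the current maximum of 4096 mentioned in the thread, the same function could be called with `max_len=4096` to compare the two policies.
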
Void added a comment to T7713: Improve password standards.

I'm not certain I would reduce the maximum (currently 4096). In theory, the only problem with having a password longer than 128 characters is potential server load.

Thu, Jul 29, 14:19 · Configuration, MediaWiki (SRE), Universal Omega, Trust & Safety

Wed, Jul 28

Void updated the task description for T7710: Verify E-Mail address of user Swarup on https://worldsanskrit.net.
Wed, Jul 28, 22:05 · MediaWiki (SRE), MediaWiki
Void claimed T7676: [New] Server Request for mw12 and mw13.
Wed, Jul 28, 21:42 · Infrastructure (SRE)
Void claimed T7677: [New] Server Request for jobchron.
Wed, Jul 28, 21:42 · Infrastructure (SRE)
Void claimed T7678: [New] Server Request for task.
Wed, Jul 28, 21:42 · Infrastructure (SRE)
Void closed T7695: Loops does not properly limit loop count as Resolved.
Wed, Jul 28, 20:50 · MediaWiki (SRE), Extensions, Security, Universal Omega
Void claimed T7695: Loops does not properly limit loop count.
Wed, Jul 28, 20:40 · MediaWiki (SRE), Extensions, Security, Universal Omega
Void closed T7685: Change my Phabricator username to Emojiwiki, the same as Miraheze as Resolved.
Wed, Jul 28, 19:14 · Infrastructure (SRE), Phabricator
Void renamed Emojiwiki from Emojipedia to Emojiwiki.
Wed, Jul 28, 19:14
Void added a comment to T7711: Request Wiki restored.

Ping @Reception123. Looks to have been deleted in P344.

Wed, Jul 28, 19:09 · MediaWiki (SRE), MediaWiki
Void moved T7712: Security patch deployment system from Radar to Discussion on the Site Reliability Engineering board.
Wed, Jul 28, 19:01 · Goal-2021-Jul-Dec, MediaWiki, MediaWiki (SRE)
Void triaged T7712: Security patch deployment system as Low priority.
Wed, Jul 28, 19:01 · Goal-2021-Jul-Dec, MediaWiki, MediaWiki (SRE)
Void added a comment to R9:66ca12362dde: restrict access to loops.

Hello! Do you have an idea of when this extension is going to be re-enabled?

Wed, Jul 28, 17:11

Tue, Jul 27

Void added a comment to T7701: Rest.php OAUTH2 endpoint wrongly cached user identify endpoint.

@RhinosF1 per Owen's comments, this should wait until @Reception123 can review and approve.

Tue, Jul 27, 22:41 · MediaWiki (SRE), MediaWiki, Security
Void added a comment to T7701: Rest.php OAUTH2 endpoint wrongly cached user identify endpoint.

+1 on disclosure from me.

Tue, Jul 27, 21:17 · MediaWiki (SRE), MediaWiki, Security

Mon, Jul 26

Void added a comment to T7695: Loops does not properly limit loop count.

Has this been reported upstream?

Mon, Jul 26, 17:32 · MediaWiki (SRE), Extensions, Security, Universal Omega
Void added a comment to T7695: Loops does not properly limit loop count.

I've disabled Loops on all wikis. Should probably have done it differently, will fix in a moment.

Mon, Jul 26, 06:00 · MediaWiki (SRE), Extensions, Security, Universal Omega
Void added a comment to T7693: php-fpm workers keep running out.

Tentatively resolving based on private task, please reopen if issues persist.

Mon, Jul 26, 05:59 · MediaWiki, MediaWiki (SRE)
Void added a comment to T7695: Loops does not properly limit loop count.

Most likely cause of T7693

Mon, Jul 26, 05:53 · MediaWiki (SRE), Extensions, Security, Universal Omega
Void created T7695: Loops does not properly limit loop count.
Mon, Jul 26, 05:53 · MediaWiki (SRE), Extensions, Security, Universal Omega
Void renamed T7693: php-fpm workers keep running out from Regex Functions can utilize 100% CPU to php-fpm workers keep running out.
Mon, Jul 26, 05:12 · MediaWiki, MediaWiki (SRE)
Void renamed T7693: php-fpm workers keep running out from php-fpm worker timeout consumes 100% CPU to Regex Functions can utilize 100% CPU.
Mon, Jul 26, 05:08 · MediaWiki, MediaWiki (SRE)
Void updated the task description for T7693: php-fpm workers keep running out.
Mon, Jul 26, 03:03 · MediaWiki, MediaWiki (SRE)
Void renamed T7693: php-fpm workers keep running out from php-fpm workers keep running out to php-fpm worker timeout consumes 100% CPU.
Mon, Jul 26, 03:01 · MediaWiki, MediaWiki (SRE)

Sat, Jul 24

Void closed T7689: Request to enable ConfirmEdit extension as Invalid.

ConfirmEdit is enabled on all wikis by default.

Sat, Jul 24, 02:00 · MediaWiki (SRE), MediaWiki

Fri, Jul 23

Void added a comment to T7687: Update of Cargo for calender export query bugfix.

This bug fix, as well as apparently a number of other changes, appears to be available only on the master branch of the extension. We should consider updating to that, as we currently only use the REL1_36 branch. Not sure if we'd need to take any special measures when updating though.

Fri, Jul 23, 22:38 · MediaWiki (SRE), Extensions, Universal Omega

Wed, Jul 21

Void closed T7626: redis-server is occasionally killed for OOM as Resolved.

Tentatively closing, looks like we've stabilized.

Wed, Jul 21, 20:43 · Infrastructure (SRE)
Void moved T7683: is there a way to auto-redirect URL calls without /wiki/ forward as if it were typed? from Incoming to Long Term on the Infrastructure (SRE) board.

This would be difficult to implement, I believe. I think (for mediawiki sites) we'd only have to worry about /wiki and /w, but I'd be worried about static.miraheze.org and probably a few other things.

Wed, Jul 21, 20:41 · MediaWiki (SRE), Infrastructure (SRE), NGINX, Varnish
Void added a comment to T7626: redis-server is occasionally killed for OOM.

Literally a setting in puppet: https://git.io/Jlvs7

Wed, Jul 21, 02:37 · Infrastructure (SRE)
Void lowered the priority of T7634: Alert on low php-fpm workers from High to Low.

Tracked passively in Grafana. Correct me if I'm mistaken, but anything short of an obvious DoS attack from one source wouldn't be actionable outside of provisioning additional MW servers? If that is the case, then there are other things that take precedence at the moment.

Wed, Jul 21, 02:25 · Infrastructure (SRE), Monitoring
Void added a comment to T7573: User can't confirm email.

Has Outlook been made aware of the status of the subtask (a general message would suffice)? Or have we had any other reports come in from Outlook?

Wed, Jul 21, 02:10 · Mail, Infrastructure (SRE)
Void lowered the priority of T7626: redis-server is occasionally killed for OOM from Unbreak Now! to High.

Effectively stalled on T7139 unless we can prevent jobrunner3 from accepting high intensity jobs (assuming RequestWikiAIJob).

Wed, Jul 21, 02:06 · Infrastructure (SRE)
Void closed T7647: cp3 down - 2021-07-16 09:00 UK time as Resolved.

Technically resolved on reboot. No cause has been identified, but as a non-recurring issue, it isn't worth spending more time investigating.

Wed, Jul 21, 02:04 · Varnish, Infrastructure (SRE)
Void added a parent task for T7678: [New] Server Request for task: T7139: MediaWiki Capacity Proposal.
Wed, Jul 21, 02:02 · Infrastructure (SRE)
Void added a parent task for T7677: [New] Server Request for jobchron: T7139: MediaWiki Capacity Proposal.
Wed, Jul 21, 02:02 · Infrastructure (SRE)
Void added subtasks for T7139: MediaWiki Capacity Proposal: T7677: [New] Server Request for jobchron, T7678: [New] Server Request for task.
Wed, Jul 21, 02:02 · MediaWiki (SRE)
Void created T7678: [New] Server Request for task.
Wed, Jul 21, 02:01 · Infrastructure (SRE)
Void created T7677: [New] Server Request for jobchron.
Wed, Jul 21, 01:58 · Infrastructure (SRE)
Void updated subscribers of T7676: [New] Server Request for mw12 and mw13.
Wed, Jul 21, 01:54 · Infrastructure (SRE)
Void placed T7676: [New] Server Request for mw12 and mw13 up for grabs.
Wed, Jul 21, 01:52 · Infrastructure (SRE)
Void added a parent task for T7676: [New] Server Request for mw12 and mw13: T7139: MediaWiki Capacity Proposal.
Wed, Jul 21, 01:52 · Infrastructure (SRE)
Void added a subtask for T7139: MediaWiki Capacity Proposal: T7676: [New] Server Request for mw12 and mw13.
Wed, Jul 21, 01:52 · MediaWiki (SRE)
Void created T7676: [New] Server Request for mw12 and mw13.
Wed, Jul 21, 01:52 · Infrastructure (SRE)
Void added a comment to T7139: MediaWiki Capacity Proposal.

Can we move forward with this now? Current tasks such as T7626 and T7633 indicate we cannot wait on expanding our infrastructure.

Wed, Jul 21, 01:08 · MediaWiki (SRE)
Void added a comment to T7145: Upgrade MediaWiki cluster to Debian Bullseye.

I'm removing parent tasks, as I don't think it is reasonable to wait for and test Debian Bullseye in our infrastructure given our current capacity problems.

Wed, Jul 21, 01:02 · MediaWiki (SRE)
Void removed a subtask for T7139: MediaWiki Capacity Proposal: T7145: Upgrade MediaWiki cluster to Debian Bullseye.
Wed, Jul 21, 00:59 · MediaWiki (SRE)
Void removed a parent task for T7145: Upgrade MediaWiki cluster to Debian Bullseye: T7139: MediaWiki Capacity Proposal.
Wed, Jul 21, 00:59 · MediaWiki (SRE)

Tue, Jul 20

Void closed T7599: Graylog search not working as Resolved.

Operating at 77% disk usage, so looks good. Feel free to reopen if icinga reports a disk usage warning, or you see any disk usage warning in graylog.

Tue, Jul 20, 23:30 · Monitoring, Infrastructure (SRE)
Void added a comment to T7626: redis-server is occasionally killed for OOM.

Can we try temporarily disabling the RequestWikiAIJob to see if this alleviates the load? Or alternatively, could we prevent jobrunner3 from running these types of jobs?

Tue, Jul 20, 23:22 · Infrastructure (SRE)
Void lowered the priority of T7633: Persistent resource consumption is causing all sorts from Unbreak Now! to High.
Tue, Jul 20, 22:52 · MediaWiki (SRE), Cloud Infrastructure, Monitoring, MediaWiki
Void closed T7637: Provision mw12, a subtask of T7633: Persistent resource consumption is causing all sorts, as Declined.
Tue, Jul 20, 22:51 · MediaWiki (SRE), Cloud Infrastructure, Monitoring, MediaWiki
Void closed T7637: Provision mw12 as Declined.

Declining for now, we're set to expand resources soon, but don't have the capacity to do so immediately.

Tue, Jul 20, 22:51 · Infrastructure (SRE)
Void added a comment to T7666: LDAP guest user can send mail.

Follow up: https://github.com/miraheze/puppet/pull/1816

Tue, Jul 20, 00:28 · Infrastructure (SRE), Mail, Security

Mon, Jul 19

Void added a comment to T7593: DataDump is Vulnerable to CSRF Attacks.

Security advisory has been published, and CVE-2021-32774 was issued.

Mon, Jul 19, 21:36 · MediaWiki (SRE), DataDump, Security, Universal Omega
Void closed T7666: LDAP guest user can send mail, a subtask of T7573: User can't confirm email, as Resolved.
Mon, Jul 19, 19:28 · Mail, Infrastructure (SRE)
Void closed T7666: LDAP guest user can send mail as Resolved.

Scrambling the password has effectively solved this issue, but renders the guest account inaccessible for valid use cases (not that I think many users were using guest). We can follow up later with either restoring the account, or doing away with it entirely.

Mon, Jul 19, 19:28 · Infrastructure (SRE), Mail, Security
Void added a comment to T7666: LDAP guest user can send mail.

Confirmed guest account (the one we use for icinga - guest/guest) could be logged into on the mail server. I've scrambled that password for now (see K13). We'll have to see if this solves it. However, I also note that graylog suggests this has only been logged into today. It might not be the full cause, but it is the cause of those two pastes.

Mon, Jul 19, 17:51 · Infrastructure (SRE), Mail, Security
Void added a comment to T7665: Mass delete imported pages with nuke.

To clarify, Nuke does not show imported pages. This is an upstream bug, but I don't see it being fixed anytime soon, as it is over a decade old at this point.

Mon, Jul 19, 17:14 · Universal Omega, MediaWiki (SRE), MediaWiki

Sun, Jul 18

Void reassigned T7661: ManageWiki/Extensions inaccessible on Wikibase client wikis from Void to Universal_Omega.
Sun, Jul 18, 23:13 · ManageWiki, Universal Omega, Production Error, MediaWiki (SRE)
Void claimed T7661: ManageWiki/Extensions inaccessible on Wikibase client wikis.
Sun, Jul 18, 22:55 · ManageWiki, Universal Omega, Production Error, MediaWiki (SRE)
Void added a comment to T7125: Improve ManageWiki extension interface.

I think instead of having tabs on the form, we should have a filter that simply updates the visibility of the different items (I think we do something similar with the yearly Survey, where checking a checkbox makes more questions visible). This way we could default the page to showing all items, but also easily filter it down to categories. Additionally, depending on how it's implemented, it could display multiple categories at once, and the same item could be in multiple categories.

Sun, Jul 18, 19:43 · Universal Omega, ManageWiki, MediaWiki (SRE)
Void added a comment to T7626: redis-server is occasionally killed for OOM.

I have a process running on jobrunner3 that should report the full process information on any process that winds up getting killed by OOM. Hopefully it should tell us some more information about which processes are utilizing too much memory.

Sun, Jul 18, 03:59 · Infrastructure (SRE)
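
The comment above does not say how the reporting process works. As one possible approach, here is a minimal sketch that scans kernel log text for OOM kill lines and counts the victims by process name. The sample lines are illustrative only (not real output from jobrunner3), and the exact kernel message format varies between kernel versions, so the regular expression is an assumption.

```python
import re
from collections import Counter

# Matches the "Killed process <pid> (<name>)" portion of an OOM kill message.
OOM_KILL_RE = re.compile(r"Killed process (\d+) \(([^)]+)\)")

# Illustrative sample lines; real kernel output may differ in detail.
SAMPLE_LOG = """\
[1234.5] Out of memory: Killed process 4321 (redis-server) total-vm:1024kB, anon-rss:512kB
[1240.1] Out of memory: Killed process 4400 (php) total-vm:2048kB, anon-rss:900kB
[1300.9] Out of memory: Killed process 4501 (redis-server) total-vm:1024kB, anon-rss:530kB
"""


def count_oom_victims(log_text):
    """Count OOM-killed processes by name in the given log text."""
    victims = Counter()
    for line in log_text.splitlines():
        match = OOM_KILL_RE.search(line)
        if match:
            victims[match.group(2)] += 1
    return victims


if __name__ == "__main__":
    for name, count in count_oom_victims(SAMPLE_LOG).most_common():
        print(f"{name}: {count}")
```
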

Sat, Jul 17

Void claimed T7626: redis-server is occasionally killed for OOM.
Sat, Jul 17, 21:43 · Infrastructure (SRE)
Void lowered the priority of T7599: Graylog search not working from High to Low.

Still monitoring this, but our storage usage is down to 10GB per day of logs from 30GB per day. I think we can sustain this without difficulty.

Sat, Jul 17, 21:40 · Monitoring, Infrastructure (SRE)
Void added a comment to T7655: Varnish Rate Limit.

I'll note that loading in a large number of scripts at once (such as listing multiple scripts in your common.js) can cause this to happen, particularly if those scripts create additional requests to the server.

Sat, Jul 17, 21:19 · Universal Omega, MediaWiki (SRE), Infrastructure (SRE), Varnish
Void added a comment to T7654: I can view Administrator revdel edits as a user.

Flow is awkward, but if you're referring to Topic:Wcwif8hjydot2c1y, the post in question was hidden, not deleted. It therefore can be viewed by anyone with the flow-hide permission, which is everyone on the wiki.

Sat, Jul 17, 18:08 · MediaWiki (SRE), MediaWiki
Void added a comment to T7652: grimsiblingswiki broken.

Adding mahjongwiki as well

Sat, Jul 17, 01:06 · MediaWiki (SRE)
Void merged T7338: Investigate cause of wiki being created but not creation farmer log entry being created into T7626: redis-server is occasionally killed for OOM.
Sat, Jul 17, 01:05 · Infrastructure (SRE)
Void merged task T7338: Investigate cause of wiki being created but not creation farmer log entry being created into T7626: redis-server is occasionally killed for OOM.
Sat, Jul 17, 01:05 · CreateWiki, Universal Omega, MediaWiki (SRE)
Void raised the priority of T7626: redis-server is occasionally killed for OOM from High to Unbreak Now!.

Jobrunner3 is showing 155 Out of memory issues in the past 24 hours, killing several processes, including redis repeatedly.

Sat, Jul 17, 00:57 · Infrastructure (SRE)
Void triaged T7652: grimsiblingswiki broken as Normal priority.
Sat, Jul 17, 00:55 · MediaWiki (SRE)

Fri, Jul 16

Void lowered the priority of T7647: cp3 down - 2021-07-16 09:00 UK time from Unbreak Now! to Normal.

There's no indication of a cause anywhere in the cp3 logs. How was the outage reported and verified?

Fri, Jul 16, 23:48 · Varnish, Infrastructure (SRE)
Void added a comment to T7373: Investigate cause of redis server error (socket error on read socket) when CreateWiki Extension creates a wiki.

Could be T7626?

Fri, Jul 16, 03:55 · Universal Omega, MediaWiki (SRE), MediaWiki

Tue, Jul 13

Void added a comment to T7637: Provision mw12.

We don't have any available IPv4 addresses, so this may have to wait. Unless we want an MW server to be only available over IPv6. In any case, I'm not actually available for the rest of today, so this would be done tomorrow at the earliest.

Tue, Jul 13, 23:05 · Infrastructure (SRE)
Void moved T7626: redis-server is occasionally killed for OOM from Incoming to Short Term on the Infrastructure (SRE) board.
Tue, Jul 13, 22:11 · Infrastructure (SRE)
Void moved T7573: User can't confirm email from Short Term to External on the Infrastructure (SRE) board.
Tue, Jul 13, 22:11 · Mail, Infrastructure (SRE)
Void moved T7605: Convert puppet .service files to override.conf files from Incoming to Short Term on the Infrastructure (SRE) board.
Tue, Jul 13, 22:10 · Infrastructure (SRE), Puppet
Void moved T7637: Provision mw12 from Incoming to Short Term on the Infrastructure (SRE) board.
Tue, Jul 13, 22:09 · Infrastructure (SRE)
Void added a subtask for T7633: Persistent resource consumption is causing all sorts: T7637: Provision mw12.
Tue, Jul 13, 22:08 · MediaWiki (SRE), Cloud Infrastructure, Monitoring, MediaWiki
Void added a parent task for T7637: Provision mw12: T7633: Persistent resource consumption is causing all sorts.
Tue, Jul 13, 22:08 · Infrastructure (SRE)
Void triaged T7637: Provision mw12 as High priority.
Tue, Jul 13, 22:08 · Infrastructure (SRE)
Void lowered the priority of T7634: Alert on low php-fpm workers from Unbreak Now! to High.
Tue, Jul 13, 22:06 · Infrastructure (SRE), Monitoring
Void closed T7467: [Access Request] Void as Resolved.

Have access now to OVH/RN/Proxmox. I believe that's everything.

Tue, Jul 13, 22:05 · Site Reliability Engineering
Void added a comment to T7599: Graylog search not working.

FYI if anyone needs to get graylog working again, go to System > Indices > Default index set, and delete the oldest indexes (oldest at bottom) until there is at least 15% disk space available.

Tue, Jul 13, 14:17 · Monitoring, Infrastructure (SRE)
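
The cleanup rule described above (delete the oldest indexes until at least 15% of the disk is free) can be sketched as a small planning function. This is only an illustration of the arithmetic. It does not call Graylog, and the index names and sizes are hypothetical. The actual deletion is done through the Graylog interface as described in the comment.

```python
def indices_to_delete(total_bytes, free_bytes, indices_oldest_first,
                      min_free_fraction=0.15):
    """Return the names of the oldest indices to delete.

    indices_oldest_first is a list of (name, size_in_bytes) tuples,
    ordered from oldest to newest. Indices are selected oldest first
    until the projected free space reaches min_free_fraction of the disk.
    """
    target_free = total_bytes * min_free_fraction
    selected = []
    for name, size in indices_oldest_first:
        if free_bytes >= target_free:
            break
        selected.append(name)
        free_bytes += size  # space reclaimed by deleting this index
    return selected


if __name__ == "__main__":
    # Hypothetical numbers: a 100-unit disk with 5 units free.
    indices = [("graylog_0", 4), ("graylog_1", 4),
               ("graylog_2", 4), ("graylog_3", 4)]
    print(indices_to_delete(100, 5, indices))
```

If the selected list does not free enough space (for example, because all indices are small), the function simply returns every index it was given, so the caller should check the projected free space separately.
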
Void lowered the priority of T7599: Graylog search not working from Unbreak Now! to High.

Cleared some disk space, will need to monitor if our recent changes are sufficient to prevent any further issues.

Tue, Jul 13, 14:15 · Monitoring, Infrastructure (SRE)
Void moved T7599: Graylog search not working from Incoming to Short Term on the Infrastructure (SRE) board.
Tue, Jul 13, 01:21 · Monitoring, Infrastructure (SRE)
Void added a comment to T7599: Graylog search not working.

Created https://github.com/miraheze/mw-config/pull/3997, but not yet willing to merge. Thoughts?

Tue, Jul 13, 01:21 · Monitoring, Infrastructure (SRE)

Sun, Jul 11

Void triaged T7626: redis-server is occasionally killed for OOM as High priority.
Sun, Jul 11, 21:49 · Infrastructure (SRE)
Void added a comment to T7509: Change CAPTCHA to ReCaptcha v3.

@Void From what I understand from @Universal Omega, if you don't pass the CAPTCHA, it tells you that you entered the wrong password instead of telling you that the CAPTCHA failed. Do you have any idea how to change that in our MirahezeMagic version in order to get a different message?

Sun, Jul 11, 18:06 · MediaWiki (SRE), Extensions, Universal Omega
Void added a comment to T7619: A suggestion for a default global extension for the Miraheze farm.

According to upstream, the bug in T1404 appears to still be present. Enabling this extension globally is therefore very likely to cause rendering inconsistencies in VisualEditor.

Sun, Jul 11, 18:04 · Extensions, Universal Omega, MediaWiki (SRE)

Thu, Jul 8

Void triaged T7605: Convert puppet .service files to override.conf files as Normal priority.
Thu, Jul 8, 04:02 · Infrastructure (SRE), Puppet
Void created T7605: Convert puppet .service files to override.conf files.
Thu, Jul 8, 04:02 · Infrastructure (SRE), Puppet
Void changed the visibility for T7593: DataDump is Vulnerable to CSRF Attacks.
Thu, Jul 8, 01:48 · DataDump, MediaWiki (SRE), Security, Universal Omega
Void closed T7593: DataDump is Vulnerable to CSRF Attacks as Resolved.
Thu, Jul 8, 01:46 · DataDump, MediaWiki (SRE), Security, Universal Omega