Thu, Mar 28

Universal_Omega closed T11987: SSL script broke, not committing public keys as Resolved.

I have switched the SSL bot to the WikiTideSSLBot account. Please let me know if there are any issues with certs now.

Thu, Mar 28, 18:03 · Infrastructure (SRE), SSL
MacFan4000 added a comment to T11987: SSL script broke, not committing public keys.

It would be great if this could be looked into as SSL requests are starting to pile up.

Thu, Mar 28, 15:47 · Infrastructure (SRE), SSL

Tue, Mar 26

OrangeStar closed T11851: check_reverse_dns should contact authoritative nameservers for the TLD directly when checking if we're the authoritative nameservers of a domain as Declined.

Using RDAP (preferably) or WHOIS is a better solution for these kinds of issues.
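
A minimal sketch of what an RDAP-based check could look like, assuming the RDAP domain object has already been fetched and decoded from JSON. The helper names and the sample hostnames are hypothetical; the only thing taken from the RDAP JSON format is that a domain object carries a `nameservers` array whose entries have an `ldhName` field.

```python
from typing import Any, Dict, Iterable, Set


def nameservers_from_rdap(domain_object: Dict[str, Any]) -> Set[str]:
    """Collect the nameserver hostnames listed in a decoded RDAP domain object."""
    names: Set[str] = set()
    for entry in domain_object.get("nameservers", []):
        ldh = entry.get("ldhName")
        if ldh:
            # Normalise case and drop a trailing root dot so names compare equal.
            names.add(ldh.rstrip(".").lower())
    return names


def delegated_to_us(domain_object: Dict[str, Any], our_nameservers: Iterable[str]) -> bool:
    """True only if the domain lists nameservers and all of them are ours."""
    found = nameservers_from_rdap(domain_object)
    ours = {ns.rstrip(".").lower() for ns in our_nameservers}
    return bool(found) and found <= ours
```

A check script could call `delegated_to_us` with the registry's answer and its own nameserver list, and fall back to WHOIS when no RDAP service is available for the TLD.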

Tue, Mar 26, 17:49 · SRE Automation, Monitoring, SSL, Infrastructure (SRE)

Mon, Mar 25

Universal_Omega added a comment to T11987: SSL script broke, not committing public keys.

At this point I think our only option may be to switch to a new account if we can't retrieve access to MirahezeSSLBot. But we'll see if we can find a recovery code or something first. I believe that is still controlled by John and we should retrieve access somehow.

Mon, Mar 25, 03:28 · Infrastructure (SRE), SSL
MacFan4000 triaged T11987: SSL script broke, not committing public keys as High priority.
Mon, Mar 25, 00:34 · Infrastructure (SRE), SSL

Sun, Mar 24

MacFan4000 created T11987: SSL script broke, not committing public keys.
Sun, Mar 24, 22:41 · Infrastructure (SRE), SSL
Universal_Omega lowered the priority of T8845: Allow Icinga to generate Phorge tasks for Critical alerts from Normal to Low.
Sun, Mar 24, 06:26 · Phorge, Monitoring, Infrastructure (SRE)
Universal_Omega lowered the priority of T8847: Icinga docs entries for all Infrastructure monitoring from Normal to Low.
Sun, Mar 24, 06:26 · Documentation, Monitoring, Infrastructure (SRE)
Universal_Omega changed the status of T8847: Icinga docs entries for all Infrastructure monitoring from Open to In progress.
Sun, Mar 24, 06:24 · Documentation, Monitoring, Infrastructure (SRE)
Universal_Omega lowered the priority of T11680: Create Miraheze/python-functions github repo & python package from Normal to Low.
Sun, Mar 24, 06:20 · Infrastructure (SRE), SRE Automation

Sat, Mar 23

Universal_Omega added a comment to T11275: API Requests to Wikibase Repositories are blocked.

This has been open for a while. I thought someone was going to come up with an idea for doing this automatically, but I guess not.

Let's keep it simple then. I propose we just have an array of wikis with Wikibase client, plus the wikis those wikis have to contact, and send the appropriate CORS headers. Those who want to do the same thing as @Redmin here will have to open a phab task for their wikis to be added to this array. This would be done in https://github.com/miraheze/puppet/blob/master/modules/varnish/templates/default.vcl#L231. Sound good to the SRE team?
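
To illustrate the proposal, here is a rough sketch in Python rather than VCL. It assumes the array is keyed by the client wiki's origin and lists the repository hosts that wiki may contact; the mapping, the hostnames and the function name are all hypothetical, and the real change would live in the Varnish template linked above.

```python
from typing import Dict, Set

# Hypothetical allowlist: origin of a wiki running Wikibase client
# -> hosts of the repository wikis it is allowed to contact.
CLIENT_TO_REPOS: Dict[str, Set[str]] = {
    "https://clientwiki.example.org": {"repowiki.example.org"},
}


def cors_headers(request_host: str, origin: str) -> Dict[str, str]:
    """Return the CORS headers to add to a response, or {} if none apply."""
    allowed_hosts = CLIENT_TO_REPOS.get(origin, set())
    if request_host not in allowed_hosts:
        # Unknown origin, or an origin not allowed to reach this wiki.
        return {}
    return {
        # Echo the specific origin instead of using a wildcard.
        "Access-Control-Allow-Origin": origin,
        # Caches must keep responses for different origins apart.
        "Vary": "Origin",
    }
```

Adding a wiki then only means adding one entry to the mapping, which matches the "open a task to be added to the array" workflow.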

Sat, Mar 23, 06:47 · Infrastructure (SRE)
Reception123 closed T11909: Mention Special:ManageWiki/extensions on the Feature Request Maniphest form as Resolved.

Will make the change now, though I think what is really needed is a wider reorganization of https://meta.miraheze.org/wiki/Request_features and the current forms we have on Phorge. Especially since eventually there will be very few tasks like imports (even images) that will still be done on Phorge. But it's probably worth waiting until more stuff is automated (at least images I'd say).

Sat, Mar 23, 06:19 · Infrastructure (SRE), Documentation, Phorge
Universal_Omega added a project to T11909: Mention Special:ManageWiki/extensions on the Feature Request Maniphest form: Infrastructure (SRE).
Sat, Mar 23, 06:17 · Infrastructure (SRE), Documentation, Phorge
Universal_Omega lowered the priority of T11744: Create db162 (or db172?) and migrate core databases there from Normal to Low.
Sat, Mar 23, 06:06 · Infrastructure (SRE), Database
Universal_Omega lowered the priority of T11730: Rebalance database servers from Normal to Low.
Sat, Mar 23, 06:06 · Infrastructure (SRE), Database
Universal_Omega claimed T8847: Icinga docs entries for all Infrastructure monitoring.
Sat, Mar 23, 06:03 · Documentation, Monitoring, Infrastructure (SRE)
Universal_Omega claimed T8845: Allow Icinga to generate Phorge tasks for Critical alerts.
Sat, Mar 23, 06:02 · Phorge, Monitoring, Infrastructure (SRE)
Universal_Omega renamed T8845: Allow Icinga to generate Phorge tasks for Critical alerts from Allow Icinga to generate Phabricator tasks for Critical alerts to Allow Icinga to generate Phorge tasks for Critical alerts.
Sat, Mar 23, 06:02 · Phorge, Monitoring, Infrastructure (SRE)

Mar 14 2024

Universal_Omega added a project to T11925: OrangeStar's LDAP account & Graylog access: Infrastructure (SRE).
Mar 14 2024, 19:32 · Infrastructure (SRE), Security

Mar 9 2024

Reception123 lowered the priority of T11934: Request for a TSPortal test server from Normal to Low.
Mar 9 2024, 08:00 · Infrastructure (SRE)
Universal_Omega added a comment to T11934: Request for a TSPortal test server.

I will consider this request after talking to a few others in both SRE and T&S to find out what the future of TSPortal is anyway. As a T&S member, I think it helps primarily for DPA; the other aspects of it are pretty buggy sometimes and may not really be worth maintaining if we choose not to utilize it, in which case this request may be unnecessary...

Mar 9 2024, 06:52 · Infrastructure (SRE)
Collei triaged T11934: Request for a TSPortal test server as Normal priority.
Mar 9 2024, 06:49 · Infrastructure (SRE)
Collei triaged T11940: Emails for password reset/account creation/email confirmation not sending as High priority.
Mar 9 2024, 06:48 · MediaWiki (SRE), MediaWiki

Mar 8 2024

MacFan4000 updated subscribers of T11940: Emails for password reset/account creation/email confirmation not sending.
Mar 8 2024, 22:32 · MediaWiki (SRE), MediaWiki
MacFan4000 added a project to T11940: Emails for password reset/account creation/email confirmation not sending: Infrastructure (SRE).
Mar 8 2024, 22:31 · MediaWiki (SRE), MediaWiki

Mar 6 2024

OrangeStar created T11934: Request for a TSPortal test server.
Mar 6 2024, 12:54 · Infrastructure (SRE)

Feb 28 2024

Collei updated the task description for T11891: 500 Internal Server Error - uploading images, editing pages, and taking other actions.
Feb 28 2024, 07:05 · Infrastructure (SRE), Varnish, MediaWiki, Production Error
Collei updated the task description for T11891: 500 Internal Server Error - uploading images, editing pages, and taking other actions.
Feb 28 2024, 07:04 · Infrastructure (SRE), Varnish, MediaWiki, Production Error

Feb 26 2024

Xena added a comment to T11891: 500 Internal Server Error - uploading images, editing pages, and taking other actions.

A user on Discord has reported it happening again, so it's possible the issue wasn't fully resolved.

Feb 26 2024, 17:00 · Infrastructure (SRE), Varnish, MediaWiki, Production Error
Collei added a comment to T11891: 500 Internal Server Error - uploading images, editing pages, and taking other actions.

Sounds good

Feb 26 2024, 05:13 · Infrastructure (SRE), Varnish, MediaWiki, Production Error
Dicto added a comment to T11891: 500 Internal Server Error - uploading images, editing pages, and taking other actions.

Hmm, that's weird, but now I no longer get Error 500 either when importing pages on gameshows or when editing with the code editor on chernowiki. Looks like the problem is actually resolved.

Feb 26 2024, 04:13 · Infrastructure (SRE), Varnish, MediaWiki, Production Error
Collei added a comment to T11891: 500 Internal Server Error - uploading images, editing pages, and taking other actions.

Visual editor being broken is already tracked in T11903. As for the other issues, can you reproduce this on any wikis other than gameshowswiki?

Feb 26 2024, 01:15 · Infrastructure (SRE), Varnish, MediaWiki, Production Error

Feb 25 2024

Dicto added a comment to T11891: 500 Internal Server Error - uploading images, editing pages, and taking other actions.

I still get Error 500 when I try to import pages on gameshows.miraheze.org. Small XML files go through fine, while large ones (around 750 KB) fail.

Feb 25 2024, 23:52 · Infrastructure (SRE), Varnish, MediaWiki, Production Error
Universal_Omega triaged T11902: Implement auto renewals for some wildcard domains in LetsEncrypt as Normal priority.
Feb 25 2024, 18:34 · SRE Automation, Infrastructure (SRE), SSL, Puppet, DNS
Agent_Isai closed T11891: 500 Internal Server Error - uploading images, editing pages, and taking other actions as Resolved.

Once again purged 13-16G of Varnish logs.

Feb 25 2024, 13:54 · Infrastructure (SRE), Varnish, MediaWiki, Production Error
Collei renamed T11891: 500 Internal Server Error - uploading images, editing pages, and taking other actions from 500 Internal Server Error - uploading images and editing pages to 500 Internal Server Error - uploading images, editing pages, and taking other actions.
Feb 25 2024, 01:54 · Infrastructure (SRE), Varnish, MediaWiki, Production Error
Collei merged T11900: XML ImportDump feature gives a "500 Internal Server" error into T11891: 500 Internal Server Error - uploading images, editing pages, and taking other actions.
Feb 25 2024, 01:53 · Infrastructure (SRE), Varnish, MediaWiki, Production Error
Collei added a comment to T11891: 500 Internal Server Error - uploading images, editing pages, and taking other actions.

Several Discord users have reported this occurring recently and more frequently

Feb 25 2024, 01:51 · Infrastructure (SRE), Varnish, MediaWiki, Production Error

Feb 24 2024

RhinosF1 edited projects for T11891: 500 Internal Server Error - uploading images, editing pages, and taking other actions, added: Infrastructure (SRE); removed MediaWiki (SRE).
Feb 24 2024, 22:47 · Infrastructure (SRE), Varnish, MediaWiki, Production Error
Collei merged T11897: Configure CORS between wikis into T11275: API Requests to Wikibase Repositories are blocked.
Feb 24 2024, 21:28 · Infrastructure (SRE)

Feb 20 2024

MacFan4000 removed a member for Infrastructure (SRE): Paladox.
Feb 20 2024, 04:43
MacFan4000 removed a member for Infrastructure (SRE): Owen.
Feb 20 2024, 04:43

Feb 16 2024

OrangeStar placed T11857: Don't serve the MediaWiki-oriented CSP on Phorge up for grabs.
Feb 16 2024, 17:42 · Infrastructure (SRE)
OrangeStar claimed T11857: Don't serve the MediaWiki-oriented CSP on Phorge.
Feb 16 2024, 17:35 · Infrastructure (SRE)

Feb 15 2024

OrangeStar triaged T11857: Don't serve the MediaWiki-oriented CSP on Phorge as Low priority.
Feb 15 2024, 19:22 · Infrastructure (SRE)
Reception123 lowered the priority of T11033: Wiki deletion script run from Normal to Low.

Not a huge priority anymore

Feb 15 2024, 16:54 · MediaWiki, Infrastructure (SRE), Database
Reception123 added a comment to T11730: Rebalance database servers.

Oh, my bad. I assumed that this was done during the migration and this task just wasn't closed.

Feb 15 2024, 16:41 · Infrastructure (SRE), Database
Agent_Isai updated the task description for T11730: Rebalance database servers.
Feb 15 2024, 16:37 · Infrastructure (SRE), Database
Agent_Isai reopened T11730: Rebalance database servers, a subtask of T11729: Migrate databases to new cloud servers, as Open.
Feb 15 2024, 16:37 · Infrastructure (SRE), Database
Agent_Isai reopened T11730: Rebalance database servers as "Open".

We should still rebalance the successors.

Feb 15 2024, 16:37 · Infrastructure (SRE), Database
Reception123 added a comment to T11744: Create db162 (or db172?) and migrate core databases there.

I've been thinking... would this actually solve anything? The new global DB could still go down and take down the farm... however I do suppose it would be less likely...

What resources do you propose, CPU and RAM wise?

Feb 15 2024, 16:34 · Infrastructure (SRE), Database
Reception123 closed T11730: Rebalance database servers, a subtask of T11729: Migrate databases to new cloud servers, as Resolved.
Feb 15 2024, 16:32 · Infrastructure (SRE), Database
Reception123 closed T11730: Rebalance database servers as Resolved.

The DBs have been migrated and the ones mentioned in this task no longer exist.

Feb 15 2024, 16:32 · Infrastructure (SRE), Database
Reception123 lowered the priority of T11768: Misleading messages from icinga rDNS checks regarding unregistered domains from Normal to Low.

Triaging as Low, since domains that are not pointed usually aren't even removed on sight.

Feb 15 2024, 16:29 · SRE Automation, Infrastructure (SRE)

Feb 13 2024

OrangeStar renamed T11851: check_reverse_dns should contact authoritative nameservers for the TLD directly when checking if we're the authoritative nameservers of a domain from check_reverse_dns should contact authoritative nameservers for the TLD directly on DNS checks to check_reverse_dns should contact authoritative nameservers for the TLD directly when checking if we're the authoritative nameservers of a domain.
Feb 13 2024, 20:37 · SRE Automation, Monitoring, SSL, Infrastructure (SRE)
RhinosF1 added projects to T11851: check_reverse_dns should contact authoritative nameservers for the TLD directly when checking if we're the authoritative nameservers of a domain: Monitoring, SRE Automation.
Feb 13 2024, 20:32 · SRE Automation, Monitoring, SSL, Infrastructure (SRE)
Reception123 triaged T11851: check_reverse_dns should contact authoritative nameservers for the TLD directly when checking if we're the authoritative nameservers of a domain as Low priority.
Feb 13 2024, 20:29 · SRE Automation, Monitoring, SSL, Infrastructure (SRE)
OrangeStar created T11851: check_reverse_dns should contact authoritative nameservers for the TLD directly when checking if we're the authoritative nameservers of a domain.
Feb 13 2024, 20:29 · SRE Automation, Monitoring, SSL, Infrastructure (SRE)
Universal_Omega lowered the priority of T11846: Alert on CirrusSearchElasticaWrite Job count from High to Normal.
Feb 13 2024, 06:16 · OpenSearch, Infrastructure (SRE), Monitoring

Feb 12 2024

RhinosF1 triaged T11846: Alert on CirrusSearchElasticaWrite Job count as High priority.
Feb 12 2024, 20:28 · OpenSearch, Infrastructure (SRE), Monitoring

Feb 7 2024

OrangeStar created T11809: Drop MediaWiki 1.40 from the CI pipeline.
Feb 7 2024, 16:16 · MediaWiki (SRE), Extensions

Feb 5 2024

Universal_Omega closed T11756: Rename phabricator.miraheze.org as Resolved.

This is now done.

Feb 5 2024, 18:43 · Phorge, Infrastructure (SRE)

Feb 4 2024

Universal_Omega edited projects for T11756: Rename phabricator.miraheze.org , added: Phorge; removed Phabricator.
Feb 4 2024, 00:19 · Phorge, Infrastructure (SRE)

Feb 3 2024

Universal_Omega closed T11457: Move sessions to either it's own memcache instance or to redis as Resolved.

This is now done.

Feb 3 2024, 22:02 · Infrastructure (SRE)
OrangeStar added a comment to T11770: Consider migrating to the Caddy webserver in our cache proxies.

It would be easier because you wouldn't need a command anymore; it would be fully automatic, which is the point of T11710. When Caddy sees a new domain name for the first time, it queries an HTTP server, and if it gets a 200 OK as a response, it generates a certificate for that domain. This all happens in the timespan of the first TLS ClientHello for that domain.
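
A minimal sketch of the HTTP server side of that flow, assuming the domain name arrives as a `domain` query parameter (my understanding of how Caddy's on-demand TLS `ask` endpoint works). The approved-domain set, the port and the function names are hypothetical; in practice the list would presumably come from approved RequestSSL requests.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical set of custom domains that have been approved.
APPROVED_DOMAINS = {"wiki.example.org"}


def ask_status(path: str) -> int:
    """Decide the HTTP status for an 'ask' request path such as /ask?domain=x."""
    query = parse_qs(urlparse(path).query)
    domain = query.get("domain", [""])[0].lower()
    # 200 lets the certificate be issued; anything else refuses it.
    return 200 if domain in APPROVED_DOMAINS else 403


class AskHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        self.send_response(ask_status(self.path))
        self.end_headers()


def serve(port: int = 9123) -> None:
    """Run the endpoint on localhost (blocks until interrupted)."""
    HTTPServer(("127.0.0.1", port), AskHandler).serve_forever()
```

Keeping the decision in `ask_status` separates the allowlist logic from the HTTP plumbing, so it can be checked without starting a server.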

Feb 3 2024, 17:23 · Infrastructure (SRE), RequestSSL
Reception123 added a comment to T11770: Consider migrating to the Caddy webserver in our cache proxies.

I think I'm a bit confused by the terms here, since technically certificates are already generated "automatically" (i.e. just with a command), so how would this make it easier for them to be generated based on a request sent by RequestSSL?

Feb 3 2024, 16:16 · Infrastructure (SRE), RequestSSL
Reception123 triaged T11771: Improve cache proxy performance outside cp36/37 as Normal priority.
Feb 3 2024, 16:13 · Infrastructure (SRE)
Reception123 triaged T11770: Consider migrating to the Caddy webserver in our cache proxies as Low priority.
Feb 3 2024, 16:12 · Infrastructure (SRE), RequestSSL
Paladox renamed T11771: Improve cache proxy performance outside cp36/37 from Redesign cache proxy infra to Improve cache proxy performance outside cp36/37.
Feb 3 2024, 14:44 · Infrastructure (SRE)
Paladox updated the task description for T11771: Improve cache proxy performance outside cp36/37.
Feb 3 2024, 13:23 · Infrastructure (SRE)
Paladox created T11771: Improve cache proxy performance outside cp36/37.
Feb 3 2024, 13:22 · Infrastructure (SRE)
OrangeStar added a comment to T11770: Consider migrating to the Caddy webserver in our cache proxies.

https://github.com/caddyserver/nginx-adapter could make such a migration possible, but of course rewriting the config in JSON or Caddyfile would be best.

Feb 3 2024, 13:01 · Infrastructure (SRE), RequestSSL
OrangeStar updated the task description for T11770: Consider migrating to the Caddy webserver in our cache proxies.
Feb 3 2024, 12:09 · Infrastructure (SRE), RequestSSL
OrangeStar added a comment to T11770: Consider migrating to the Caddy webserver in our cache proxies.

Adding the RequestSSL tag since, while this is Miraheze-specific, it is of interest to me as a developer. If this is done it will define how I approach the Miraheze-specific hook handlers related to RequestSSL.

Feb 3 2024, 12:06 · Infrastructure (SRE), RequestSSL
OrangeStar created T11770: Consider migrating to the Caddy webserver in our cache proxies.
Feb 3 2024, 12:06 · Infrastructure (SRE), RequestSSL
OrangeStar renamed T11768: Misleading messages from icinga rDNS checks regarding unregistered domains from Misleading messages from icinga rDNS checks regarding domains not pointed correctly to Misleading messages from icinga rDNS checks regarding unregistered domains.
Feb 3 2024, 10:40 · SRE Automation, Infrastructure (SRE)
Agent_Isai added a comment to T11743: Deploy CirrusSearch.

Now deployed. If a wiki wishes to have it enabled, they can request it at Steward requests.

Feb 3 2024, 06:50 · Extensions, OpenSearch, Infrastructure (SRE), MediaWiki (SRE)
Agent_Isai closed T11743: Deploy CirrusSearch as Resolved.
Feb 3 2024, 06:49 · Extensions, OpenSearch, Infrastructure (SRE), MediaWiki (SRE)
Universal_Omega added a comment to T10642: Self-host the CVT feed bot.

Just as a quick update, even though this was already resolved: I did https://github.com/miraheze/puppet/pull/3731 to fully puppetize this, including the build, so it should be installed 100% automatically on any new servers now.

Feb 3 2024, 04:26 · Monitoring, Infrastructure (SRE)
Universal_Omega closed T10642: Self-host the CVT feed bot as Resolved.
Feb 3 2024, 02:03 · Monitoring, Infrastructure (SRE)
Universal_Omega added a comment to T10642: Self-host the CVT feed bot.

https://github.com/Universal-Omega/CVTBot/commit/a2b07eb14ef9ff34c4428b42d80c2b3a2c9db91e removed the mono dependency to make this work. Then https://github.com/miraheze/puppet/pull/3727 for making it work on Miraheze. That patch is currently running on mon181 which seems to work!

Feb 3 2024, 00:41 · Monitoring, Infrastructure (SRE)

Feb 2 2024

OrangeStar updated the task description for T11768: Misleading messages from icinga rDNS checks regarding unregistered domains.
Feb 2 2024, 23:13 · SRE Automation, Infrastructure (SRE)
OrangeStar updated the task description for T11768: Misleading messages from icinga rDNS checks regarding unregistered domains.
Feb 2 2024, 23:12 · SRE Automation, Infrastructure (SRE)
RhinosF1 triaged T11768: Misleading messages from icinga rDNS checks regarding unregistered domains as Normal priority.
Feb 2 2024, 20:01 · SRE Automation, Infrastructure (SRE)
OrangeStar added a comment to T11768: Misleading messages from icinga rDNS checks regarding unregistered domains.

Fix for this would be having a separate exception handler for NXDOMAIN, instead of bundling it together with NoAnswer (https://github.com/miraheze/puppet/blob/fec5c1dfa8dd4592a727c41bc4e29155c229feca/modules/monitoring/files/check_reverse_dns.py#L125).
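
A rough sketch of the suggested split. The two exception classes below are local stand-ins so the example runs on its own; with dnspython they would be `dns.resolver.NXDOMAIN` and `dns.resolver.NoAnswer`. The function name, the `resolve` callable and the message wording are hypothetical and not taken from check_reverse_dns.py.

```python
from typing import Callable, List


class NXDOMAIN(Exception):
    """Stand-in for dns.resolver.NXDOMAIN: the name does not exist."""


class NoAnswer(Exception):
    """Stand-in for dns.resolver.NoAnswer: the name exists, but has no such record."""


def describe_lookup(domain: str, resolve: Callable[[str], List[str]]) -> str:
    """Run a lookup and report each failure mode with its own message."""
    try:
        records = resolve(domain)
    except NXDOMAIN:
        # Handled separately so unregistered domains are not reported
        # as if they were merely pointed somewhere else.
        return f"{domain} does not exist (NXDOMAIN); it may be unregistered"
    except NoAnswer:
        return f"{domain} exists but returned no records of the requested type"
    return f"{domain} resolved to {', '.join(records)}"
```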

Feb 2 2024, 20:00 · SRE Automation, Infrastructure (SRE)
OrangeStar created T11768: Misleading messages from icinga rDNS checks regarding unregistered domains.
Feb 2 2024, 19:59 · SRE Automation, Infrastructure (SRE)

Feb 1 2024

Paladox triaged T11765: Look at deploying strongswan (IPSec) as Low priority.
Feb 1 2024, 16:43 · Infrastructure (SRE)
Universal_Omega closed T11754: Some servers missing from Grafana as Resolved.

Per above. Please do reopen if you notice others missing though.

Feb 1 2024, 12:09 · Infrastructure (SRE), Monitoring
Universal_Omega closed T11458: Operate redis as two instances for jobrunner as Declined.

For now, I don't believe this is needed, but if it becomes apparent that it is, we can do so eventually.

Feb 1 2024, 11:50 · Redis-JobRunner, Infrastructure (SRE)

Jan 31 2024

MacFan4000 added a comment to T11756: Rename phabricator.miraheze.org .

I did think of that, but I also agree that issue-tracker is a bit too long and making it simple is better. It's kinda difficult regardless of what is used; all options have some negatives...

Jan 31 2024, 22:41 · Phorge, Infrastructure (SRE)
MacFan4000 added a comment to T11756: Rename phabricator.miraheze.org .

Also, for Phabricator CDN, we should use wikitide.net as miraheze.wiki is being deprecated and might be dropped outright one day.

Jan 31 2024, 22:37 · Phorge, Infrastructure (SRE)
Universal_Omega added a comment to T11756: Rename phabricator.miraheze.org .

I did think of that, but I also agree that issue-tracker is a bit too long and making it simple is better. It's kinda difficult regardless of what is used; all options have some negatives...

Jan 31 2024, 20:08 · Phorge, Infrastructure (SRE)
Agent_Isai added a comment to T11756: Rename phabricator.miraheze.org .

I did think about that, but not everything on Phabricator is an issue. I might be getting too much into the specifics and semantics, though.

Jan 31 2024, 20:05 · Phorge, Infrastructure (SRE)
Universal_Omega added a comment to T11756: Rename phabricator.miraheze.org .

After further consideration, I think maybe issues.* would be better for simplicity's sake, as Labster recommended. But I'm not 100% decided on which is best yet.

Jan 31 2024, 20:04 · Phorge, Infrastructure (SRE)
MacFan4000 added a comment to T11756: Rename phabricator.miraheze.org .

issue-tracker sounds good to me.

Jan 31 2024, 20:01 · Phorge, Infrastructure (SRE)
OrangeStar added a comment to T10857: Phase out LDAP as an authentication backend in favor of OIDC.

With self-hosted mail gone for good here, I'd say this has become a lot more viable.

Jan 31 2024, 19:35 · Infrastructure (SRE)
Universal_Omega changed the status of T10642: Self-host the CVT feed bot from Open to In progress.
Jan 31 2024, 01:03 · Monitoring, Infrastructure (SRE)
Universal_Omega moved T10642: Self-host the CVT feed bot from Incoming to Short Term on the Infrastructure (SRE) board.
Jan 31 2024, 01:03 · Monitoring, Infrastructure (SRE)
Universal_Omega edited projects for T10642: Self-host the CVT feed bot, added: Infrastructure (SRE), Monitoring; removed MediaWiki, MediaWiki (SRE).
Jan 31 2024, 01:03 · Monitoring, Infrastructure (SRE)
Universal_Omega moved T11680: Create Miraheze/python-functions github repo & python package from Incoming to Short Term on the Infrastructure (SRE) board.
Jan 31 2024, 01:02 · Infrastructure (SRE), SRE Automation