cp9 extremely latent for a number of users
Closed, ResolvedPublic

Assigned To

Authored By

	• SCVSlalom
	Nov 11 2020, 21:09

Description

cp9 keeps on giving Error 500s for no reason, out of the blue.

~~Wiki-Bot has had a number of timeouts for requests taking over 18 seconds to the API.~~ (Struck: revi, see comments) Various other users are reporting slowness.

Related Objects

Mentioned Here: T6471: section.execute-GitInfo->getHeadCommitDate is very slow

Event Timeline

• SCVSlalom triaged this task as High priority. Nov 11 2020, 21:09

• SCVSlalom created this task.

Herald added subscribers: RhinosF1, Reception123. Nov 11 2020, 21:09

• SCVSlalom added subscribers: Lakelimbo, Unknown Object (User). Nov 11 2020, 21:10

RhinosF1 renamed this task from cp9 Database Issues to cp9 extremely latent for a number of users on Discord. Nov 11 2020, 21:16

RhinosF1 updated the task description.

• SCVSlalom added subscribers: NDKilla, Paladox, Zppix. Nov 11 2020, 21:19

This error has been fairly common for me on cp9 over the past few days:

Error 503 Backend fetch failed, forwarded for REDACTED, 127.0.0.1
(Varnish XID 188680355) via cp9.miraheze.org at Tue, 10 Nov 2020 00:22:54 GMT.

If I have anything else to add, I will. @Universal_Omega, over to you.

@SCVSlalom Just a general friendly tip, a best practice is to let Herald add subscribers automatically. There are some exceptions to this, when a certain system administrator's advice or review would be helpful, but just in general, let Herald take care of that.

In T6431#126026, @Dmehus wrote:

@SCVSlalom Just a general friendly tip, a best practice is to let Herald add subscribers automatically. There are some exceptions to this, when a certain system administrator's advice or review would be helpful, but just in general, let Herald take care of that.

Ok.

BrandonWM edited subscribers, added: BrandonWM; removed: • SCVSlalom. Nov 12 2020, 20:30

Reception123 renamed this task from cp9 extremely latent for a number of users on Discord to cp9 extremely latent for a number of users. Nov 14 2020, 14:17

BrandonWM updated the task description. Nov 19 2020, 19:31

Screen Shot 2020-11-19 at 11.32.18 AM.png (622×610 px, 92 KB)

Tested today: 2020-11-19 at 11:31 AM PST

Redmin subscribed. Nov 19 2020, 20:01

Wiki-bot is not going through CP9.

revi updated the task description. Nov 19 2020, 20:04

(Ugh i hate phab on mobile)

@revi I'll create a new task for Wiki-Bot. See T6471

The task has been closed as Wiki Bot isn't operated by Miraheze.

In T6431#126885, @Reception123 wrote:

The task has been closed as Wiki Bot isn't operated by Miraheze.

See T6471, as the dev is saying the issue is on Miraheze's end, even though Wiki-Bot works off of mw6-7 and cp6-7.

Another cp covering the Americas may be justified, looking at Grafana's 7-day rolling averages for requests/s:

cp3 - 30.9 (peak of 67)
cp6 - 23.7 (peak of 40)
cp7 - 25.7 (peak of 50)
cp9 - 68.9 (peak of 133)
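
To put the figures above in proportion, here is a minimal Python sketch that computes each proxy's share of the combined average load. The numbers are copied from the comment; the even 50/50 split between cp9 and a new proxy is purely a hypothetical assumption for illustration, not something measured or proposed in Grafana.

```python
# 7-day rolling average and peak requests/s per cache proxy,
# copied from the Grafana figures quoted in the comment above.
proxies = {
    "cp3": (30.9, 67),
    "cp6": (23.7, 40),
    "cp7": (25.7, 50),
    "cp9": (68.9, 133),
}

# Combined average load across all four proxies.
total_avg = sum(avg for avg, _peak in proxies.values())

# Fraction of the combined average handled by each proxy.
shares = {name: avg / total_avg for name, (avg, _peak) in proxies.items()}

for name, (avg, peak) in proxies.items():
    print(f"{name}: {avg:5.1f} req/s avg, peak {peak:3d}, {shares[name]:.1%} of total")

# ASSUMPTION (hypothetical): a second Americas proxy takes exactly half
# of cp9's current traffic. Real traffic would not split this cleanly.
cp9_after_split = proxies["cp9"][0] / 2
print(f"cp9 after an even split: {cp9_after_split:.2f} req/s avg")
```

On these numbers cp9 handles roughly 46% of the combined average, and an even split would leave it at about 34 req/s, which is still above each of the other three proxies.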

In T6431#127592, @John wrote:

Another cp covering the Americas may be justified, looking at Grafana's 7-day rolling averages for requests/s:

cp3 - 30.9 (peak of 67)
cp6 - 23.7 (peak of 40)
cp7 - 25.7 (peak of 50)
cp9 - 68.9 (peak of 133)

Thanks, @John. Do you know what the approximate cost per month, on average, for adding an additional cache proxy is, and do we have excess capacity on other cache proxies (i.e., in Europe) where we could take the least used cache proxy in Europe out of service and replace it with an additional cache proxy in North America?

Thanks,
Doug

In T6431#127594, @Dmehus wrote:

Do you know what the approximate cost per month, on average, for adding an additional cache proxy is

It depends where it is, but another Americas proxy would probably cost around £5/mo.

and do we have excess capacity on other cache proxies (i.e., in Europe) where we could take the least used cache proxy in Europe out of service and replace it with an additional cache proxy in North America?

On paper, this would transfer the problem with Americas over to Europe.

@John Yeah, admittedly, I haven't looked at the traffic statistics on the European cache proxies, so perhaps I underestimated the amount of traffic we still receive from Europe. £5/mo is reasonable. We do likely need to hold another fundraiser early in the new year, to ensure our expenses do not outpace our revenues, but we do seem to have a reasonable enough cushion in our current account to justify the expense. As well, no doubt North American users will see improved page load times which, in turn, can help to act as a catalyst for more donations.

Assigning to SPF to review

I find no abnormal metrics regarding cp9, nor do I have any other evidence saying cp9 is constrained by the available resources. Running mtr towards @Dmehus' IP shows some terrible latencies at multiple hops. Since @John has lots of expertise with networking, I'm interested if he could narrow this down to OVH network issues.

Dmehus awarded a token. Nov 28 2020, 22:35

Is this still an issue?

In T6431#129700, @Southparkfan wrote:

Is this still an issue?

As far as I've noticed recently, yes. However, maybe this should not be a high priority task, as it has no current solution and is not urgent at the moment. If it continues to persist, though, it may soon become urgent.

Filed a ticket with OVH.

@John: Any progress on ticket?

Confirmed the problem no longer exists, OVH migrated the VPS to another physical node on a different network port.

In T6431#132121, @John wrote:

Confirmed the problem no longer exists, OVH migrated the VPS to another physical node on a different network port.

Ah, that makes sense. Thanks for the follow up.

