cp9 extremely latent for a number of users
Open, High, Public

Description

cp9 keeps on giving Error 500s for no reason, out of the blue.

Wiki-Bot has had a number of timeouts for requests taking over 18 seconds to the API. (Struck: revi, see comments) Various other users are reporting slowness.

Related Objects

Mentioned Here
T6471: Wiki-Bot

Event Timeline

SCVSlalom created this task.
RhinosF1 renamed this task from cp9 Database Issues to cp9 extremely latent for a number of users on Discord. Wed, Nov 11, 21:16
RhinosF1 updated the task description.

This error has been fairly common on cp9 for me over the past few days...

Error 503 Backend fetch failed, forwarded for REDACTED, 127.0.0.1
(Varnish XID 188680355) via cp9.miraheze.org at Tue, 10 Nov 2020 00:22:54 GMT.

If I have anything else to add, I will. @Universal_Omega, over to you.

@SCVSlalom Just a general friendly tip, a best practice is to let Herald add subscribers automatically. There are some exceptions to this, when a certain system administrator's advice or review would be helpful, but just in general, let Herald take care of that.

Ok.

Reception123 renamed this task from cp9 extremely latent for a number of users on Discord to cp9 extremely latent for a number of users. Sat, Nov 14, 14:17



Wiki-bot is not going through CP9.

Zppix edited subscribers, added: Reception123; removed: Zppix.

(Ugh, I hate Phab on mobile)

@revi I'll create a new task for Wiki-Bot. See T6471

The task has been closed as Wiki Bot isn't operated by Miraheze.

See T6471, as the dev is saying the issue is on Miraheze's end, even though Wiki-Bot works off of mw6-7 and cp6-7.

Another cp covering the Americas may be justified, looking at Grafana's 7-day rolling averages for requests/s:

cp3 - 30.9 (peak of 67)
cp6 - 23.7 (peak of 40)
cp7 - 25.7 (peak of 50)
cp9 - 68.9 (peak of 133)
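
To put those figures side by side, here is a rough Python sketch of the arithmetic. The numbers are the ones quoted above; the helper names (`traffic_share`, `split_evenly`) are made up for illustration, and the even split between cp9 and a hypothetical second Americas proxy is only an assumption, since real traffic would follow DNS and geography rather than divide neatly.

```python
# 7-day rolling average and peak requests/s per cache proxy, as quoted above.
rates = {
    "cp3": (30.9, 67),
    "cp6": (23.7, 40),
    "cp7": (25.7, 50),
    "cp9": (68.9, 133),
}


def traffic_share(rates):
    """Return each proxy's fraction of the summed average requests/s."""
    total = sum(avg for avg, _peak in rates.values())
    return {name: avg / total for name, (avg, _peak) in rates.items()}


def split_evenly(rates, name, extra_proxies=1):
    """Average and peak per proxy if `name`'s load were shared evenly.

    ASSUMPTION: load divides equally across the original proxy and the
    extra ones; this is an idealisation, not a prediction.
    """
    avg, peak = rates[name]
    n = extra_proxies + 1
    return avg / n, peak / n


shares = traffic_share(rates)
for name, share in shares.items():
    print(f"{name}: {share:.1%} of average requests/s")

avg_each, peak_each = split_evenly(rates, "cp9")
print(f"cp9 split across two proxies: avg {avg_each:.2f}, peak {peak_each:.1f}")
```

On these numbers cp9 carries a little under half of the combined average, more than any two of the other proxies together.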

In T6431#127592, @John wrote:

Another cp covering the Americas may be justified, looking at Grafana's 7-day rolling averages for requests/s:

cp3 - 30.9 (peak of 67)
cp6 - 23.7 (peak of 40)
cp7 - 25.7 (peak of 50)
cp9 - 68.9 (peak of 133)

Thanks, @John. Do you know what the approximate cost per month, on average, for adding an additional cache proxy is, and do we have excess capacity on other cache proxies (i.e., in Europe) where we could take the least used cache proxy in Europe out of service and replace it with an additional cache proxy in North America?

Thanks,
Doug

Do you know what the approximate cost per month, on average, for adding an additional cache proxy is

It depends where it is, but another Americas proxy would probably cost around £5/mo.

and do we have excess capacity on other cache proxies (i.e., in Europe) where we could take the least used cache proxy in Europe out of service and replace it with an additional cache proxy in North America?

On paper, this would just move the problem from the Americas over to Europe.

@John Yeah, admittedly, I haven't looked at the traffic statistics on the European cache proxies, so perhaps I underestimated the amount of traffic we still receive from Europe. £5/mo is reasonable. We do likely need to hold another fundraiser early in the new year, to ensure our expenses do not outpace our revenues, but we do seem to have a reasonable enough cushion in our current account to justify the expense. As well, no doubt North American users will see improved page load times which, in turn, can help to act as a catalyst for more donations.

Southparkfan added a subscriber: Southparkfan.

I find no abnormal metrics regarding cp9, nor do I have any other evidence that cp9 is constrained by its available resources. Running mtr towards @Dmehus' IP shows some terrible latencies at multiple hops. Since @John has lots of expertise with networking, I'd be interested to know whether he can narrow this down to OVH network issues.