
Investigate compressing databases by default
Closed, Resolved, Public


While working on T5877, I realised that if we compressed databases, we could not only store backups over a longer period of time, but also reduce the disk requirement for db* (currently at ~500G each) and increase the speed of backing up databases.

There are potentially other benefits, along with the unfortunate disadvantage that certain extensions, such as ReplaceText, will no longer work.

Event Timeline

John triaged this task as Normal priority. Oct 12 2021, 15:24
John created this task.
Reception123 added a subscriber: Agent_Isai.

@Agent_Isai Assigning this task to you because, as John has mentioned above, there is the disadvantage of ReplaceText and other minor features (which I don't remember by heart) no longer working as a result of this potential change. Some wikis have already been compressed in the past and I recall some users being unhappy because of the ReplaceText issue.

Could you please reach out to users in some form (maybe CN?) and see what people think about this idea and about the disadvantage with ReplaceText?

It is important to note that MassEditRegex has more features than ReplaceText and works with compression.

I will make a thread on the Community noticeboard to inform users about the proposed change and I will also make sure to go around to the wikis which use the extension to inform them about the pros and cons of enabling database compression and get their input on this/discuss possible alternatives to any affected feature (like ReplaceText).

The Community noticeboard thread is live and so far has one support. As for going around the wikis, I will do that later and see what they think about the proposed change and the alternative solution to the affected extension.

In T8157#164325, @John wrote:

It is important to note that MassEditRegex has more features than ReplaceText and works with compression.

It uses the Revision class, so we'll probably find it is a blocker for 1.37.

Reassigning from @Agent_Isai - thanks for initiating and receiving community feedback.

As a result of this feedback, we are going to take a different approach: rather than compress everything by default, once a week we will compress everything except the latest revision of each page. This allows extensions such as ReplaceText to continue to function.
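
To illustrate the idea, here is a minimal sketch of "compress everything except the latest revision". It is not the actual job being run and does not reflect MediaWiki's real schema or maintenance scripts: the `Revision` class and `compress_old_revisions` function are hypothetical names, revisions are plain in-memory objects, and `zlib` stands in for whatever compression is actually used.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Revision:
    # Hypothetical in-memory stand-in for a stored revision.
    page_id: int
    rev_id: int
    text: bytes
    compressed: bool = False


def compress_old_revisions(revisions):
    """Compress every revision except the newest one of each page.

    Returns the number of bytes saved by this run (this can be negative
    for very short texts, where zlib overhead exceeds the gain).
    """
    # Find the highest revision ID per page; that revision stays uncompressed.
    latest = {}
    for rev in revisions:
        if rev.page_id not in latest or rev.rev_id > latest[rev.page_id]:
            latest[rev.page_id] = rev.rev_id

    saved = 0
    for rev in revisions:
        # Skip the latest revision and anything compressed on an earlier run.
        if rev.compressed or rev.rev_id == latest[rev.page_id]:
            continue
        packed = zlib.compress(rev.text)
        saved += len(rev.text) - len(packed)
        rev.text = packed
        rev.compressed = True
    return saved
```

Because the latest revision of each page is left as plain text, anything that searches or rewrites current page text directly still sees it, and repeated weekly runs only touch revisions that have since been superseded.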

I am currently running this to get an understanding of roughly how much space we will gain back.

The text table is the main data user. From a rough comparison, we can look at each server to estimate how much is saved by using database compression:

db11: ~10GB
db12: ~17GB
db13: ~10GB