Cleanup database mess after cloud14 outages
Closed, Resolved, Public

Description

  • Drop databases that don't have a corresponding cw_wikis entry (this should be done first, so that the table cleanup below also covers any wikis left without a database after this step)
  • Remove wikis from cw_wikis, mw_settings, mw_namespaces, mw_permissions, matomo, gnf_files, localuser, localnames (mhglobal/testglobal), and echo_unread_wikis (metawiki/betawiki) that don't have a corresponding database on any cluster
  • Remove swift containers for wikis that have no database/do not exist

Due to some mess with cloud14, this is the case with some wikis.

Additionally, the new maintenance script that I added for DataDump can probably be run to clean up the mess in DataDump: regenerate the data_dump table on wikis, remove duplicate backups, etc.
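
The first two bullets amount to a two-way set difference between the databases that exist on the clusters and the dbnames registered in cw_wikis. A minimal sketch of that comparison follows; the function name and the sample wiki names are hypothetical, and how the two input lists are collected (for example from SHOW DATABASES and from cw_wikis) is left out.

```python
def find_orphans(existing_dbs, cw_wikis_dbnames):
    """Return (databases with no cw_wikis row, cw_wikis rows with no database)."""
    dbs = set(existing_dbs)
    registered = set(cw_wikis_dbnames)
    return sorted(dbs - registered), sorted(registered - dbs)


# Made-up inputs for illustration only; these are not real wikis.
existing_dbs = ["onewiki", "twowiki", "threewiki"]
cw_wikis_dbnames = ["twowiki", "threewiki", "fourwiki"]

to_drop, to_unregister = find_orphans(existing_dbs, cw_wikis_dbnames)
print("Databases to drop:", to_drop)
print("cw_wikis entries to remove:", to_unregister)
```

Running the comparison again after the databases are dropped would show which registry entries still point at nothing, which matches the ordering suggested in the first bullet.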

Event Timeline

Unknown Object (User) triaged this task as High priority. Jan 31 2023, 20:39
Unknown Object (User) created this task.
Unknown Object (User) updated the task description. Feb 1 2023, 22:04

I'm not fully sure why this would be high priority, is it affecting something?

For databases that don't have a corresponding cw_wikis entry, we might as well do that for all clouds then. Also, for wikis that don't have a corresponding database, could they not be removed via the regular eval way? (I ask because your listing of tables makes it seem like it all has to be done manually.)

Unknown Object (User) added a comment. Feb 2 2023, 06:34

Maybe not necessarily high priority, but the cleanup should be of somewhat higher priority, since the mess also causes a whole bunch of warnings, etc. when running scripts based on extensions, and we have quite the mess now.

If they don't have a database, or if they do but not a cw_wikis entry, I am unsure if the normal eval way works or not.

Wikis that only have cw_wikis entries (no DB):

contestofnoveltracksexcludingserverteapartywiki
fictionandfictionalsettingswikiwiki
floriantheoryshakespeareauthorshipwiki
matteomosciatticinamaticmultiversewiki
pinkelephantswithheffalumpsandwoozleswiki
tunelesssingingmonstersandislandswiki

Wikis that only have DB entries (no cw_wikis):

closinglogosgroupwiki
contestofnoveltracksexcludingserverteapartywiki
countersidewiki
countryballwiki
creativeenergywiki
crimsonconsortiumwiki
criticalopswiki
crtlistwiki
culminationinfowiki
cyawwiki
cydtriwiki
davidslistwiki
ddlcmodsarchivalwiki
debluemcwiki
delishwiki
denepiskaservernwiki
derekwiki
deutschingabzwiki
dialoguewiki
diamodocswiki
dictrollnarywiki
diegoledezmawiki
digitalidentitywiki
discuswiki
dmcharliewiki
dodekairamenwiki
dragonsnestwiki
dreadfulrestaurantsandfoodwikiwiki
drovawiki
dynguwiki
easalertswiki
eascwiki
ecsuswiki
edenwiki
edfundingwiki
edtechbiowiki
edworldowiki
einsofwiki
emgpwiki
encyclobudddywiki
encyclopediawiki
encyclotainmentwiki
eninternetpediawiki
enmarchewiki
ensembleshipswiki
equilibriumworldwiki
escunitedwvwiki
estelhardawiki
estypemoonwiki
evabfwiki
everythingwindowswiki
evezhianswiki
excellenthardwarewiki
expatsnlwiki
experimycowiki
f1rchwiki
fakemediacollabwikiwiki
fancontestcommunitywiki
fantastyunitednationswiki
farrlandelectionswiki
feministsindiawiki
fengariwiki
fictionandfictionalsettingswikiwiki
flawlessgamemodswiki
flintforgedeepwiki
floriantheoryshakespeareauthorshipwiki
formula1plwiki
formulasimracingwiki
formulerz8wiki
forstahjalpenwiki
forzawiki
foypalthistwiki
fractureddominionwiki
fridaynightsambinwiki
frontierwiki
futebolwiki
gachatubewiki
gachavisionwiki
gagikajwiki
galactuswiki
gamesoundwiki
gaminggroupwiki
gatherpediawiki
gloriousdawnwiki
godawfulschoolswiki
matteomosciatticinamaticmultiversewiki
pinkelephantswithheffalumpsandwoozleswiki
pjmaskswiki

rumctrollwiki
themovieawiki
tunelesssingingmonstersandislandswiki
vrtwikiwiki
webkinzpictureguidewiki
zoranleaguewiki

Beta ones for separate deletion:

test10cwerrorwikibeta
test11cwerrorwikibeta
test12cwerrorwikibeta
test13cwerrorwikibeta
test14cwerrorwikibeta
test15cwerrorwikibeta
test16cwerrorwikibeta
test2cwerrorwikibeta
test2cwwikibeta
test3cwerrorwikibeta
test3cwwikibeta
test4cwerrorwiki
test4cwerrorwikibeta
test5cwerrorwikibeta
test6cwerrorwikibeta
test7cwerrorwikibeta
test8cwerrorwikibeta
test9cwerrorwikibeta
testcreatebetawiki2wiki
testcreatebetawiki3wikibeta
testcreatebetawiki4wikibeta
testcreatebetawikiwiki
testcreatebetawikiwikibeta
testcwerrorwikibeta
testcwpatchwikibeta
testrwwikibeta
testswift1wikibeta
testswift2wikibeta
testswift5wikibeta
metawikibeta
privtestwikibeta
cwtestfornewmwwikibeta

@Universal_Omega Would you mind double checking the lists to make sure that everything can be deleted?

Unknown Object (User) added a comment. Feb 2 2023, 06:53

Definitely not betawiki and the wikibetas; those are on testglobal. But if we delete from cw_wikis, we'll have to run the same check on the other tables afterwards, for wikis that have entries in mw_settings, etc. but no DB.

Yeah, definitely not betawiki itself but we could probably delete the other test beta wikis. And do you really think there's a chance some wikis even have mw_settings entries but not cw_wikis?

Unknown Object (User) added a comment. Feb 2 2023, 06:58

I'm almost certain there is, but not 100%. I am, however, 100% sure there will be some in matomo and gnf_files, because those two weren't wiped when doing resets, as I forgot about them. I'm pretty sure each of the tables could have some, and that they differ from table to table, due to issues with resets, renames, etc.

438 entries for mw_settings don't have a corresponding cw_wikis entry. I see many of them are from P468 so I'm thinking maybe there's an issue with the deletion script? But the issue is not all wikis in P468 are still in mw_settings so that's confusing.
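
A count like the one above can be produced with an anti-join: a LEFT JOIN from mw_settings to cw_wikis that keeps only the rows with no match. The sketch below runs that query against an in-memory SQLite database so it is self-contained. wiki_dbname is the column queried by the Swift script later in this task; s_dbname is an assumed name for the mw_settings dbname column, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cw_wikis (wiki_dbname TEXT PRIMARY KEY);
    -- s_dbname is an assumed column name, used here for illustration.
    CREATE TABLE mw_settings (s_dbname TEXT);
    INSERT INTO cw_wikis VALUES ('onewiki'), ('twowiki');
    INSERT INTO mw_settings VALUES
        ('onewiki'), ('twowiki'), ('goneawiki'), ('gonebwiki');
""")

# Anti-join: keep mw_settings rows that have no matching cw_wikis row.
ORPHAN_QUERY = """
    SELECT s.s_dbname
    FROM mw_settings AS s
    LEFT JOIN cw_wikis AS c ON c.wiki_dbname = s.s_dbname
    WHERE c.wiki_dbname IS NULL
    ORDER BY s.s_dbname
"""

orphans = [row[0] for row in conn.execute(ORPHAN_QUERY)]
print(f"{len(orphans)} mw_settings entries have no cw_wikis row: {orphans}")
```

The same query shape would apply to the other per-wiki tables by swapping the table and dbname column, which makes it possible to compare the orphan sets table by table.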

Unknown Object (User) added a comment. Feb 2 2023, 08:15

Maybe I didn't quite explain correctly, but I also said that this happened because the deletion script messed up once. It has since been fixed, but that one time it caused massive issues with things not being properly deleted.

Removed duplicates from gnf_files, mw_settings, mw_namespaces, and mw_permissions. In addition, removed some wrong entries that had non-wiki dbnames.

Unknown Object (User) added a comment. Edited Feb 2 2023, 19:56

Created a script to handle the Swift part:

import os
import re
import subprocess


# Get all database names from cw_wikis
db_names = []
result = os.popen("sudo -u www-data php /srv/mediawiki/w/maintenance/sql.php --wiki=metawiki --wikidb=mhglobal --query='SELECT wiki_dbname FROM cw_wikis;' --json | jq -r '.[].wiki_dbname'").read()
for line in result.split("\n"):
    if line:
        db_names.append(line)

# Get all container names (must first run . /etc/swift-env.sh to use this)
containers = []
result = subprocess.check_output(["swift", "list"])
for line in result.decode().split("\n"):
    if line:
        containers.append(line)

# Check if each container corresponds to a database
missing_wikis = ['root', 'betawiki']
missing_containers = []
for container in containers:
    database = re.sub("miraheze-|-(local-[a-z]*|avatars|awards|dumps-backup|timeline-render|score-render|createwiki-persistent-model)", "", container)
    if database in db_names:
        continue
    elif database not in missing_wikis:
        # Remove duplicates
        missing_wikis.append(database)
        print(f"{container} does not correspond to any database in cw_wikis (looked for {database})")

    if database not in ('betawiki', 'root'):
        missing_containers.append(container)
        subprocess.run(["swift", "delete", container])

print(f"\nContainers deleted: {len(missing_containers)}")

It will also delete all sharded and non-wiki containers that exist; these are completely unused and were accidentally created a little while ago.

Running without delete:

.... a lot of containers ....
Containers found: 191
Unique wikis found (sharded containers will count as a unique wiki): 162

And excluding sharded containers:

miraheze-jumbods65wiki-dumps-backup does not correspond to any database in cw_wikis (looked for jumbods65wiki)
miraheze-tempimportwiki-local-deleted does not correspond to any database in cw_wikis (looked for tempimportwiki)
miraheze-test2cwwikibeta-local-deleted does not correspond to any database in cw_wikis (looked for test2cwwikibeta)
miraheze-test3cwwikibeta-local-deleted does not correspond to any database in cw_wikis (looked for test3cwwikibeta)
miraheze-testrwwikibeta-local-deleted does not correspond to any database in cw_wikis (looked for testrwwikibeta)
miraheze-testswift2wikibeta-avatars does not correspond to any database in cw_wikis (looked for testswift2wikibeta)
miraheze-testswift5wikibeta-dumps-backup does not correspond to any database in cw_wikis (looked for testswift5wikibeta)
theredpionnerwiki_images does not correspond to any database in cw_wikis (looked for theredpionnerwiki_images)

Containers found: 33
Unique wikis found: 8

It will also delete the beta wikis' containers (except those of betawiki itself, which has been excluded), as those wikis will be deleted anyway, so it should be fine to delete their containers.
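
The container-to-database mapping in the script above can be pulled out into a pure function, so that a dry run builds the deletion plan without calling swift at all. This is only a sketch under assumptions: the function names are hypothetical, the regex is an anchored variant of the one in the script (so names that do not match the miraheze-<dbname>-<suffix> pattern, including sharded ones, are returned unchanged instead of being partially stripped), and two of the sample containers are made up.

```python
import re

# Suffixes taken from the regex in the script above.
SUFFIXES = (
    "local-[a-z]*", "avatars", "awards", "dumps-backup",
    "timeline-render", "score-render", "createwiki-persistent-model",
)
CONTAINER_RE = re.compile(r"^miraheze-(?P<db>.+?)-(?:%s)$" % "|".join(SUFFIXES))

# Never scheduled for deletion, mirroring the exclusions in the script.
PROTECTED = {"root", "betawiki"}


def container_to_dbname(container):
    """Extract the dbname, or return the name unchanged if it doesn't match."""
    match = CONTAINER_RE.match(container)
    return match.group("db") if match else container


def plan_deletions(containers, db_names, protected=PROTECTED):
    """Return the containers a real run would delete, without deleting anything."""
    known = set(db_names)
    return [
        c for c in containers
        if container_to_dbname(c) not in known
        and container_to_dbname(c) not in protected
    ]


containers = [
    "miraheze-jumbods65wiki-dumps-backup",    # from the output above
    "miraheze-tempimportwiki-local-deleted",  # from the output above
    "miraheze-examplewiki-avatars",           # hypothetical, has a cw_wikis row
    "miraheze-betawiki-local-public",         # hypothetical, protected
    "theredpionnerwiki_images",               # from the output above
]
db_names = ["examplewiki"]  # hypothetical cw_wikis contents

plan = plan_deletions(containers, db_names)
for name in plan:
    print("would delete:", name)
```

Reviewing the printed plan before passing it to swift delete keeps the destructive step separate from the matching logic.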

Unknown Object (User) updated the task description. Feb 5 2023, 01:12
Unknown Object (User) closed this task as Resolved. Feb 5 2023, 01:22
Unknown Object (User) claimed this task.
Unknown Object (User) updated the task description.