
importing the pages_full.xml file for AnimeBaths.Fandom (and perhaps images later?)
Closed, Resolved (Public)

Description

Hello, we are trying to rebuild a Wikia/Fandom project that was lost in October 2018 due to restructuring. The .7z file is no longer on Fandom, but Watermaiden15 and I both downloaded copies and can give them to staff here. I tried to import them but the import timed out, I think because of the large file size.

animebaths_pages_full.xml.7z is 8.5 MiB compressed, but decompressed it is 291.9 MiB, which I think exceeds the 250 MB upload allowance for non-Phabricator users.
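
As a rough check of that size comparison, here is a minimal Python sketch. It assumes the 250 MB allowance is counted in decimal megabytes, which is a guess; the function names are only illustrative. The decompressed file is over the limit under either reading of the unit.

```python
# Rough size check. Assumption: the upload limit is 250 decimal megabytes.
MIB = 1024 ** 2  # bytes in one mebibyte
MB = 1000 ** 2   # bytes in one megabyte


def mib_to_mb(size_mib: float) -> float:
    """Convert a size in MiB to decimal MB."""
    return size_mib * MIB / MB


def exceeds_limit(size_mib: float, limit_mb: float = 250.0) -> bool:
    """Return True if a file of size_mib MiB is larger than limit_mb MB."""
    return mib_to_mb(size_mib) > limit_mb


decompressed_mib = 291.9  # size of the decompressed XML dump, from above
print(f"{mib_to_mb(decompressed_mib):.1f} MB")
print("over limit:", exceeds_limit(decompressed_mib))
```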

Last autumn I also got help from a good samaritan who ran a script to download the majority of the images and put them in a Mega.nz account. It's about 30 gigabytes. If I gave Phabricator staff the link, would they know how to automate uploading those files to Miraheze? I believe the XML only has the text history of file uploads, not the actual images themselves.

Event Timeline

Tycio triaged this task as Low priority. Apr 17 2019, 04:06
Tycio created this task.
Paladox added a subscriber: Paladox. May 4 2019, 15:47

Hello,

Sorry for the late reply. We can certainly import your XML dump into your wiki (you can use Google Drive, Dropbox or any other cloud provider to store it). For files, we have around 70 GB left, so it may not be practical to upload all of them at the moment.

Tycio added a comment. May 9 2019, 04:10

I had set up a Cloud Drive on Mega.nz back in November for this purpose if that is okay?

https://mega.nz/#!rv5CRACY!1cYWFmHnN84FIvzCVKXWjxn2W78Fc4ksuap0AtFxFFw is the link for animebaths_pages_full.xml.7z which is 8.5 MB compressed.

This would be the most important first step to restore the text source of our articles.

Whenever that is done, step 2 would be the images. https://mega.nz/#!niAHXYyJ!ayYpoBkrRSDYNJSc2C8-LE_JdLIY4-pDRAcUiTYdt0E has animebathswiki_images.tar.gz, which is 14.30 GB. That is a pretty massive undertaking, so I can understand why it may have to be put off for a long while if all of Miraheze only has 70 GB left to work with.

With the infrastructure back, we can begin to build articles on new series while waiting for the old ones to eventually be restored, but it will be good to have at least the text, which noted all the issue and episode numbers and information.

Paladox added a comment. May 18 2019, 15:14

@Tycio really sorry for the delay. Your wiki is https://animebaths.miraheze.org/wiki/Main_Page, right? (Just asking so I can start importing your XML dump.)

Paladox added a comment. May 18 2019, 15:17

Yup, that's your wiki, will start the import!

Paladox raised the priority of this task from Low to Normal. May 18 2019, 15:36
Paladox removed a project: MediaWiki.
Paladox closed this task as Resolved. May 23 2019, 23:15
Paladox claimed this task.