
provide db list with wiki names and custom domain names
Closed, Resolved (Public)

Description

You already provide https://meta.miraheze.org/dblist/all.dblist,
which is cool, and I have imported Miraheze wikis from there in the past.
What it does not give me is the custom domain names for wikis with custom domains.

And since we know that API URLs don't work on the foo.miraheze.org URLs for those (https://phabricator.wikimedia.org/T146712),
I would need to add special code to solve that ticket, and in addition I would have to manually enter all the custom domain names
and keep them maintained. I can do that once, but I already know it will be outdated again quickly.

So to really solve that I would need a list I can download like the one above, but in a format like:

allthetropes|allthetropes.org
meta
fnord
..

where "meta" and "fnord" would be "regular" wikis, and a wiki with a custom domain lists the domain name in addition.

That way I could parse the list and keep importing wikis automatically.
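A minimal sketch of how such a list could be parsed, assuming one entry per line with an optional `|`-separated custom domain as proposed above. The function name `parse_wiki_list` is hypothetical, and the fallback host `<name>.miraheze.org` is an assumption based on the foo.miraheze.org pattern mentioned in this ticket.

```python
def parse_wiki_list(text):
    """Map each wiki name to the host it should be queried on.

    Lines look like 'name' or 'name|custom.domain' (format proposed in this
    ticket). Wikis without a custom domain fall back to
    '<name>.miraheze.org', which is an assumption, not a confirmed rule.
    """
    wikis = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # partition() returns an empty domain when there is no '|'
        name, _, domain = line.partition("|")
        name = name.strip()
        domain = domain.strip()
        wikis[name] = domain or f"{name}.miraheze.org"
    return wikis


sample = "allthetropes|allthetropes.org\nmeta\nfnord\n"
wikis = parse_wiki_list(sample)
```

Keeping the result as a name-to-host mapping means the importer can use the same lookup for regular and custom-domain wikis.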

What I also don't have is a list of _deleted_ wikis; we handled that manually in the same ticket linked above. So either there would be a list of deleted wikis, or the solution would be to always wipe the full table, import the wikis from scratch and update them all.
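A third option, sketched here only as an illustration, would be to compare the names already stored on the importing side with the names in the freshly downloaded list, so that deletions fall out of a set difference without wiping the table. The function name `diff_wiki_sets` and the sample names are hypothetical.

```python
def diff_wiki_sets(known, current):
    """Compare previously imported wiki names with the current list.

    Returns the names that are new and the names that have disappeared
    (presumably deleted or no longer listed).
    """
    known, current = set(known), set(current)
    return {
        "added": sorted(current - known),
        "deleted": sorted(known - current),
    }


changes = diff_wiki_sets(
    known=["meta", "fnord", "oldwiki"],
    current=["meta", "fnord", "allthetropes"],
)
```

A wiki that is missing from the list is not necessarily deleted (it might only have been filtered out), so the "deleted" set is best treated as a list of candidates.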

Event Timeline

John added a subscriber: John.

Done some standardisation work for this. Custom domains will be handled via a Puppet template and all wikis will be generated through a cron (interval still to be decided).

Thanks, that looks just like what I wanted. Now I can get into automating this on the wikistats side.

I'll need one more change, please. The lists also contain the private wikis, which are not accessible. Is it possible to filter them out and show just the public ones? This applies to both lists, but it matters more for custom.txt.

Example of what I get when trying to talk to the API of a private one:

    [code] => readapidenied
    [info] => You need read permission to use this module
    [*] => See https://wikicanada.miraheze.org/w/api.php for API usage
)

Or, alternatively, you could say "all wikis should allow read permission for the statistics module" and let me get the numbers of users/admins/pages/etc. for all wikis even if the content itself is private. If that is possible, I would just add all of the wikis to my stats and ignore whether the content is public or private.

Dunno, what do you think?
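Until one of these is decided, the importer could at least recognise the `readapidenied` error shown above and skip such wikis. The sketch below assumes the API is queried with `format=json`, where MediaWiki reports errors under a top-level `error` object with `code` and `info` fields; the helper name `is_read_denied` is hypothetical.

```python
import json


def is_read_denied(response_body):
    """Return True if a MediaWiki API JSON response is a readapidenied error."""
    try:
        data = json.loads(response_body)
    except ValueError:
        return False  # not JSON at all, so not a recognisable API error
    if not isinstance(data, dict):
        return False
    error = data.get("error")
    return isinstance(error, dict) and error.get("code") == "readapidenied"


denied = json.dumps({
    "error": {
        "code": "readapidenied",
        "info": "You need read permission to use this module",
    }
})
ok = json.dumps({"query": {"statistics": {"pages": 10}}})
```

Here `is_read_denied(denied)` is true and `is_read_denied(ok)` is false, so private wikis can be skipped without treating them as broken.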

There are a few different approaches possible here:

  1. we try to distinguish private and public wikis (currently there's no real way of doing this unless we diff the custom domains against our list of private wikis)
  2. we investigate whether MediaWiki has a built-in way of allowing certain API modules to be whitelisted, like the way we whitelist pages such as Special:Login, CreateAccount, Main Page etc.
  3. we reuse an old hack we did with Parsoid, where we allow a single static private IP to have the "read" permission globally.
  4. some bot or OAuth magic where a single account "wikistats" can have the read permission globally.

If possible, 2 would be the best solution for all.
1 would suit you more than us and 4 would suit us more than you.
3 would also suit you, but there's a risk of the IP being re-allocated, shared and so on. I imagine wikistats doesn't have its own floating public IP, so this would definitely be a case of proxying and sharing, correct?
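For reference, approach 1 boils down to removing every name on the private list from the parsed wiki list. A rough sketch, assuming the entries are held as a name-to-domain mapping and the private wikis are available as a plain list of names; `public_only` and the sample data are hypothetical.

```python
def public_only(entries, private_names):
    """Drop every wiki whose name appears in the private list."""
    private = set(private_names)  # set lookup keeps the filter linear
    return {
        name: domain
        for name, domain in entries.items()
        if name not in private
    }


entries = {
    "allthetropes": "allthetropes.org",
    "secretwiki": "secret.example.org",
    "meta": "meta.miraheze.org",
}
public = public_only(entries, ["secretwiki"])
```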

After looking into it more - I've changed it to exclude all wikis listed as private.

Hopefully this should be correct when the next run occurs at midnight.