importDump.php stopped assigning imported edits to existing users
Closed, Declined | Public

Description

Imports requested on Phabricator are done with the importDump.php script...
The script assigns edits to local users by default where they exist.

Alas, now due to some bug with the use of --username-prefix, that no longer works.

  1. Create a user, "Wolfsburg".
  2. Then import an XML containing contributions with that same exact user name.
  3. You will observe that they are assigned to the non-user "Imported>Wolfsburg".

The only way to get them assigned to existing users is to use Special:Import and check the checkbox.

Event Timeline

Jidanni triaged this task as Normal priority. Jan 3 2020, 14:06
Jidanni created this task.
PS C:\plavormind\web\public\wiki\mediawiki\maintenance> C:/plavormind/php-ts/php.exe importDump.php --help --wiki "exit"

This script reads pages from an XML file as produced from Special:Export or
dumpBackup.php, and saves them into the current wiki.

Compressed XML files may be read directly:
  .gz ok
  .bz2 (disabled; requires PHP bzip2 module)
  .7z (if 7za executable is in PATH)

Note that for very large data sets, importDump.php may be slow; there are
alternate methods which can be much faster for full site restoration:
<https://www.mediawiki.org/wiki/Manual:Importing_XML_dumps>

Usage: php importDump.php [--conf|--dbgroupdefault|--dbpass|--dbuser|--debug|--dry-run|--globals|--help|--image-base-path|--memory-limit|--mwdebug|--namespaces|--no-local-users|--no-updates|--profiler|--quiet|--report|--rootpage|--server|--skip-to|--uploads|--username-prefix|--wiki] [file]

Generic maintenance parameters:
    --help (-h): Display this help message
    --quiet (-q): Whether to suppress non-error output
    --conf: Location of LocalSettings.php, if not default
    --wiki: For specifying the wiki ID
    --globals: Output globals at the end of processing for debugging
    --memory-limit: Set a specific memory limit for the script, "max"
        for no limit or "default" to avoid changing it
    --server: The protocol and server name to use in URLs, e.g.
        http://en.wikipedia.org. This is sometimes necessary because server name
        detection may fail in command line scripts.
    --profiler: Profiler output format (usually "text")
    --mwdebug: Enable built-in MediaWiki development settings

Script dependant parameters:
    --dbuser: The DB user to use for this script
    --dbpass: The password to use for this script
    --dbgroupdefault: The default DB group to use.

Script specific parameters:
    --debug: Output extra verbose debug information
    --dry-run: Parse dump without actually importing pages
    --image-base-path: Import files from a specified path
    --namespaces: Import only the pages from namespaces belonging to the
        list of pipe-separated namespace names or namespace indexes
    --no-local-users: Treat all usernames as interwiki. The default is
        to assign edits to local users where they exist.
    --no-updates: Disable link table updates. Is faster but leaves the
        wiki in an inconsistent state
    --report: Report position and speed after every n pages processed
    --rootpage: Pages will be imported as subpages of the specified page
    --skip-to: Start from nth page by skipping first n-1 pages
    --uploads: Process file upload data if included (experimental)
    --username-prefix: Prefix for interwiki usernames

Arguments:
    [file]: Dump file to import [else use stdin]

PS C:\plavormind\web\public\wiki\mediawiki\maintenance>

This is strange. The help output of importDump.php says "--no-local-users: Treat all usernames as interwiki. The default is to assign edits to local users where they exist." So the script should assign edits to local users where they exist unless the --no-local-users parameter is passed. I don't think the behavior described here is expected.
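To make the expectation concrete, here is a minimal sketch of the attribution rule as the help text describes it. It is not MediaWiki's actual implementation: the function name, its parameters, and the lowercase default prefix are assumptions made for illustration. Only the "prefix>name" form and the user name "Wolfsburg" come from the report.

```python
def resolve_import_username(name, local_users, prefix="imported", no_local_users=False):
    """Return the name an imported edit should be attributed to.

    Hypothetical model of the documented rule, not MediaWiki code:
    - by default, a name that matches an existing local user is kept as-is;
    - otherwise (or with no_local_users=True) it is treated as an
      interwiki name and written as "<prefix>><name>".
    """
    if not no_local_users and name in local_users:
        return name
    return f"{prefix}>{name}"


local_users = {"Wolfsburg"}

# Documented default: the edit goes to the existing local user.
expected = resolve_import_username("Wolfsburg", local_users, prefix="Imported")

# What the report observes matches the --no-local-users branch instead.
forced_interwiki = resolve_import_username(
    "Wolfsburg", local_users, prefix="Imported", no_local_users=True
)

print(expected)
print(forced_interwiki)
```

Under this model, the reported result "Imported>Wolfsburg" is what one would expect only if --no-local-users had been passed, which is why the behavior looks like a bug.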

Yeah, I was about to say what the comment above said. But I'm currently trying to fix up our script so that it assigns imported edits to users.

John subscribed.

The purpose of this task isn't appropriate for Phabricator.

Jidanni renamed this task from "Warn that imports done by staff will never get assigned to existing users" to "ImportDump.php stopped assigning imported edits to existing users". Jan 4 2020, 01:18
Jidanni reopened this task as Open.
Jidanni updated the task description.

OK, I fixed the bug description.

PlavorSeol renamed this task from "ImportDump.php stopped assigning imported edits to existing users" to "importDump.php stopped assigning imported edits to existing users". Jan 4 2020, 11:54
Reception123 claimed this task.
Reception123 edited projects, added Upstream; removed Import.

Related: https://phabricator.wikimedia.org/T206683, and the description of the option that "Adds a prefix to usernames": "Due to this bug it may be necessary to specify --user-prefix="" when importing files."

Unfortunately this seems to be an upstream issue, and there is nothing we can do on the Miraheze side of things (except maybe try that --user-prefix="" thing).
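For anyone who wants to try the workaround, a rough sketch of building the command line with an explicit empty prefix follows. Several things here are assumptions: the flag is spelled --username-prefix as in the help output pasted earlier (the quoted note writes --user-prefix), the helper name, the script path, and the dump file name are hypothetical, and this task does not confirm that the flag restores assignment to local users.

```python
import shlex


def build_import_command(dump_file, username_prefix="",
                         php="php", script="maintenance/importDump.php"):
    """Build an argv list for importDump.php with an explicit prefix flag.

    Hypothetical helper: the flag spelling follows the --help output in
    this task; the paths and file name are placeholders.
    """
    return [php, script, f"--username-prefix={username_prefix}", dump_file]


command = build_import_command("dump.xml")

# Print a shell-quoted version for inspection before running anything.
print(shlex.join(command))
```

The list form can be handed to subprocess.run(command, check=True) once the paths are adjusted for the actual installation.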