Migrating mediawiki from mysql to sqlite

From Linuxintro
Revision as of 08:28, 11 September 2013 by imported>ThorstenStaerk (→‎migrate database below your wiki)

Overview

To migrate a mediawiki from MySQL to SQLite there are two basic approaches:

  • start with a new database and worry about
    • dumping all pages but the Main Page with the dumpBackup command and restoring them with the importDump command
    • keeping the articles' revisions
    • copying the Main Page from the old wiki to the new
    • copying the users and privileges from the old wiki to the new
    • copying the wiki statistics (like page visits) from the old wiki to the new
    • copying images and other files from the old wiki to the new
    • re-doing things like mediawiki extensions on the new wiki
    • re-doing your settings e.g. from LocalSettings.php on the new wiki
Matt gives a good overview about this
  • just migrate the database below your wiki

migrate database below your wiki

I migrated the database below my wiki and found it easy - as long as you know what to do and how to react on errors. This shows what to do, the TroubleShooting section shows how to react on errors. I used mediawiki 1.21.2 on SUSE Linux 12.1.

  • First create a backup (aka export aka dump) of the database:
mysqldump wikidb -u wikiuser -p > dump.sql
  • Get this MySQL -> SQLite converter
wget https://gist.github.com/esperlu/943776/raw/dd87f4088f6d5ec7563478f7a28a37ba02cf26e2/mysql2sqlite.sh
  • I started this converter to create a file /srv/www/htdocs/wikidb.sqlite:
sh mysql2sqlite.sh -u wikiuser -p wikidb | sqlite3 /srv/www/htdocs/wikidb.sqlite
  • change LocalSettings.php to reflect
$wgDBtype = "sqlite";
$wgDBserver = "";
$wgDBname = "wikidb";
$wgDBuser = "";
$wgDBpassword = "";
$wgSQLiteDataDir = "/srv/www/htdocs";
  • allow your webserver to write to the file:
chown wwwrun:www /srv/www/htdocs/wikidb.sqlite

Database query syntax error

Symptom
When editing an article article and clicking on "Save page" you get the error message
A database query syntax error has occurred. This may indicate a bug in the software. The last attempted database query was:
   INSERT INTO phase3text (old_id,old_text,old_flags) VALUES (NULL,'= main.cpp = 
[...]
from within function "Revision::insertOn/single-row". Database returned error "19: phase3text.old_id may not be NULL".

The query may or may not be shown, depending on your LocalSettings.php. The following wrong definition is in your database for your text table:

CREATE TABLE "phase3text" (
 "old_id" int(10)  NOT NULL ,

This definition is missing the autoincrement setting for old_id.

Reason
The database transformation did not take over the autoincrement setting.
Solution
Fix the tables text, revision and recentchanges like this:
sqlite> CREATE TABLE text (
   ...>  old_id INTEGER  NOT NULL PRIMARY KEY AUTOINCREMENT,
   ...>  old_text BLOB NOT NULL,
   ...>  old_flags BLOB NOT NULL
   ...>  );
sqlite> insert into text select * from phase3text;
sqlite> drop table phase3text;
sqlite> alter table text rename to phase3text;


sqlite> CREATE TABLE revision (
   ...>  rev_id INTEGER  NOT NULL PRIMARY KEY AUTOINCREMENT,
   ...>  rev_page INTEGER  NOT NULL,
   ...>  rev_text_id INTEGER  NOT NULL,
   ...>  rev_comment BLOB NOT NULL,
   ...>  rev_user INTEGER  NOT NULL default 0,
   ...>  rev_user_text TEXT  NOT NULL default '',
   ...>  rev_timestamp BLOB NOT NULL default '',
   ...>  rev_minor_edit INTEGER  NOT NULL default 0,
   ...>  rev_deleted INTEGER  NOT NULL default 0,
   ...>  rev_len INTEGER ,
   ...>  rev_parent_id INTEGER  default NULL,
   ...>  rev_sha1 BLOB NOT NULL default '',
   ...>  rev_content_model BLOB DEFAULT NULL,
   ...>  rev_content_format BLOB DEFAULT NULL
   ...>  );
sqlite> CREATE INDEX page_timestamp ON revision (rev_page,rev_timestamp);
sqlite> CREATE INDEX page_user_timestamp ON revision (rev_page,rev_user,rev_timestamp);
sqlite> CREATE UNIQUE INDEX rev_page_id ON revision (rev_page, rev_id);
sqlite> CREATE INDEX rev_timestamp ON revision (rev_timestamp);
sqlite> CREATE INDEX user_timestamp ON revision (rev_user,rev_timestamp);
sqlite> CREATE INDEX usertext_timestamp ON revision (rev_user_text,rev_timestamp);
sqlite> insert into revision select * from phase3revision;
sqlite> drop table phase3revision;
sqlite> alter table revision rename to phase3revision;


sqlite> CREATE TABLE recentchanges (
   ...>  rc_id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,                              
   ...>  rc_timestamp BLOB NOT NULL default '',                                         
   ...>  rc_cur_time BLOB NOT NULL default '',
   ...>  rc_user INTEGER  NOT NULL default 0,
   ...>  rc_user_text TEXT  NOT NULL,
   ...>  rc_namespace INTEGER NOT NULL default 0,
   ...>  rc_title TEXT  NOT NULL default '',
   ...>  rc_comment TEXT  NOT NULL default '',
   ...>  rc_minor INTEGER  NOT NULL default 0,
   ...>  rc_bot INTEGER  NOT NULL default 0,
   ...>  rc_new INTEGER  NOT NULL default 0,
   ...>  rc_cur_id INTEGER  NOT NULL default 0,
   ...>  rc_this_oldid INTEGER  NOT NULL default 0,
   ...>  rc_last_oldid INTEGER  NOT NULL default 0,
   ...>  rc_type INTEGER  NOT NULL default 0,
   ...>  rc_patrolled INTEGER  NOT NULL default 0,
   ...>  rc_ip BLOB NOT NULL default '',
   ...>  rc_old_len INTEGER,
   ...>  rc_new_len INTEGER,
   ...>  rc_deleted INTEGER  NOT NULL default 0,
   ...>  rc_logid INTEGER  NOT NULL default 0,
   ...>  rc_log_type BLOB NULL default NULL,
   ...>  rc_log_action BLOB NULL default NULL,
   ...>  rc_params BLOB NULL
   ...>  );
sqlite> CREATE INDEX new_name_timestamp ON recentchanges (rc_new,rc_namespace,rc_timestamp);
sqlite> CREATE INDEX rc_cur_id ON recentchanges (rc_cur_id);
sqlite> CREATE INDEX rc_ip ON recentchanges (rc_ip);
sqlite> CREATE INDEX rc_namespace_title ON recentchanges (rc_namespace, rc_title);
sqlite> CREATE INDEX rc_ns_usertext ON recentchanges (rc_namespace, rc_user_text);
sqlite> CREATE INDEX rc_timestamp ON recentchanges (rc_timestamp);
sqlite> CREATE INDEX rc_user_text ON recentchanges (rc_user_text, rc_timestamp);
sqlite> insert into recentchanges select * from phase3recentchanges;sqlite> drop table phase3recentchanges;
sqlite> alter table recentchanges rename to phase3recentchanges;

TroubleShooting

I got quite a lot of error messages that were easy to fix - provided you know how.

Revision::insertOn/single-row

Symptom: When editing an article article and clicking on "Save page" you get an error message like

A database query syntax error has occurred. This may indicate a bug in the software. The last attempted database query was:
   INSERT INTO phase3text (old_id,old_text,old_flags) VALUES (NULL,'= main.cpp = 
[...]
from within function "Revision::insertOn/single-row". Database returned error "19: phase3text.old_id may not be NULL".

Your error message may not contain the database query depending on your LocalSettings.php.

Reason
The database transformation has not worked perfectly. The "autoincrement" setting for some columns got lost. While your database has a definition
CREATE TABLE "phase3text" (
 "old_id" int(10)  NOT NULL ,

the right one would be

CREATE TABLE text (
 old_id INTEGER  NOT NULL PRIMARY KEY AUTOINCREMENT,

fixing this changes the error message. Also the table revision needs an autoincrement. Then recentchanges. Then it works :)

Solution
For the tables text, revision and recentchanges introduce the right autoincrement:
sqlite> CREATE TABLE recentchanges (
   ...>  rc_id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,                              
   ...>  rc_timestamp BLOB NOT NULL default '',                                         
   ...>  rc_cur_time BLOB NOT NULL default '',
   ...>  rc_user INTEGER  NOT NULL default 0,
   ...>  rc_user_text TEXT  NOT NULL,
   ...>  rc_namespace INTEGER NOT NULL default 0,
   ...>  rc_title TEXT  NOT NULL default '',
   ...>  rc_comment TEXT  NOT NULL default '',
   ...>  rc_minor INTEGER  NOT NULL default 0,
   ...>  rc_bot INTEGER  NOT NULL default 0,
   ...>  rc_new INTEGER  NOT NULL default 0,
   ...>  rc_cur_id INTEGER  NOT NULL default 0,
   ...>  rc_this_oldid INTEGER  NOT NULL default 0,
   ...>  rc_last_oldid INTEGER  NOT NULL default 0,
   ...>  rc_type INTEGER  NOT NULL default 0,
   ...>  rc_patrolled INTEGER  NOT NULL default 0,
   ...>  rc_ip BLOB NOT NULL default '',
   ...>  rc_old_len INTEGER,
   ...>  rc_new_len INTEGER,
   ...>  rc_deleted INTEGER  NOT NULL default 0,
   ...>  rc_logid INTEGER  NOT NULL default 0,
   ...>  rc_log_type BLOB NULL default NULL,
   ...>  rc_log_action BLOB NULL default NULL,
   ...>  rc_params BLOB NULL
   ...>  );
sqlite> CREATE INDEX new_name_timestamp ON recentchanges (rc_new,rc_namespace,rc_timestamp);
sqlite> CREATE INDEX rc_cur_id ON recentchanges (rc_cur_id);
sqlite> CREATE INDEX rc_ip ON recentchanges (rc_ip);
sqlite> CREATE INDEX rc_namespace_title ON recentchanges (rc_namespace, rc_title);
sqlite> CREATE INDEX rc_ns_usertext ON recentchanges (rc_namespace, rc_user_text);
sqlite> CREATE INDEX rc_timestamp ON recentchanges (rc_timestamp);
sqlite> CREATE INDEX rc_user_text ON recentchanges (rc_user_text, rc_timestamp);
sqlite> insert into recentchanges select * from phase3recentchanges;sqlite> drop table phase3recentchanges;
sqlite> alter table recentchanges rename to phase3recentchanges;

PDO exception

Symptom: When surfing to the wiki you get a page displaying this error message:

Unexpected non-MediaWiki exception encountered, of type "PDOException"
exception 'PDOException' with message 'There is no active transaction' in /srv/www/htdocs/mediawiki/includes/db/DatabaseSqlite.php:664
Stack trace:

Solution: In LocalSettings.php set all occurrences of $wgDBuser to "".

$wgDBuser = "";

Is not unique

Symptom: When calling mysql2sqlite.sh you get a lot of error messages like

Error: near line 565: column cat_id is not unique
Error: near line 566: column cat_id is not unique

Solution: You cannot call mysql2sqlite.sh twice to the same target file. Delete the target file before you start the next conversion, like this:

rm test.sqlite
sh /root/mysql2sqlite.sh -u wikiuser -p wikidb | sqlite3 test.sqlite

file is encrypted or not a database

Symptom: When trying to open the sqlite database on the command line you get the error message

Unable to open database "../../data/my_wiki.sqlite": file is encrypted or is not a database

Solution: In my case I used the command sqlite to open the database. sqlite pointed to version 2 of sqlite. When I replaced the command sqlite by sqlite3 it worked:

tweedleburg:/srv/www/htdocs/mediawiki # sqlite ../../data/my_wiki.sqlite 
Unable to open database "../../data/my_wiki.sqlite": file is encrypted or is not a database
tweedleburg:/srv/www/htdocs/mediawiki # sqlite3 ../../data/my_wiki.sqlite 
SQLite version 3.7.12.1 2012-05-22 02:45:53
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite>

See also