MediaWiki HSL Migration

From Traxel Wiki
Revision as of 00:52, 31 March 2024 by RobertBushman (talk | contribs) (→‎Handling Files)

2024-03-30 Attempt

Fixing Broken Categories

cd /var/www/mediawiki/maintenance/
php update.php # this didn't do it
php rebuildall.php # this worked

Perl Regex for Page Names

perl -ne 'print "$2\n" while m|<a href="[^"]*"( class="mw-redirect")* title="[^"]*">([^<>]*)</|g' < pages-02.html

Perl Regex for File Links

Sample link: a href="/wiki/File:01mendel-ABS_solid-bed-height-spacer-31m.jpg"
cat files-*.html | perl -ne 'print "$1\n" while m|a href="(/wiki/File:[^"]*)"|g'

Handling Files

  1. Hit the /wiki/File:... page
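  2. grep for "Full resolution"
    • That didn't work for every file: some pages use the file name as the link text instead of "Full resolution". Match on div class="fullMedia" instead. Both variants: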
    • <div class="fullMedia"><a href="/w/images/6/63/3d_Printing_BYOF.jpg" class="internal" title="3d Printing BYOF.jpg">3d_Printing_BYOF.jpg</a>‎ <span class="fileInfo">(482 × 541 pixels, file size: 80 KB, MIME type: image/jpeg)</span>
    • <div class="fullMedia"><a href="/w/images/e/e5/3d_Printing_Donation_Box.jpg" class="internal" title="3d Printing Donation Box.jpg">Full resolution</a>‎ <span class="fileInfo">(1,177 × 753 pixels, file size: 247 KB, MIME type: image/jpeg)</span>
  3. Get the URL from: a href="/w/images/1/1c/zoom-into-rectangle.png" class="internal"
  4. Use the curl commands below to get the files (the second one covers pages without a "Full resolution" link).
  5. Go to the All Pages special page and export the Files category.
    1. Import those pages to the new wiki.
  6. Upload the files, ignoring warnings. (Maybe find a way to bulk this, but it shouldn't be more than an hour of work.)
curl -s 'https://wiki.heatsynclabs.org/wiki/File:zoom-into-rectangle.png' | grep 'Full resolution' | perl -ne 'print "$1\n" while m|a href="(/w/images/[^"]*)" class="internal"|g'
curl -s 'https://wiki.heatsynclabs.org/wiki/File:zoom-into-rectangle.png' | grep 'div class="fullMedia"' | grep -v 'Full resolution' | perl -ne 'print "$1\n" while m|a href="(/w/images/[^"]*)" class="internal"|g'
awk '{print "curl -sO https://wiki.heatsynclabs.org"$1}' < ../file-full-res-paths.txt
awk '{print "echo "$1" && curl -sO https://wiki.heatsynclabs.org"$1}' < ../file-full-res-paths-nofull.txt
bob@lap-2021:~/Documents/hacking/hsl/wiki-migrate$ cat file-paths.txt | ./file-get-commands.pl | sh
bob@lap-2021:~/Documents/hacking/hsl/wiki-migrate$ cat file-get-commands.pl 
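#!/usr/bin/env perl
# Reads /wiki/File:... paths on stdin and prints one shell pipeline per path.
# Each pipeline fetches the File page and appends its /w/images/... path to test.sh.
use strict;
use warnings;

while (my $path = <>) {
    chomp $path;
    next unless length $path;    # skip blank lines
    print "curl -s 'https://wiki.heatsynclabs.org$path'";
    print " | grep 'Full resolution'";
    print ' | perl -ne \'print "$1\n" while m|a href="(/w/images/[^"]*)" class="internal"|g;\'';
    print " >> test.sh\n";
}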