MediaWiki HSL Migration

From Traxel Wiki

Revision as of 23:54, 30 March 2024

2024-03-30 Attempt

Fixing Broken Categories

cd /var/www/mediawiki/maintenance/
php update.php # did not fix the categories
php rebuildall.php # this fixed them

Perl Regex for Page Names

perl -ne 'print "$2\n" while m|<a href="[^"]*"( class="mw-redirect")* title="[^"]*">([^<>]*)</|g' < pages-02.html

Perl Regex for File Links

Sample link: a href="/wiki/File:01mendel-ABS_solid-bed-height-spacer-31m.jpg"
cat files-*.html | perl -ne 'print "$1\n" while m|a href="/wiki/(File:[^"]*)"|g'

Handling Files

  1. Fetch the /wiki/File:... page.
  2. grep for the line containing "Full resolution".
  3. Extract the image URL from: a href="/w/images/1/1c/zoom-into-rectangle.png" class="internal"
  4. Use the second curl command below (the awk-generated curl -sO lines) to download the files.
  5. Go to the All Pages special page and export the Files category.
    1. Import those pages into the new wiki.
  6. Upload the files, ignoring warnings. (A bulk method may exist, but doing it by hand shouldn't take more than an hour.)
curl -s '' | grep 'Full resolution' | perl -ne 'print "$1\n" while m|a href="(/w/images/[^"]*)" class="internal"|g'
awk '{print "curl -sO " $1}' < ../file-full-res-paths.txt
bob@lap-2021:~/Documents/hacking/hsl/wiki-migrate$ cat file-paths.txt | ./ | sh
bob@lap-2021:~/Documents/hacking/hsl/wiki-migrate$ cat 
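#!/usr/bin/env perl
# For each File: page URL on stdin, print a shell pipeline that extracts its full-resolution image path.
use strict;
use warnings;

while (my $url = <>) {
    chomp $url;               # drop the newline so it does not end up inside the quoted URL
    next unless length $url;  # skip blank lines
    print "curl -s '".$url."'";
    print " | grep 'Full resolution'";
    print ' | perl -ne \'print "$1\n" while m|a href="(/w/images/[^"]*)" class="internal"|g;\'';
    print " >>\n";            # the append target is missing from the source
}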