2024-03-30 Attempt
Fixing Broken Categories
cd /var/www/mediawiki/maintenance/
php update.php # this didn't fix the categories
php rebuildall.php # this fixed them
Perl Regex for Page Names
perl -ne 'print "$2\n" while m|<a href="[^"]*"( class="mw-redirect")* title="[^"]*">([^<>]*)</|g' < pages-02.html
Perl Regex for File Links
a href="/wiki/File:01mendel-ABS_solid-bed-height-spacer-31m.jpg"
cat files-*.html | perl -ne 'print "$1\n" while m|a href="(/wiki/File:[^"]*)"|g'
Handling Files
- Hit the /wiki/File:... page
- grep for "Full resolution"
- That didn't work. Look for the fullMedia div instead:
<div class="fullMedia"><a href="/w/images/6/63/3d_Printing_BYOF.jpg" class="internal" title="3d Printing BYOF.jpg">3d_Printing_BYOF.jpg</a> <span class="fileInfo">(482 × 541 pixels, file size: 80 KB, MIME type: image/jpeg)</span>
- Get the URL from: a href="/w/images/1/1c/zoom-into-rectangle.png" class="internal"
- Use the second curl command below to download the files
- Go to the All Pages special page and export the Files category.
- Import those pages to the new wiki.
- Upload the files, ignoring warnings. (Maybe find a way to do this in bulk, but it shouldn't be more than an hour of work.)
curl -s 'https://wiki.heatsynclabs.org/wiki/File:zoom-into-rectangle.png' | grep 'Full resolution' | perl -ne 'print "$1\n" while m|a href="(/w/images/[^"]*)" class="internal"|g'
awk '{print "curl -sO https://wiki.heatsynclabs.org"$1}' < ../file-full-res-paths.txt
bob@lap-2021:~/Documents/hacking/hsl/wiki-migrate$ cat file-paths.txt | ./file-get-commands.pl | sh
bob@lap-2021:~/Documents/hacking/hsl/wiki-migrate$ cat file-get-commands.pl
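#!/usr/bin/env perl
# Read wiki page paths on stdin and print one shell pipeline per path.
# Each pipeline fetches the page and appends the /w/images/... path it
# finds to test.sh.
use strict;
use warnings;

while (my $path = <>) {
    chomp $path;
    next unless $path;    # skip blank lines
    print "curl -s 'https://wiki.heatsynclabs.org" . $path . "'";
    print " | grep 'Full resolution'";
    # Single-quoted so that $1 and \n reach the generated command unchanged.
    print ' | perl -ne \'print "$1\n" while m|a href="(/w/images/[^"]*)" class="internal"|g;\'';
    print " >> test.sh\n";
}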
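One possible answer to the "do this in bulk" question in the upload step: MediaWiki ships a maintenance script, importImages.php, that imports every file in a directory. The sketch below only prints the command for review, in the same generate-then-run style as the curl commands. The function name import_cmd and the ./downloaded-files directory are hypothetical, and this was not part of the original attempt.

```shell
# Dry run: print the bulk-import command instead of executing it.
# $1 = MediaWiki maintenance directory, $2 = directory of downloaded files.
import_cmd() {
  printf 'php %s/importImages.php %s\n' "$1" "$2"
}

# ./downloaded-files is a placeholder for wherever the curl -sO output landed.
import_cmd /var/www/mediawiki/maintenance ./downloaded-files
```

Pipe the output to sh once the printed command looks right.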