MediaWiki HSL Migration
Latest revision as of 00:52, 31 March 2024
2024-03-30 Attempt
Fixing Broken Categories
cd /var/www/mediawiki/maintenance/
php update.php      # this didn't do it
php rebuildall.php  # this worked
Perl Regex for Page Names
perl -ne 'print "$2\n" while m|<a href="[^"]*"( class="mw-redirect")* title="[^"]*">([^<>]*)</|g' < pages-02.html
Perl Regex for File Links
a href="/wiki/File:01mendel-ABS_solid-bed-height-spacer-31m.jpg"
cat files-*.html | perl -ne 'print "$1\n" while m|a href="(File:[^"]*)"|g'
Handling Files
- Hit the /wiki/File:... page
- grep for "Full resolution"
- That didn't work for every file; some pages use the filename as the link text instead of "Full resolution". Try this:
<div class="fullMedia"><a href="/w/images/6/63/3d_Printing_BYOF.jpg" class="internal" title="3d Printing BYOF.jpg">3d_Printing_BYOF.jpg</a> <span class="fileInfo">(482 × 541 pixels, file size: 80 KB, MIME type: image/jpeg)</span>
<div class="fullMedia"><a href="/w/images/e/e5/3d_Printing_Donation_Box.jpg" class="internal" title="3d Printing Donation Box.jpg">Full resolution</a> <span class="fileInfo">(1,177 × 753 pixels, file size: 247 KB, MIME type: image/jpeg)</span>
- get the URL from: a href="/w/images/1/1c/zoom-into-rectangle.png" class="internal"
- Use the second curl bit below to get the files
- Go to the All Pages special page and export the Files category.
- Import those pages to the new wiki.
- Upload the files, ignoring warnings. (maybe find a way to bulk this, but shouldn't be more than an hour of work)
curl -s 'https://wiki.heatsynclabs.org/wiki/File:zoom-into-rectangle.png' | grep 'Full resolution' | perl -ne 'print "$1\n" while m|a href="(/w/images/[^"]*)" class="internal"|g'
curl -s 'https://wiki.heatsynclabs.org/wiki/File:zoom-into-rectangle.png' | grep 'div class="fullMedia"' | grep -v 'Full resolution' | perl -ne 'print "$1\n" while m|a href="(/w/images/[^"]*)" class="internal"|g'
awk '{print "curl -sO https://wiki.heatsynclabs.org"$1}' < ../file-full-res-paths.txt
awk '{print "echo "$1" && curl -sO https://wiki.heatsynclabs.org"$1}' < ../file-full-res-paths-nofull.txt
bob@lap-2021:~/Documents/hacking/hsl/wiki-migrate$ cat file-paths.txt | ./file-get-commands.pl | sh
bob@lap-2021:~/Documents/hacking/hsl/wiki-migrate$ cat file-get-commands.pl
#!/usr/bin/env perl
while (<>) {
    chomp;
    $a = $_;
    if ($a) {
        print "curl -s 'https://wiki.heatsynclabs.org".$a."'";
        print " | grep 'Full resolution'";
        print ' | perl -ne \'print "$1\n" while m|a href="(/w/images/[^"]*)" class="internal"|g;\'';
        print " >> test.sh\n";
    }
}
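The image paths that accumulate in test.sh can then be run through the awk download step from earlier; a sketch tying the pieces together, with one sample path standing in for the file's contents:

```shell
# Turn collected /w/images/... paths into curl download commands.
# A sample path stands in for the contents of test.sh.
printf '%s\n' /w/images/1/1c/zoom-into-rectangle.png \
  | awk '{print "curl -sO https://wiki.heatsynclabs.org"$1}'
# prints: curl -sO https://wiki.heatsynclabs.org/w/images/1/1c/zoom-into-rectangle.png
# pipe the printed commands to sh to actually download:  ... | sh
```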