I finally got the time to move my blog from WordPress to another platform. WordPress is great for casual blogs, but if one needs to write mathematical expressions or publish any code, the support is abysmal: bizarre ad-hoc markup that constantly gets broken by the visual editor.
Hence, after reading this, I moved to Pelican and now host the static HTML on GitHub.
pelican-import helped with the extraction of the old blog, but this little
script has a lot of drawbacks. First of all, it did not work in the default
rst format, which is not that bad given that I prefer Markdown. However, even
in Markdown it had issues, especially with images. I do not think that I will
ever try to correct all the issues, so here are the workarounds that I employed.
Firstly, none of the \(\LaTeX\) and sourcecode tags were translated. I used the
following hackish solution for that problem, run over the content folder.
perl -pi -e 's/\[sourcecode\]/\`\`\`/g' *
perl -pi -e 's/\[\/sourcecode\]/\`\`\`/g' *
perl -pi -e 's/\[sourcecode language="python"\]/\`\`\`python/g' *
perl -pi -e 's/\\\$latex/\$/g' *
perl -pi -e 's/\\\$/\$/g' *
perl -pi -e 's/\\\\n/\\n/g' *
perl -pi -e 's/\\\^/\^/g' *
perl -pi -e 's/\\\*/\*/g' *
perl -pi -e 's/\\_/_/g' *
perl -pi -e 's/\\\\/\\/g' *
perl -pi -e 's/\\#/#/g' *
Then I had to correct all the badly escaped Unicode slugs. Some of my posts are
written in Cyrillic, which is percent-escaped in URLs. The problem is that
pelican-import failed to notice the difference, and then pelican produced links
that do not correspond to the filenames. Another hackish solution for this problem:
# Rename each file after its unescaped slug.
for i in *;
do
    name=$(perl -MURI::Escape -e 'print uri_unescape($ARGV[0]);' "$(grep '^Slug: ' "./$i" | cut -d" " -f2-)");
    mv "./$i" "$name.md";
done;
# Rewrite the Slug line inside each file to the unescaped form.
for i in *;
do
    old=$(grep "^Slug: " "$i");
    new=$(perl -MURI::Escape -e 'print uri_unescape($ARGV[0]);' "$(echo "$old" | cut -d" " -f2-)");
    perl -pi -e "s/$old/Slug: $new/" "$i";
done;
Finally, I modified the categories and tags, because pelican does not support
subcategories. In addition, I corrected the Author field with
perl -pi -e 's/Author: stefankr/Author: Stefan Krastanov/g' *
After addressing all the problems with the content, I had to try to transfer
all the comments. I dislike the idea of moving from one walled garden to another
(like Disqus), so I deployed the great simple comment server Juvia and imported
all the WordPress comments into it (there were hiccups, but all issues were
reported to the bugtracker). I had to slightly modify the embedded JavaScript to
address the fact that the new pages have .html as a suffix, but that was
straightforward to do with
topic_key : (location.pathname.indexOf(".html") == -1) ? location.pathname : location.pathname.slice(0, -5)
The expression had to be like this to ensure that both the versions with and
without the ".html" suffix work.
Anyhow… Welcome to my blog.