Migrate a blog from Blogspot to Jekyll without losing Google ranking

A step-by-step guide to migrating from Blogspot to Jekyll.

  • Export Blogspot data
    https://support.google.com/blogger/answer/41387?hl=en
    

    I saved it at this location: /Users/arastogi/Downloads/blog-09-28-2017.xml
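
Before importing, it can be worth checking that the export actually contains your posts. This is a rough sketch: it assumes each post entry in the export carries a category term ending in kind#post, which is how Blogger's Atom exports looked when I did this. The function name count_posts is my own.

```shell
# Count post entries in a Blogger XML export.
# Assumption: every post has a category term ending in "kind#post"
# (comments, pages and settings use other kinds).
count_posts() {
  grep -o "kind#post[\"']" "$1" | wc -l | tr -d ' '
}
```

For example, count_posts /Users/arastogi/Downloads/blog-09-28-2017.xml should print roughly the number of posts you expect to migrate.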

  • Import from the Blogspot-provided XML: Run this command from the root of your newly created Jekyll website, after changing the source value to the path of your own export. It needs the jekyll-import gem. On Ruby 2.5 and later the old -rubygems switch no longer works, so use -r rubygems.
    ruby -r rubygems -e 'require "jekyll-import";
      JekyllImport::Importers::Blogger.run({
        "source"                => "/Users/arastogi/Downloads/blog-09-28-2017.xml",
        "no-blogger-info"       => false, # true drops the Blogger info (id and old URL) from the front matter; keep false, the redirect step needs blogger_orig_url
        "replace-internal-link" => false, # true replaces internal links with the post_url Liquid tag
      })'
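
The commands in this guide rely on a handful of tools being installed. A small preflight check along these lines can save a failed run halfway through. The function name require_cmds is my own, and the list of tools is simply what this guide uses.

```shell
# Return 0 if every named command is on PATH; otherwise report the missing ones.
require_cmds() {
  missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running require_cmds ruby gem npm gsed lists anything that is not installed (gsed is only needed on macOS). The importer itself is installed with gem install jekyll-import.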
    
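  • Fix redirects: This makes sure that visitors using the old URL scheme get redirected to the migrated posts. The command adds a redirect_from entry to each post's front matter, which is the key the jekyll-redirect-from plugin reads, so that plugin needs to be enabled. I am using gsed (GNU sed, brew install gnu-sed) because macOS's sed doesn't interpret \n as a newline.
    cd _posts/
    BLOG_DOMAIN="abhijeetr.com"
    for f in *; do gsed -i "s/\(blogger_orig_url: .*$BLOG_DOMAIN\)\(.*\)/\1\2\nredirect_from:\n  - \2/" "$f"; done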
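
If you would rather not install gsed, the same front-matter edit can be sketched with awk, which behaves the same on macOS and Linux. This is an alternative to the gsed command above, not what I ran. The function name add_redirect_from is my own, and it cuts at the first occurrence of the domain on the line, where the sed version cuts at the last.

```shell
# Read a post on stdin and write it to stdout, adding a redirect_from entry
# after the blogger_orig_url line. $1 is the old blog domain.
add_redirect_from() {
  awk -v domain="$1" '
    { print }
    /blogger_orig_url: / {
      i = index($0, domain)
      if (i > 0) {
        # Everything after the domain is the old path, e.g. /2016/11/foo.html
        path = substr($0, i + length(domain))
        print "redirect_from:"
        print "  - " path
      }
    }'
}
```

To apply it to one post: add_redirect_from "$BLOG_DOMAIN" < post.html > post.tmp && mv post.tmp post.html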
    
  • Install the HTML-to-Markdown converter (h2m)
    sudo npm install h2m -g
    
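  • Convert the imported HTML posts to Markdown (skip the cd if you are still in _posts from the redirect step). The conversions run in parallel, and wait blocks until all of them have finished.
    cd _posts
    for f in *.html; do h2m "$f" > "${f%.html}.md" & done
    wait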
    
  • Delete the original .html files, since the previous command created a Markdown copy of each. Run this from the site root.
    rm _posts/*.html
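
Deleting the originals is irreversible, so a quick check that every conversion produced something is cheap insurance. A minimal sketch follows. The function name all_converted is my own, and it only checks that each .md exists and is non-empty, not that the Markdown is any good.

```shell
# Return 0 only if every .html file in the directory has a non-empty .md twin.
all_converted() {
  dir=$1
  for html in "$dir"/*.html; do
    [ -e "$html" ] || continue   # glob matched nothing: no .html files left
    md="${html%.html}.md"
    if [ ! -s "$md" ]; then
      echo "not converted: $html" >&2
      return 1
    fi
  done
  return 0
}
```

Used as a guard from the site root: all_converted _posts && rm _posts/*.html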
    
  • If you would rather not use redirect pages and prefer to map the old URLs at the reverse proxy, as I did, use this. It turns each post filename (YYYY-MM-DD-title.md) into the old Blogspot path (/YYYY/MM/title.html) and the new Jekyll path (blog/YYYY/MM/DD/title/).
    cd _posts
    for f in *.md; do
      blogspot=$(echo "$f" | sed 's#\(....\)-\(..\)-..-\(.*\)\.md$#\1/\2/\3.html#')
      jekyll=$(echo "$f" | sed 's#\(....\)-\(..\)-\(..\)-\(.*\)\.md$#blog/\1/\2/\3/\4/#')
      echo "location /$blogspot {  proxy_pass https://localhost/blog/$jekyll; }"
    done
    

    This prints one location block per post, which I paste into the nginx server block like this.

    server {
      listen 443 ssl;
      listen [::]:443 ssl;
      server_name blog.abhi.host;
      location /.well-known {
        root /var/www/html;
      }
      location /2016/11/logstashelasticsearch-best-way-to.html {  proxy_pass https://localhost/blog/blog/2016/11/24/logstashelasticsearch-best-way-to/; }
      location / {
        proxy_pass https://localhost/blog/;
      }
    }
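
To see how one filename becomes one location block, the mapping can be pulled into a small function and tried on a single post. This is a sketch of the same two sed substitutions used in the loop above. The function name location_for is my own.

```shell
# Print the nginx location block for one Jekyll post filename
# of the form YYYY-MM-DD-title.md.
location_for() {
  f=$1
  # Old Blogspot path: YYYY/MM/title.html (the day is dropped)
  blogspot=$(printf '%s\n' "$f" | sed 's#^\(....\)-\(..\)-..-\(.*\)\.md$#\1/\2/\3.html#')
  # New Jekyll path: blog/YYYY/MM/DD/title/
  jekyll=$(printf '%s\n' "$f" | sed 's#^\(....\)-\(..\)-\(..\)-\(.*\)\.md$#blog/\1/\2/\3/\4/#')
  printf 'location /%s {  proxy_pass https://localhost/blog/%s; }\n' "$blogspot" "$jekyll"
}
```

Calling location_for 2016-11-24-logstashelasticsearch-best-way-to.md prints the location line shown in the server block above. After pasting the generated lines, nginx -t checks the configuration before you reload.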