A couple of weeks ago, I completed a couple of web maintenance projects I’d been meaning to tackle.
I (finally) secured many of my DoOO websites using this guide from Reclaim Hosting (thanks Tim!). Now keeganslw.com, GOBLIN, eXperiencePlay, etc. will automatically serve encrypted https:// links instead of http://. I’m really EXCITED about completing this project, as Google uses https:// in site rankings and Chrome often shows http:// sites as “insecure.” (Goodness, the ways in which Google rules the web have no end. :P) Anyways, I feel better about accessing my websites now.
WordPress to Static HTML Archiving
I downloaded an old WordPress website I only briefly used years ago (tlc.keeganslw.net) with the SiteSucker app. Using SiteSucker yielded HTML, CSS, JS, and asset files from my PHP-based WordPress website. More importantly, I was able to decrease the size of my site from ~65MB to 4.5MB! That feels awesome because about 60MB has been reclaimed on my web server!
A Bit Of Troubleshooting
No index.html File At Root
I did run into a few problems during this process. First, when I used SiteSucker to only download files “2 levels deep,” it didn’t include an index.html file at the root folder of the website. Instead, the proper index.html file was located one folder deep. This meant that when I visited tlc.keeganslw.net, it would load a page like so:
Rather than spending a bunch of time rewriting portions of the code in the proper index.html file, I just created a new index.html file at the root folder using code from this website to redirect visitors to the right index.html file. Preview that code here:
So now, if you navigate to tlc.keeganslw.net, you will be redirected to tlc.keeganslw.net/teaching-learning-conference, where the proper index.html file is loaded and the website should display properly.
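If you'd rather generate that redirect file than paste it by hand, here's a minimal Python sketch. To be clear, this is not the exact code I borrowed. It assumes a plain meta-refresh redirect, and the function name `write_redirect_index` and the target path are just placeholders for illustration.

```python
from html import escape
from pathlib import Path


def write_redirect_index(root: Path, target: str) -> Path:
    """Write a tiny index.html in `root` that forwards visitors to `target`.

    Hypothetical helper: it uses a meta-refresh tag, which is one common
    way to redirect from a static page without any server configuration.
    """
    safe = escape(target, quote=True)  # keep the URL safe inside attributes
    page = f"""<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <meta http-equiv="refresh" content="0; url={safe}">
    <link rel="canonical" href="{safe}">
    <title>Redirecting...</title>
  </head>
  <body>
    <p>If you are not redirected, <a href="{safe}">follow this link</a>.</p>
  </body>
</html>
"""
    index = root / "index.html"
    index.write_text(page, encoding="utf-8")
    return index
```

Calling `write_redirect_index(Path("site"), "/teaching-learning-conference/")` would drop the file into a local `site` folder (an assumed folder name), ready to upload to the root of the web server. The fallback link in the body is there for the rare browser that ignores the refresh tag.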
File Permissions On Server
The second problem I had was needing to go through the website files on the web server and change the access permissions of each. When I uploaded the website, every file defaulted to access permission values of 600 (read and write for the owner only, meaning no visitor could read the files). After I modified the permissions to 644 (still writable only by the owner, but readable by everyone), the website became accessible to the world. This process wasn’t a huge deal; it just took a few minutes of time.
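I changed the permissions by hand, but if you have shell access and a lot of files, the same fix could be scripted. Here's a rough Python sketch under that assumption. The function name `make_site_readable` is made up for this example, and it only touches files, not folders, since files were the problem in my case.

```python
import os
import stat
from pathlib import Path

FILE_MODE = 0o644  # owner can read/write; everyone else can only read


def make_site_readable(root: Path) -> int:
    """Set every regular file under `root` to 644; return how many changed."""
    changed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                continue  # leave symlinks alone rather than chmod their targets
            if stat.S_IMODE(path.stat().st_mode) != FILE_MODE:
                os.chmod(path, FILE_MODE)
                changed += 1
    return changed
```

Running `make_site_readable(Path("public_html"))` (the folder name is an assumption; use wherever your site lives) returns the number of files it had to fix, so a second run should return 0.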
Deleting Comments & Footer Sections
Like the Mobile Blogging & Scholarship website, there was some information in the footer (contact info) that I wanted to remove for the archived copy of tlc.keeganslw.net. Because I had already deleted the WordPress installation at tlc.keeganslw.net after running SiteSucker, I had to open the index.html file in Atom, search for this contact info, and delete the corresponding code in the file. After modification, I’m happy with the state of this archived website:
I recognize that this post only covers a very small example of using SiteSucker to convert a WordPress website to static HTML. So, if you’re hungry for more, here are some larger WordPress archiving projects folks have pursued and written about:
- Archiving Old WordPress Sites as Static HTML by Alan Levine
- Get SiteSucker, Sucker by Jim Groom
- A Web Diet: Converting WordPress Sites Over to Static Sites by Adam Croom
Let me know if you have any questions about any of this. Happy archiving! 😀