The Core Dump

Lower your shields and surrender your ships

A marijuana grow house in Nevada. Image from weedrush.news21.com. Credit: Kathryn Boyd-Batstone

[By Nic Lindh on Friday, 09 October 2015]

Building a static site for an investigative journalism project

Things to consider when planning to build a site on a compressed time table.

I spend my summers helping create the website for an investigative project called News21. Each year a team of Fellows from universities around the U.S. dive deep into a topic and the resulting content is then syndicated with major partners like The Washington Post, USA Today, and many others. But the content also needs a permanent home on the Web, so we build a site.

The site contains images, video, interactive infographics, and of course the stories themselves. It must be attractive and innovative. It also has to be built on a very compressed schedule, with actual page production limited to a few weeks and the site functionality and design around 10 weeks.

And then it needs to stand up to bursts of heavy traffic.

And then it needs to remain available for many years to come.

Easy. Deep breath.

If you have your hand up saying, “Oh, oh, you should use a static site generator for this!”, you just earned a cookie.

Static site generator

For the last two projects (Gun Wars, on gun culture in America and Weed Rush, on the legalization of marijuana in America) we used the static site generator Jekyll to create the project. There are many static site generators out there, and I think most of them could do the job, but I had familiarity with Jekyll from other projects and since it has the support of and has been battle-tested by GitHub it’s a pretty safe choice.

There are several benefits to a static site:

  • The site can be hosted almost anywhere and moved very easily. Going from development server to production is literally a matter of copying files.

  • There’s no admin backend on the site so there’s nothing for an attacker to try to break into. This helps server admins sleep better.

  • Page production is much faster since you’re simply manipulating files—there’s no backend dashboard to click through to accomplish tasks. Once a producer gets comfortable manipulating files, the production speed increase is remarkable.

  • Jekyll is much less opinionated than WordPress, so there’s less of a feeling of going “against the grain” when building features.

But there’s no free lunch, so there are disadvantages:

  • A steep learning curve. For producers who are only used to going through a dashboard like with WordPress, Tumblr and Drupal, it can be forbidding at first, so good training is essential. (Though I’d argue that a Web producer should understand the basics of working with SFTP, the command line and Git.)

  • When several people are in a file system at the same time operating under stress it’s easy to accidentally overwrite each others’ changes, so good communication is crucial. (It would obviously be better to have everybody work in their own Git repo but that introduces a lot of technical and workflow overhead when the producers aren’t highly technical.)

Designing the workflow

The site was created with the explicit goal of making production as fast as possible. This meant first off to separate content from presentation. The content—images, videos, story—on each page had to make no assumptions about the final presentation. This way producers could build the pages while designers changed the final look of the site in tandem.

This meant creating shortcodes for all multimedia content, so a producer would never insert an image, say, with a raw <img src="/images/parallax/hello.jpg" /> HTML tag; instead, all multimedia elements were called in through Jekyll includes.

The Jekyll include would then know in which directory parallax images lived and write out the actual code to put the image on the page.

This way the actual multimedia presentation could change right up to launch without having to go back and touch any of the stories.

Dont’t Repeat Yourself

Any website repeats a lot of content and under deadline pressure it’s very easy to forget a spot. So the project was built as much as possible on the principle of DRY (Don’t Repeat Yourself).

Anything that goes on more than one page should exist in a data file and be read in, never repeated on the site itself.

Putting in the thought ahead of time to factor out anything that will repeat on the site and centralizing it will pay off at crunch time. Jekyll’s _data directory support makes this easy.

Here are some of the things abstracted out for Weed Rush:

  • Fellow bios for story footers
  • In-page navigation
  • Page URLs. (Since URLs change when—not if—the story’s title is edited and stories are linked to from many places, having a central URL repository streamlines the workflow significantly.)
  • Project title
  • Project blurb
  • Publish date
  • Generic social image for Twitter cards and Facebook

Staying up under load

By its very nature a static site will be able to handle more traffic than one that has to be processed for each request, so there’s less risk of the server keeling over. Though enough traffic will choke any server.

To help take the load off the server, the server hosting Weed Rush was put behind CloudFlare, which, headscratchingly, provides a free Content Delivery Network that is very good.

Since the pages on Weed Rush are very heavy with images, we opted to upgrade to Cloudflare’s Pro plan which, among other things, provides an extra layer of image optimization for different device sizes and lazy loads images on slow connections. Well worth it to make sure the site felt (reasonably) fast despite all the assets.

Use the source, Luke

If you want to pick apart how the site works, you can clone it from GitHub and run it on your own computer. Feedback and constructive criticism is welcome.

« How to learn things you’re not interested in

 »


You like to read about technology? What a coincidence—I like to write about technology!

The pro market, the nerds, and the vision

Apple’s neglect of the pro market is causing a lot of gnashing of teeth in Apple-nerd circles, but it’s true to Apple’s vision.

What to expect when you’re expecting a Hackintosh

There is unrest in the Mac community about Apple’s commitment to the platform. Some are turning their eyes to building a Hackintosh to get the kind of computer Apple doesn’t provide. Here’s what it’s like to run a Hackintosh.

The car is going digital and that’s a good thing

Car nerds are dealing with some cognitive dissonance as car technology changes.

Review: Kindle Oasis

The Oasis is Amazon’s best e-ink reader to date, but it’s not good enough for the price.

“Tea, Earl Grey, hot”

Nic buys an Amazon Echo and is indubitably happy with the fantasy star ship in his head.

It’s a content blocker, not an ad blocker

The problem isn’t ads. The problem is being stalked like an animal across the internet.

Review: Synology DS416j

The DS416j is a nice NAS for light home use. Just don’t expect raw power.

(Nerd Note) Moving to GitHub Pages

The Core Dump is moving to GitHub Pages. This is a good thing, most likely.

Apple Watch, six months in

Thoughts on Apple Watch after half a year of daily usage.

Magical thinking about encryption and privacy

Predictably, the Paris attacks brought the anti-encryption crowd back out of the woodwork. They're at best being willfully disingenuous.

Building a static site for an investigative journalism project

Things to consider when planning to build a site on a compressed time table.

Digital hygiene for online security and safety

Nic provides some basic not-too-paranoid tips for securing your digital life.

How to install Jekyll on Amazon Linux

Installing Jekyll on an EC2 Amazon Linux AMI is easy. Here are the steps.

Will Apple Watch be a success?

After wearing the watch for over a month, Nic has thoughts on its future. Spoiler: Depends on how you define success.

Let’s all chill out about the iPad sales numbers

Turns out “it's just a big iPhone” is a stroke of genius.

Tech terms you might be misusing

Some technical terms still confuse people who should know better, like journalists.

Naked root domain with Amazon S3 without using Route 53

How to host a static site on Amazon S3 with an apex domain without using Amazon’s Route 53.

New technology requires new thinking

People fear change, so new technology is used as as a faster version of the old. This makes technologists sad.

An HTML, CSS and JavaScript lesson plan

Nic provides a lesson plan for teaching total beginners HTML, CSS and JavaScript.

The glanceable wrist in your future

Nic loves his Pebble and looks forward to the Apple Watch, but realizes he’s in the minority.

It's the words, stupid

Nic loves books, but he loves their content more.

Our technology is bad and we should feel bad

Nic is worried about the fragile state of our technology and thinks you should be as well.

The WATCH is nigh, and I don't get it

Nic tries to understand the WATCH. It doesn't go well.

Apple might enter the home integration field

Nic thinks home integration could be Apple’s next major category. Read on to find out why.

An Apple ebook reader would be nice

Nic is frustrated with his Kindle and would love to see Apple make an e-ink reader.

The iPhone, devourer of technologies

The iPhone was announced Jan. 9, 2007. It now occupies a huge chunk of Nic’s life.

The A7 processor is your friend

Nic is very impressed with the speed of the iPhone 5S and iPad Air.

The 2013 Nexus 7

Nic buys a Nexus 7 to test the Android waters.

On azcentral.com outsourcing comments to Facebook

Nic outlines some of the risks of ceding comments on news stories to Facebook.

Lion and the angst of the greybeards

Nic is bemused by the sturm und drang surrounding the iOS-ification of Mac OS X.

Web publishing made easy

Web publishing used to require heavy-duty nerditry, but no longer.

How to create an e-book

Nic is creating an e-book. He shares what he’s learned so far.