The Core Dump

The Core Dump is the personal blog of Nic Lindh, a Swedish-American pixel-pusher living in Phoenix, Arizona.

A marijuana grow house in Nevada. Image from weedrush.news21.com. Credit: Kathryn Boyd-Batstone

[By Nic Lindh on Friday, 09 October 2015]

Building a static site for an investigative journalism project

Things to consider when planning to build a site on a compressed timetable.

I spend my summers helping create the website for an investigative project called News21. Each year a team of Fellows from universities around the U.S. dives deep into a topic, and the resulting content is then syndicated to major partners like The Washington Post, USA Today, and many others. But the content also needs a permanent home on the Web, so we build a site.

The site contains images, video, interactive infographics, and of course the stories themselves. It must be attractive and innovative. It also has to be built on a very compressed schedule: actual page production is limited to a few weeks, and site functionality and design to around 10 weeks.

And then it needs to stand up to bursts of heavy traffic.

And then it needs to remain available for many years to come.

Easy. Deep breath.

If you have your hand up saying, “Oh, oh, you should use a static site generator for this!”, you just earned a cookie.

Static site generator

For the last two projects (Gun Wars, on gun culture in America, and Weed Rush, on the legalization of marijuana in America) we used the static site generator Jekyll to build the site. There are many static site generators out there, and I think most of them could do the job, but I was familiar with Jekyll from other projects. Since it has the support of GitHub and has been battle-tested there, it’s a pretty safe choice.

There are several benefits to a static site:

But there’s no free lunch, so there are disadvantages:

Designing the workflow

The site was created with the explicit goal of making production as fast as possible. First off, this meant separating content from presentation. The content on each page (images, videos, story) had to make no assumptions about the final presentation. This way producers could build the pages while designers changed the final look of the site in tandem.

This meant creating shortcodes for all multimedia content, so a producer would never insert an image, say, with a raw <img src="/images/parallax/hello.jpg" /> HTML tag; instead, all multimedia elements were called in through Jekyll includes.

The Jekyll include would then know in which directory parallax images lived and write out the actual code to put the image on the page.

This way the actual multimedia presentation could change right up to launch without having to go back and touch any of the stories.
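To make the idea concrete, here is a minimal sketch in Python rather than in Jekyll’s own Liquid templates. It is not the code from the News21 site: the function name, the media kinds, and every directory except the /images/parallax path mentioned above are assumptions made up for illustration. The point is only that the producer supplies a kind and a file name, and one central place decides where the file lives and what markup gets written.

```python
from html import escape

# Hypothetical mapping from media kind to the directory that holds it.
# Only "/images/parallax" comes from the post; the rest is invented.
MEDIA_DIRS = {
    "parallax": "/images/parallax",
    "inline": "/images/inline",
}


def render_image(kind: str, filename: str, alt: str = "") -> str:
    """Return <img> markup for a media item given only its kind and file name."""
    directory = MEDIA_DIRS[kind]  # raises KeyError for an unknown kind
    src = f"{directory}/{filename}"
    # Changing the markup here changes every image on the site at once.
    return f'<img src="{escape(src, quote=True)}" alt="{escape(alt, quote=True)}" />'


if __name__ == "__main__":
    print(render_image("parallax", "hello.jpg", alt="A grow house"))
```

If the designers later decide parallax images need a wrapper element or a different directory, only this one function changes, and no story page has to be touched.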

Don’t Repeat Yourself

Any website repeats a lot of content, and under deadline pressure it’s very easy to update one copy and forget another. So the project was built as much as possible on the principle of DRY (Don’t Repeat Yourself).

Anything that goes on more than one page should exist in a data file and be read in, never repeated on the site itself.

Putting in the thought ahead of time to factor out and centralize anything that will repeat on the site pays off at crunch time. Jekyll’s _data directory support makes this easy.
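As a rough illustration of the principle, again in Python rather than Liquid, the sketch below keeps one piece of repeated content in a single data source and reads it from two different pages. The file name, the partner entries, and the function names are all placeholders, not data or code from Weed Rush.

```python
import json
from html import escape

# Stand-in for a data file such as _data/partners.json.
# The file name and both entries are hypothetical placeholders.
PARTNERS_JSON = """
[
  {"name": "Example Partner A", "url": "https://example.com/a"},
  {"name": "Example Partner B", "url": "https://example.com/b"}
]
"""


def load_partners(raw: str = PARTNERS_JSON) -> list:
    """Parse the single source of truth for the partner list."""
    return json.loads(raw)


def partner_links(partners: list) -> str:
    """Render the partner list once; every page reuses this markup."""
    items = "".join(
        f'<li><a href="{escape(p["url"], quote=True)}">{escape(p["name"])}</a></li>'
        for p in partners
    )
    return f"<ul>{items}</ul>"


def footer(partners: list) -> str:
    return f"<footer>{partner_links(partners)}</footer>"


def about_page(partners: list) -> str:
    return f"<main><h1>About</h1>{partner_links(partners)}</main>"


if __name__ == "__main__":
    partners = load_partners()
    print(footer(partners))
    print(about_page(partners))
```

Adding or renaming a partner means editing the data once; the footer and the about page both pick up the change on the next build.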

Here are some of the things abstracted out for Weed Rush:

Staying up under load

By its very nature a static site will be able to handle more traffic than one that has to be processed for each request, so there’s less risk of the server keeling over. Though enough traffic will choke any server.

To help take the load off, the server hosting Weed Rush was put behind Cloudflare, which, headscratchingly, provides a free Content Delivery Network that is very good.

Since the pages on Weed Rush are very heavy with images, we opted to upgrade to Cloudflare’s Pro plan, which, among other things, provides an extra layer of image optimization for different device sizes and lazy-loads images on slow connections. Well worth it to make sure the site felt (reasonably) fast despite all the assets.

Use the source, Luke

If you want to pick apart how the site works, you can clone it from GitHub and run it on your own computer. Feedback and constructive criticism are welcome.

You have thoughts? I’m @niclindh on Twitter and I want to know what you think.

