There and back again

Written by Darren Bruning

I was scrolling through Twitter and I saw this tweet promoted:

It looked like fun so I decided to have a go. I ended up releasing my entry as https://thereandbackagain.nz (go try it out!) and this blog post describes how I tackled the problem.

2013 version

The competition is basically to improve upon a visualisation from 2013:

… which has:

  • A map of NZ on the left
  • Zoom & pan the map with mouse
  • Click on a region to select it:
    • Lines are drawn on the map showing inflows and outflows to/from the selected region
    • The data area on the right-hand side shows a bar graph of commuting flows for that area.

What's nice about that visualisation?

  • It’s interactive

What’s not so nice:

  • It only lets you drill straight into the detailed data for a single region; there is no “overview” to help you detect patterns or choose something interesting to drill into.

What can we do better?

After letting the idea bounce around in my head for a few days, I thought it would be interesting to try to improve on a few different aspects:

  1. Try to show the data across the whole of NZ at once, to allow patterns in the data to be detected
  2. Use 3D graphics to make the visualisation more engaging and interesting
  3. Allow some sort of “drill-down” to display detailed data.

How to tackle it?

I broke the problem down into chunks:

  1. How to display the map of NZ?
  2. How to display the data in 3D?
  3. How to detect click on a particular region and display some detailed data as a drill-down?

Displaying the NZ map

I chose the three.js JavaScript library to display the data, mostly because it looks neat and I wanted to have a chance to experiment with it.

StatsNZ published a link to the datasets to be used in the competition. Both the work and education datasets were available as either GIS data or good old-fashioned CSV data. I used the CSV version of the data, so that I could download the map separately first and re-use the map between the two datasets. The map data (for Census 2018, broken down by SA2 region) was available separately from the same data portal.

I wanted to do 2 things with the map data:

  1. Display a “wireframe” showing the region boundaries
  2. Eventually, detect clicks inside the regions.

After some experimentation, the most performant way I could find to accomplish that was:

  1. Convert the map data into topojson format, which is a space-efficient way to represent the data.
  2. Simplify the map data, because it contained a lot of low-level detail that wasn’t needed for this visualisation
  3. From the web application, download the topojson from the server and create 2 separate 3D representations of it:
    1. Firstly, show only the edges, for the visual map
    2. Secondly, create polygons from the same data which aren’t initially displayed but which are kept for hit-testing of mouse clicks later.

Three.js ships with several different ways to navigate the visualisation. I ended up using the “MapControls” (unsurprisingly). I couldn’t quite get my controls to work as well as the three.js demo, and I wasn’t sure why – until I eventually figured out that my data was laid out in the XY plane (with Z always being zero), whereas the three.js demo data was laid out in the XZ plane (three.js treats Y as “up” by default). Once I figured out that difference, I was able to just tell the camera object in the scene that Z represented “up”, and all was good.
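
For anyone hitting the same problem, here is a minimal sketch of that fix, not the code from the entry. The helper name `makeZUp` and the `createControls` callback are made up for illustration, and the sketch assumes the map is centred on the origin. It also assumes the controls capture the camera's `up` vector when they are constructed, which is why `up` is set first.

```javascript
// Hypothetical helper: make Z the "up" axis for a map drawn in the XY plane.
// `camera` is expected to behave like a THREE.PerspectiveCamera: it needs an
// `up` vector with set(x, y, z) and a lookAt(x, y, z) method.
// `createControls` is supplied by the caller, for example
//   (cam) => new MapControls(cam, renderer.domElement)
function makeZUp(camera, createControls) {
  // Set "up" before building the controls (assumption: the controls read
  // the camera's up vector once, at construction time).
  camera.up.set(0, 0, 1);
  // Re-aim the camera so its orientation is recomputed with the new up axis.
  // Assumes the map is centred on the origin.
  camera.lookAt(0, 0, 0);
  return createControls(camera);
}
```

Passing the controls in as a callback keeps the ordering (set `up`, then create the controls) in one place.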

Displaying the data in 3D

Where the 2013 version used lines between regions to represent data, I wanted to try to use “hops” between source & destination, like you used to see on old-fashioned airline route maps. After some googling I figured out that I would be able to accomplish this by using a torus (doughnut shape) which three.js could create for me with any radius and tube thickness I needed. The basic approach is:

  1. Parse a line from the downloaded CSV file
  2. Take the XY coordinates of the midpoints of the “from” and the “to” regions
  3. Calculate the distance between those 2 points (Pythagoras) and the relative angle between them (atan2)
  4. Construct a half-torus, with:
    1. Radius equal to half the distance between the two points
    2. Tube thickness proportional to the number of people travelling between those 2 points (the exact proportionality factor determined by experimentation)
    3. Rotated 90 degrees around the x-axis to “stand it up”
    4. Rotated the appropriate angle around the z-axis to give it the same angle as the 2 points
    5. Positioned at the midpoint of the 2 points
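
The arithmetic in steps 2 to 4 can be sketched roughly as follows. This is an illustration rather than the project's actual code: the function name `hopParameters`, the shape of its arguments, and the `tubePerPerson` scale factor are all assumptions.

```javascript
// Hypothetical helper: derive the half-torus parameters for one "hop".
// `from` and `to` are {x, y} points in map units; `people` is the flow count.
// `tubePerPerson` is an illustrative scale factor (to be tuned by experiment).
function hopParameters(from, to, people, tubePerPerson = 0.001) {
  const dx = to.x - from.x;
  const dy = to.y - from.y;
  const distance = Math.hypot(dx, dy); // Pythagoras
  const angle = Math.atan2(dy, dx);    // direction of the hop in the XY plane
  return {
    radius: distance / 2,
    tube: people * tubePerPerson,
    angle,
    midpoint: { x: (from.x + to.x) / 2, y: (from.y + to.y) / 2 },
  };
}

// With three.js loaded, the result `p` could be applied roughly like this:
//   const g = new THREE.TorusGeometry(p.radius, p.tube, 6, 16, Math.PI); // arc of PI = half a torus
//   g.rotateX(Math.PI / 2);                      // stand the arch up
//   g.rotateZ(p.angle);                          // line it up with the two points
//   g.translate(p.midpoint.x, p.midpoint.y, 0);  // move it between the two points
```

For example, a hop from (0, 0) to (3, 4) gives a radius of 2.5 and a midpoint of (1.5, 2).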

(If you’re a developer, this gist of that method might be easier to read:)

This gave a nice visualisation, but performance was terrible. I had to make some performance tweaks:

  1. Merge the geometry of all the toruses (torii?) into one giant geometry. It’s the same number of edges and triangles, but a lot more performant in three.js (presumably because it’s a single call to WebGL under the hood)
    1. (Actually not just one giant geometry – I ended up doing it in batches of 300 toruses, because then the data appears progressively on-screen, acting as a kind of built-in progress bar.)
  2. Simplify the number of faces on each torus. This turned out to be useful later on too, where the “chunky” toruses give a better idea of movement when they’re spinning
  3. Sort the dataset so that the largest “hops” (by number of people, not distance) were at the top of the list, and then cut off the visualisation at a certain point. This is partly to reduce the visual “clutter” of thousands of small datapoints on the map, but mostly to improve drawing performance. I added a “Data detail level” selector which defaults to a “medium” level of detail, but with “low” and “high” options so that people on slower/older devices can still view the data, and people on higher-end machines can see more.
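
The sorting, cut-off and batching can be sketched as below. The batch size of 300 comes from the description above; everything else (the `planBatches` name, the `people` field, and the numbers in `DETAIL_LIMITS`) is invented for illustration and is not taken from the entry.

```javascript
// Hypothetical detail levels: how many of the largest hops to draw.
const DETAIL_LIMITS = { low: 500, medium: 2000, high: 10000 };

// Sort hops by number of people (largest first), keep the top `limit`,
// and split them into batches so each batch can be merged and drawn in turn.
function planBatches(hops, limit, batchSize = 300) {
  const kept = [...hops].sort((a, b) => b.people - a.people).slice(0, limit);
  const batches = [];
  for (let i = 0; i < kept.length; i += batchSize) {
    batches.push(kept.slice(i, i + batchSize));
  }
  return batches;
}
```

Each batch would then be turned into toruses and merged into a single geometry before being added to the scene, for example with `planBatches(hops, DETAIL_LIMITS.medium)`.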

At this point we’ve got a pretty good visualisation:

(Visualisation of people commuting for study in the Auckland region)

Detecting a click on a particular region to display metadata

Three.js has a built-in raycaster which you can use to project a mouse click into the scene & find any polygons that it intersects. This was where I used the topojson data to create an array of polygons representing the regions. It took a while to figure out how to transform the topojson data into polygons (with holes), but it seemed quite reliable once done.
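
The raycaster works in normalised device coordinates rather than pixels, so the click position has to be converted first. A minimal sketch of that conversion, with a hypothetical helper name (`clickToNdc`) and a hypothetical `regionMeshes` array in the usage comments:

```javascript
// Convert a mouse click to normalised device coordinates: -1 to +1 on both
// axes, with +y pointing up. `rect` is the canvas bounding box, for example
// the result of canvas.getBoundingClientRect().
function clickToNdc(clientX, clientY, rect) {
  return {
    x: ((clientX - rect.left) / rect.width) * 2 - 1,
    y: -((clientY - rect.top) / rect.height) * 2 + 1,
  };
}

// With three.js loaded, usage might look like:
//   const ndc = clickToNdc(event.clientX, event.clientY, canvas.getBoundingClientRect());
//   raycaster.setFromCamera(new THREE.Vector2(ndc.x, ndc.y), camera);
//   const hits = raycaster.intersectObjects(regionMeshes); // regionMeshes: hypothetical array of region polygons
```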

When a user clicks on a region, five things happen:

  1. The yellow “hops” across the whole country are hidden
  2. Only hops into or out of the selected region are shown – but all of them are shown, even ones that are too small and miss the cut-off to be shown on the whole-country visualisation
  3. The hops are colour-coded: red means into the selected region, and blue means out of the selected region.
  4. The hops also rotate to indicate direction (because we don’t want to rely just on colour, for vision-impaired users)
  5. A table on the right-hand side shows a textual representation of the data.
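
Steps 2 to 4 amount to filtering and tagging the hops. A minimal sketch, assuming each hop is an object like `{ from, to, people }`; the `hopsForRegion` name, the field names, and the choice of which spin direction means “into” are all illustrative:

```javascript
// Hypothetical helper: pick and style the hops touching the selected region.
// Each hop is assumed to look like { from: 'regionId', to: 'regionId', people: n }.
function hopsForRegion(hops, selectedId) {
  return hops
    // Keep every hop into or out of the region, however small; travel that
    // starts and ends inside the region has no hop to draw.
    .filter((h) => (h.from === selectedId) !== (h.to === selectedId))
    .map((h) => {
      const inbound = h.to === selectedId;
      return {
        ...h,
        colour: inbound ? 'red' : 'blue',
        // Opposite spin directions, so direction does not rely on colour alone.
        spin: inbound ? 1 : -1,
      };
    });
}
```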

Of these, the most interesting to implement was the rotation of the hops to indicate direction. I couldn’t use a half-torus shape anymore; I needed to use a full torus. But I didn’t want the bottom half of each torus to be visible, so I used a clipping plane to cut it away. The visualisation is quite striking:

Next steps

Overall I’m really happy with how the project turned out. One of the things I enjoyed was having a hard deadline to work to – often with Agile projects, you just keep going while you’re still adding value, so you don’t know from iteration to iteration when the project will finish. That’s generally a good thing from a business value perspective, but means you never really know when you’re going to be “done”!

Having said that, if I had more time, I would have liked to:

  1. Add a sub-visualisation for the travel inside each region. The dataset contains that data, and my entry shows the data in the table on the right-hand side when you click on a particular region. It can’t really be visualised with the “hop” concept, because the “from” point is the same as the “to” point. I would like to try out extruding the regions on the Z axis, and having the height of each region’s extrusion represent the number of people travelling to work/study inside that region. This would be visible if you select the “Study at home” (or “Work at home”) option from the “Mode of transport” selector.
  2. Tweak performance a little more. The data still takes too long to load, and the status bar sits at 0% for most of that.
  3. Check the usability of the green-on-black design. Personally, I really like the design aesthetic (it echoes back to my roots on an Apple ][e) but I’m not sure that there is enough contrast for visually-impaired users.
  4. Add some instructions. The lack of instructions was a design choice, and I was hoping that no instructions would encourage people to try things out (e.g. click/drag the map, click on regions) - but watching people actually use it, I found that some people need instructions almost as a “permission” to try things out.

Summary

I really enjoyed this project and I hope StatsNZ promote the entries so that people have a chance to play with them!

I’m available for data visualisation work - if your company has a lot of data and no good way to make sense of it, please get in touch: darren@bruning.net.nz