5 Lessons in Storytelling With Data
It has been a while since I have sat down to write. Honestly, it is a combination of things: busyness, writer's block to a certain extent, and also coming up with a catchy enough topic that I thought would resonate with people. Leave it to a book to inspire me, and the one that has inspired this post shares the title of this blog. Storytelling With Data is an excellent book written by an ex-Google People Analytics director, Cole Nussbaumer Knaflic.
Before I get into some of the things I learned in the book, I want to take some time to introduce a new addition to the Industrial Insight family, Ben Still. Ben came highly recommended by someone I trust who told me "This is the guy you need to hire." I can already see why he told me that and am proud to have him as part of the team.
Ben was a Control Systems Engineer for Shaw Industries in their corporate automation and controls department and has a degree in Electrical Engineering from Clemson University. He has experience with automation systems on the plant floor as well as collecting process data and transforming it into informative visualizations, reports, and alerts. Some of the data systems and IoT tools he is experienced with are: Splunk, Ignition, MQTT, Kafka, NiFi, various historians, PLC programming, and multiple programming languages. He also enjoys being a part of the maker community, using devices like the Raspberry Pi, Arduino-based microcontrollers, and 3D printers to solve problems both for fun and on the plant floor.
Now, onto some lessons learned in the book. I will point out 5 key takeaways (not so ironically, they are also chapters in the book).
Context - who, what, and how
I spend a lot of time working around context in data. Most of the data work we are doing today centers around OSIsoft's PI System and we are using PI Asset Framework to add context to a flat data structure. However, Cole has a different take on context around reporting. She asks herself the following questions when providing context in a visual: Who are you communicating with? What are you communicating? Why are you communicating?
You also have to understand what action you want someone to take as a result of your data analysis and how you communicate it. I am finding that many of the projects we are working on have several audiences, and the visuals we use and what we communicate can be very different across those audiences. The visual we build for an operator on the plant floor is different from the visual we build for the management team, which is drastically different from what we build for the engineering team. Each audience consumes the information in order to take action, so how can we show this data to them to inspire the right action, even if the action is nothing because all is well?
Who is going to consume this information?
What do you want them to do?
How will you convey the information so that the right people take the right action?
Choose an effective visual
It was interesting to read in this chapter about Cole's disdain for "food" charts (donut and pie charts), and I believe she is quite correct. Bar charts, stacked bar charts, and line graphs are often much more effective at communicating relative size. It is really hard to tell the difference between, say, 38% and 45% on a pie chart, but really easy to do on a bar chart.
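To make the 38% versus 45% point concrete, here is a minimal sketch in plain Python that draws the shares as text bars on a common baseline. The category names and the third share are made up for illustration, and `text_bar_chart` is a hypothetical helper, not something from the book.

```python
def text_bar_chart(shares, width=50):
    """Render percentage shares as horizontal text bars on one 0-100 scale."""
    label_width = max(len(label) for label in shares)
    lines = []
    for label, pct in shares.items():
        # Integer arithmetic keeps the bar length predictable.
        bar = "#" * (pct * width // 100)
        lines.append(f"{label:<{label_width}} | {bar} {pct}%")
    return lines


# Hypothetical shares, chosen only to mirror the 38% vs 45% example.
shares = {"Product A": 38, "Product B": 45, "Product C": 17}
lines = text_bar_chart(shares)
for line in lines:
    print(line)
```

Because every bar starts at the same left edge, the eye only has to compare lengths, which is much easier than comparing the angles of two pie slices.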
I agree with her assessment that these visuals are probably the most effective for 95% of our work:
Text (both words and big, fat numbers that stand on their own)
Scatterplot or X-Y plot
Bar graphs (both plain and stacked)
Waterfall charts (I would lump ribbon charts in here too)
In the time series data world that we live in, slopegraphs, waterfall charts, and area graphs come into play less often.
One interesting point that she makes in the chapter around effective visuals is to NOT use multiple vertical axes on the same chart. I can't tell you how many PI ProcessBook screens I have seen with 5 or more pens on them and a different scale for each pen. I don't know how you guys make any sense of that without hiding pens (which most of you do), but I almost never put more than 1 scale on the same line graph/trend. Once in a while, I will put 2 where it makes sense, but it is rare.
Clutter is your enemy
This was the most convicting chapter for me. In the last year, I have learned how to use both Tableau and Power BI, and I have tended to get really happy with the sheer amount of stuff I can cram on a screen. Colors everywhere, graphs of various types, and a mish-mash of things that make you wonder what I am even trying to convey. I have built things that are just visually overwhelming. So, my takeaway was that sometimes less is more.
The more junk you put on the screen, the less effective your story can become. You essentially leave your audience playing "Where's Waldo?" at times to find the story you are trying to convey.
With that said, is it ok to build large, complex graphics? Of course, and for the right application, but make sure that it is neat, organized, and that you use the techniques directly below.
Focus your audience's attention
Much has been made of high-performance graphics in the automation and control industry for many years. I came up during the early days and then the heyday of computer-based Human Machine Interfaces (HMIs) for control systems, and we used lots of colors (because they were so pretty!). We put alarms on EVERYTHING and created visually overwhelming displays for operators to run the process by (but they were so pretty!). In this age of self-service analytics, I am seeing similar crimes being committed, and I am even committing them myself! Here is an article on some of the basic precepts of high-performance HMI graphics, which highlights a lot of philosophies I think we should be using in our world as well. The idea is really around focusing the audience's eyes on what they need to pay attention to and muting everything else that doesn't require attention.
Strategic use of colors, size changes, and bolding of fonts can help you focus an audience toward your message (I would bet your eye saw the beginning of this sentence before you ever started reading down this far).
Here are some examples of a before and after of my own graphics where I use these techniques.
Below are two versions of a steam usage versus ambient temperature graphic, where I wanted to get the audience to work with me to investigate further. The story is that steam usage goes down during the warmer months of the year and up in the colder months. Yet steam usage has been rising over the last several years even though, on average, the ambient temperature has gone up as well. This is a question that I want my customer to explore further, as it could mean major cost savings for them.
Here are the two versions:
Here are the changes, which help convey the story better:
In version 1, there are lots of colors, you really can't see any relationships that I discussed above and nothing draws your eye.
In version 2, I used gray bars with a red trend line to highlight that both steam usage and ambient temperatures are rising.
I changed the scaling of the top graph to show the inverse relationship between ambient temperature and steam usage. You have to be very careful with changing scales like I did because it can over-accentuate changes, but in this case, it was appropriate.
I eliminated the 4th and incomplete year on the steam usage and ambient temperature bar graphs (I was only showing data through 9 months of 2017, whereas I was showing 12 months for each of 2014-2016).
I changed the height of the bar graphs to accentuate the changes in steam usage and ambient temperature over time.
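Dropping the incomplete year is easy to automate before the data ever reaches a chart. The sketch below is one way to do it in plain Python, assuming monthly readings keyed by (year, month). The `complete_year_totals` helper and the flat 100.0 readings are hypothetical and are not the customer's steam data.

```python
from collections import defaultdict


def complete_year_totals(monthly_readings, months_required=12):
    """Sum readings per year, keeping only years that have every month."""
    totals = defaultdict(float)
    months_seen = defaultdict(set)
    for (year, month), value in monthly_readings.items():
        totals[year] += value
        months_seen[year].add(month)
    return {
        year: totals[year]
        for year in sorted(totals)
        if len(months_seen[year]) >= months_required
    }


# Made-up readings: three full years plus the first 9 months of 2017.
readings = {(y, m): 100.0 for y in (2014, 2015, 2016) for m in range(1, 13)}
readings.update({(2017, m): 100.0 for m in range(1, 10)})

annual = complete_year_totals(readings)
print(annual)  # 2017 is left out because it only has 9 months
```

Filtering this way keeps a partial year from showing up as a misleadingly short bar next to the full years.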
Another example is where I wanted to show grading in a process and how it has changed over time. Again, two versions are shown below.
Here are the design changes I made to enhance the story being told:
In version 1, it looks like a clown threw up on the page - colors everywhere. Questions I am sure you would ask would be: Where should I look? What are you trying to tell me?
In version 2, I used red and green to highlight the bad and the good and used a muted gray for the less significant information.
The blue arrow on the left shows that asset number 1 is performing much worse over time.
The blue arrow on the right shows that asset 9 has been performing poorly and getting worse over time.
I left-justified the title and used a larger font in bold so that it is clear what you are looking at.
I want the user to look harder into the performance of certain assets. Why are some bad and getting worse? Why are some that used to be performing well, now not performing so well? Why are some seemingly performing better over time? This should drive us to start looking deeper into these pieces of equipment to drive better performance. I think the changes that I made will make the audience ask the same questions and want to investigate further.
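The red, green, and muted gray idea from version 2 can be captured as a small rule rather than hand-picked colors. This is only a sketch: the scores, thresholds, hex colors, and the `highlight_colors` helper are assumptions for illustration, and the resulting mapping would be passed to whatever charting tool you use.

```python
MUTED_GRAY = "#B0B0B0"
HIGHLIGHT_RED = "#C0392B"
HIGHLIGHT_GREEN = "#27AE60"


def highlight_colors(scores, bad_below, good_above):
    """Color only the clearly bad and clearly good items; mute the rest."""
    colors = {}
    for asset, score in scores.items():
        if score < bad_below:
            colors[asset] = HIGHLIGHT_RED
        elif score > good_above:
            colors[asset] = HIGHLIGHT_GREEN
        else:
            colors[asset] = MUTED_GRAY
    return colors


# Made-up grading scores; thresholds are arbitrary for this example.
scores = {"Asset 1": 52, "Asset 4": 78, "Asset 7": 95, "Asset 9": 40}
colors = highlight_colors(scores, bad_below=60, good_above=90)
for asset, color in colors.items():
    print(asset, color)
```

Keeping gray as the default means color shows up only where you want the audience to look.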
Think like a designer
This is a difficult concept for those that have come from a technical perspective. It is easy to think "I am an engineer and not an artist!" However, we need to step back and look at what we have created and see if it tells the story that we see in the data. Does it convey the right message to the right audience at the right time? Is it pleasing to the eye or is it overwhelming? Without having to explain everything, can my audience understand what the graphic is telling them?
One of the basic things that I often miss is to put what is most important in the top left of the dashboard/visual. In a fair amount of the world, the reading pattern starts at the top left, moves to the right, and then zig-zags down the page. As you can see in the eye tracking study below by Tableau, this can be changed by visual design and you can lead your audiences to certain places in your visuals, but almost everyone starts out at the top left. Sometimes it is best to use the top left to tell your audience what they are about to see (i.e. a title or other identifying information), and other times critical data belongs at the top left or just at the top of the page in general.
Tableau has put out several great studies/blogs on building effective dashboards and eye tracking. Just email me if you would like copies of them as I am not sure where I found them anymore. If you would like to read the blog about eye tracking, you can do so here. It is fascinating.
Reading this book has definitely changed my thinking, not only about how I design data visuals for our clients, but also about how the analytics and data structures that drive those visuals are built and organized. Design is a critical piece of our effectiveness in communicating with data and telling a story with it. I am still learning and will continue to study this topic, but I wanted to share some things from my own journey that can help each of you in your daily activities.