For those of you interested in some of the maths behind drawing Sankey diagrams properly, you might want to read this article on ‘Programmatic Rendering of Directed, Weighted Graphs’ submitted for SVG Open 2003 by Philip A. Mansfield and Mark Ambachtsheer of SchemaSoft.

The authors treat Sankey diagrams as directed, weighted graphs and note that they “can be difficult, time-consuming, and uninteresting to render by hand”. However, “Sankey diagrams do add an indisputable expressive power to a standard mathematical rendering of a graph…[and] when professionally constructed, Sankey diagrams represent flow in a manner … can be understood by anyone, instantly.”

Three diagrams are presented: a simple directed, weighted graph representing a candy factory, a pen-sketched B/W Sankey diagram, and the corresponding Sankey diagram in SVG format, created using data in XML format and XSLT style sheet transformation.

Candy factory: the raw materials quantities for the production of chocolate bars and caramel core candies are shown with a simple weighted, directed graph (Source: Mansfield/Ambachtsheer. Programmatic Rendering of Directed, Weighted Graphs, 2003)

The same candy factory with a hand-drawn Sankey diagram (Source: Mansfield/Ambachtsheer. Programmatic Rendering of Directed, Weighted Graphs, 2003)

Sankey diagram for the candy factory. A vector graphic has been created using XML and XSLT. (Source: Mansfield/Ambachtsheer. Programmatic Rendering of Directed, Weighted Graphs, 2003)

They also have some interesting details on graphical problems, such as overlay, edge layout, width of Sankey arrows in curves, etc. Basically all that stuff that developers of professional Sankey software tools have to cope with.

This is an interesting one: Saveen Reddy shows a Sankey diagram-like breakdown of bugs in a software development project. The term ‘bug’ is used “…very generically to describe any issue being tracked, not only defects in source code.”

Example of a (Sankey-style) diagram for bug tracking in software development (source: Saveen Reddy's Blog)

This does not fully qualify as a Sankey diagram, I think, because the arrows don’t seem to reflect any quantities (number of bugs, time spent on bugs, …). But just like the diagram I showed here in June 2007, which displayed the number of people accused and the outcome of their cases, it visualizes a sequence of breakdowns leading to the decisions that are taken (dashed line arrows).

Now, does anybody want to check their bug tracking tool and show a similar diagram, but with real numbers?

The DOE’s Energy Information Administration (EIA) produces a lot of energy statistics, and they often use Sankey diagrams to illustrate energy flows.

One of their Sankey diagrams, which dates back to 1999, has an interesting two-part structure. It is actually made up of two Sankey diagrams, which are connected by one flow. Values are in quadrillion BTUs.

A Sankey diagram for energy production in the US 1999 showing the quantities created by utilities and by nonutility power producers. Values are in Quadrillion BTUs. Diagram originally from http://www.eia.doe.gov

The top part of the diagram shows electricity produced from various sources, losses along the production line, and the consumption of the electricity in the “Residential”, “Commercial” and “Industrial” sectors. This is structured very similarly to other Sankey diagrams EIA publishes annually (example).

The bottom part shows another Sankey diagram for electricity produced by ‘Nonutility Power Producers’. So what exactly are these NPPs?

A corporation, person, agency, authority, or other legal entity or instrumentality that owns electric generating capacity and is not an electric utility. Nonutility power producers include qualifying cogenerators, qualifying small power producers, and other nonutility generators (including independent power producers) without a designated franchised service area, and which do not file forms listed in the Code of Federal Regulations, Title 18, Part 141. (Source)

Half of the electricity produced by Nonutility Power Producers in 1999 was fed into the grid, while the other half was consumed on-site. I imagine these are typically larger industrial facilities that have their own power generation. The fact that nuclear energy appears in this section does irritate me a little bit, but as this page explains, the reason is probably a nuclear reactor in a national research laboratory that is accounted for here.

Reading one of my favorite blogs actually made me take a harder stance on the Sankey diagram I presented in my last post. Following Kaiser’s attitude of making it better rather than only criticizing, I redesigned the Sankey diagram of phosphorus flows in the Peel-Harvey catchment area.

An improved version of the Sankey diagram for phosphorus flows shown in the last post. No differentiation between the individual P sources, but flows now branch off at a node.

An improved version of the Sankey diagram for phosphorus flows shown in the last post. Differentiation between the individual P sources, and flows branch off at a node. Legend as in the original Sankey diagram.

In the first version I didn’t differentiate the various sources of phosphorus, but only used one color for the overall flow quantity. Introducing nodes dramatically improves the comprehensibility and the mass balance check for the flows branching off sideways. There is some redundancy in the labeling of the flows, but I left it to stick as close to the original Sankey diagram as possible.
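The mass balance check at a node can be sketched in a few lines of Python. This is only an illustration: the function name `check_node_balance` is hypothetical, the outflow labels are placeholders, and the flow values are made up rather than read from the phosphorus diagram.

```python
import math

def check_node_balance(inflows, outflows, rel_tol=1e-9):
    """Compare the sum of flows entering a node with the sum leaving it.

    Returns (balanced, difference), where difference = total in - total out.
    """
    total_in = sum(inflows.values())
    total_out = sum(outflows.values())
    balanced = math.isclose(total_in, total_out, rel_tol=rel_tol)
    return balanced, total_in - total_out

# Made-up values for illustration only, NOT taken from the original diagram.
inflows = {"fertiliser P input": 70.0, "non fertiliser P input": 30.0}
outflows = {"branch 1": 60.0, "branch 2": 25.0, "branch 3": 15.0}

balanced, difference = check_node_balance(inflows, outflows)
print(balanced, difference)
```

If the flows branching off sideways do not add up to what enters the node, `balanced` comes back as `False` and `difference` shows how much is missing or left over.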

The second Sankey diagram is even closer to the original one. I tried to match the colors as closely as possible, and also introduced a legend. Please note that, since I didn’t have access to the raw data, I just approximated the flow values. Because of the multi-flow arrows, I decided to leave a border line on each arrow, and to put heads on the first two input flows (‘fertiliser P input’ and ‘non fertiliser P input’) to make them easier to distinguish.

Kevin Deegan-Krause, an Associate Professor of Political Science at Wayne State University, has contacted me with a fascinating idea and a unique example of a Sankey diagram. He writes: “I accidentally “reinvented” the Sankey wheel quite awhile ago for a rather odd usage: political party ebb and flow in E. Europe.”

A family tree of political parties, 1990-2008, presented by Kevin Deegan-Krause on his Pozorblog http://www.pozorblog.com/

This diagram he presented in a recent post on his Pozorblog shows how parliament seats were allocated to the different parties in the Slovakian parliament from 1990 to 2008, and how seats were shuffled during the six elections that took place during that time.

“As with energy, the number of party seats in parliament is a closed system and there are flows from some to others. This is a highly modified usage, of course, as we do not know in a precise way “where” votes go from one election to the next, so we just fudge at the day of election and just start over (or have invisible reallocations).”

Have a look, for example, at the brown arrow: the HZDS party has clearly been on the losing track. On the other hand, SMER (shown in orange), a party that emerged as a spin-off from SDL (pink) in 2000, gained a significant number of seats in the 2002 elections, when SDL couldn’t make it into the Slovakian parliament any more. Others pretty much remained stable in their number of seats over the whole observation period.

I am not sure if Kevin uses the “official” colors of the parties (wouldn’t mind tossing my vote for the pink party ;) ), but clustering the colors would also make it possible to see whether the left, the right, the liberals, or any other political direction is getting stronger or weaker. It could also show whether there is an overall trend in a certain group of countries, such as Eastern Europe, or the apparent growth of socialism in South America.

I am calling for votes on how to name this special type of Sankey diagram. If the first Sankey diagram was for steam power, why not a power Sankey diagram? ;)

User ‘taqua’ at jfree.org comments on another topic:

there is a fundamental difference between a *chart* and a *graph* or diagram.

A chart is a map of some data (like a city map, but for mass-data). It is a graphical visualization of tabular data. Charts are used for statistical purposes. Charts may be helpful to make mass data more understandable.

A graph is a graphical representation of a relationship between some objects or concepts. (In other words: A graph is a drawing that explains how something works or behaves.)

It is a common property of human languages, that terms get mixed, so you will find the word ‘chart’ in classical graph types, like ‘flow-chart’. Nonetheless, by sticking to the definitions above, it is easy to see that a flowchart is no chart at all - its a graph.

Taking this into consideration, a Sankey diagram can be considered both a Sankey chart and a Sankey diagram. The quantities represented by the magnitude of the flows could also be shown as tabular data, while the direction of the flow, given by the arrow orientation between two processes, indicates a ‘from-to’ relationship.
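To make this double nature a bit more tangible, here is a minimal Python sketch. The process names and quantities are invented, and `as_graph` is a hypothetical helper; the point is only that the same rows can be read as a table (the ‘chart’ side) or as ‘from-to’ relationships (the ‘graph’ side).

```python
# Each row is one Sankey arrow: (from_process, to_process, quantity).
# Names and quantities are invented for illustration.
flows = [
    ("Source A", "Process 1", 40.0),
    ("Source B", "Process 1", 60.0),
    ("Process 1", "Product", 85.0),
    ("Process 1", "Losses", 15.0),
]

def as_graph(rows):
    """Turn the flow table into a nested dict: graph[from][to] = quantity."""
    graph = {}
    for source, target, quantity in rows:
        graph.setdefault(source, {})[target] = quantity
    return graph

# 'Chart' view: the quantities as plain tabular data.
for source, target, quantity in flows:
    print(f"{source:10} -> {target:10} {quantity:6.1f}")

# 'Graph' view: which processes does "Process 1" feed, and how much?
graph = as_graph(flows)
print(graph["Process 1"])
```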

While browsing through some of my older bookmarks I discovered this page of what seems to be an information portal of a German federal ministry. The Sankey diagram for cost flows they show reminded me of a feature in the Umberto material flow management software, which I always wanted to inspect in more detail.

Using their 30-day trial version I worked with one of the simple demo examples they provide. Basically this software is a modeling tool for process systems and analysis of material flows within any kind of process system (production plant, supply chain, region, …). Sankey diagrams in Umberto are not the default view for material flows, but one can switch from the normal “Material Flow Network” view to the Sankey view.

Even though the Sankey diagram feature of the software would need some retouching, I was surprised and extremely pleased to see a “Cost Sankey” feature.

You can enter direct material costs for all materials (in the ‘bucket factory’ example of the demo all materials already have a “market price” property), as well as fixed and variable process costs. The variable process costs are spread over the process throughput using ‘machine hours’ or ‘work hours’ as cost drivers (i.e. to link cost creation to the material throughput). Thus, at every process (shown with blue squares in the flow diagram) the costs (or should I say: the value) increase. Going from left to right along the general flow direction in the Sankey diagram, you can clearly see the growing magnitude of the Sankey cost flows… a kind of ‘Value Added Sankey diagram’.
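How the costs pile up from process to process can be illustrated with a small Python sketch. Please take it as my own simplified reading of the idea, not as the way Umberto actually calculates: the `Process` class, the process names, and all numbers are assumptions made up for this example, and ‘machine hours’ is used as the only cost driver.

```python
from dataclasses import dataclass

@dataclass
class Process:
    name: str
    material_cost: float   # direct cost of materials entering at this step
    fixed_cost: float      # fixed process cost
    machine_hours: float   # cost driver for the variable process cost
    hourly_rate: float     # variable cost per machine hour

    def added_cost(self) -> float:
        """Cost added at this process: materials + fixed + variable."""
        return self.material_cost + self.fixed_cost + self.machine_hours * self.hourly_rate

def cumulative_costs(chain):
    """Running total of cost along the chain, i.e. the width of the cost flow."""
    running = 0.0
    totals = []
    for process in chain:
        running += process.added_cost()
        totals.append((process.name, running))
    return totals

# Invented process chain, left to right along the flow direction.
chain = [
    Process("mixing", 500.0, 100.0, 2.0, 50.0),     # adds 700
    Process("moulding", 0.0, 200.0, 4.0, 75.0),     # adds 500
    Process("packing", 50.0, 50.0, 1.0, 100.0),     # adds 200
]

totals = cumulative_costs(chain)
for name, total in totals:
    print(f"{name:10} {total:8.1f}")
```

The running total after each process corresponds to the width of the cost arrow leaving it, which is why the arrows grow from left to right.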

Cost Flow Sankey Diagram 1 - overall cost, increasing line width shows value added (screenshot from Umberto demo version)

Cost Flow Sankey Diagram 2 - cost per costing unit view, cost per (screenshot from Umberto demo version)

Cost Flow Sankey Diagram 3 - cost for individual product handled only in PE and watering can production (screenshot from Umberto demo version)

Cost Flow Sankey Diagram 4 - cost for PP based product, the production line PE is not used to create this product, hence not adding to costs (screenshot from Umberto demo version)

The first two screenshots above show the overall cost for the three products produced in the bucket factory (Fig.1) and the cost per unit for each of the three products (Fig.2).

The following two cost flow Sankey diagrams are for the individual costing units ‘plastic bucket’ and ‘watering can’ (Fig.3 and 4). Please note that in these diagrams some of the machines are not being used, so they don’t add any process costs to the costing unit (or don’t contribute to the value added). Unfortunately you can only display either mass or energy flows in one Sankey diagram, so the energy costs (from the circle labeled ‘other materials’) are not shown as a Sankey flow, even though they add to the price of each product.

The website of Nottingham City Schools offers a variety of materials that can be used by teachers in their courses. One of the key areas in the science field is ‘energy’.

The site has a demonstration of how Sankey diagrams may be used to represent transfer of energy, including a PowerPoint and “stories”, for which pupils can create a Sankey diagram by using tokens cut from cardboard.

Making Sankey diagrams with scissors and cardboard to show energy issues: Teacher materials from Nottingham City Schools.

I think this is a great idea, as it supports the understanding of the energy topic with a haptic and, very importantly, a visual approach.