Month: August 2010

Largest Sankey Diagram ever?

I was asked what the largest Sankey diagram I have seen so far is. I am not sure. The Swiss biomass flows diagram I featured in January 2009 would be a candidate, but I think the following one is much larger, both in actual diagram size and in the number of nodes and arrows.

It was created by the marketing department of RWE AG, one of Germany’s big four electricity providers. It shows the energy flows in Germany for 1995. The image file I have is 2588 x 1062 pixels, so when you click on the image below, you won’t be seeing the full image, but only a thumbnail… (also because I don’t want to get sued by their legal department!)

The clear structuring of the diagram is what might be called “very German”: energy sources are at the left, with non-renewables at the top and renewables at the bottom. The second section shows energy conversion, the third section the energy use sectors. These are further broken down, before the arrows join again to show useful energy and energy losses (a 50:50 relationship). Losses in the conversion phase are shown as arrows that branch out vertically to the bottom. Exports and bunkering, on the other hand, are shown as vertical flows leaving to the top. Color coding is very clear, and flows are labeled with figures on each arrow. The unit is petajoule (PJ).

Even the black background goes well with the bright colours. Here are close-ups of two sections from this energy flow diagram. Enjoy the beauty…

Sankey Diagram Function in R

I always get excited when a new e-mail arrives with a Google alert for Sankey diagrams. This time I was directed to this blog post.

SankeyR is a function for R, the open source statistical computing and graphics package. It is based on the drawSankey routine for Matlab developed at EPFL, and Aaron Berdanier at Colorado State adapted it to work with R. It produces simple left-to-right Sankey diagrams like this one:

The routine creates a plot in R, which can also be output to bmp or pdf format. “Inputs do not need to equal losses. Any difference will be displayed as a discrepancy in the height of the left and right sides of the diagram. This capability enables the developer to examine imbalances in flows.”
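To make the idea of a discrepancy between inputs and losses concrete, here is a minimal sketch in Python. It is not SankeyR's code or API. The function name `flow_discrepancy` and the flow values are made up purely for illustration. It only does the arithmetic behind the height difference between the left and right sides of such a diagram.

```python
def flow_discrepancy(inputs, losses):
    """Return total inputs, total losses, and the gap between them.

    Hypothetical helper for illustration only; not part of SankeyR.
    """
    total_in = sum(inputs)
    total_out = sum(losses)
    # A non-zero gap is what would show up as a height mismatch
    # between the left and right sides of the diagram.
    return total_in, total_out, total_in - total_out


# Made-up example flows (arbitrary units)
total_in, total_out, gap = flow_discrepancy([60.0, 40.0], [30.0, 50.0, 15.0])
print(f"inputs={total_in}, losses={total_out}, discrepancy={gap}")
```

A positive gap means more flows in than are accounted for on the loss side, and a negative gap means the opposite.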

Get the source from Aaron’s blog.

I have added SankeyR to the Sankey software list. Thanks Aaron for sharing this with the R community. I am just not sure what to make of the blog name…