Found an interesting blog post over at TabVizExplorer, a casual blog maintained by Mithun Desai. Data visualization is one focus of his work.
He uses Tableau to draw Sankey charts (I prefer to call them relationship diagrams, alluvial diagrams or even Spaghetti diagrams). Here is a rather simple one, showing the relation between top 20 cricket players and their country of origin.
The diagram has two data categories. The country of origin shown in the left stacked column in no particular order, and the top 20 players ordered according to their ICC ranking score.
In between are the streams or bands (or ‘Spaghettis’ for the sake of it) color coded by country of origin.
Now, it is not up to me criticizing the choice of diagram type for conveying this specific information. The author seems to have chosen the cricket topic just as a sample, to explain how to do Sankey charts in Tableau in general. Actually the colored list of top 20 (right column) already tells us all we need to know and you wouldn’t even need the left column and the streams.
The main reason I am not happy with this diagram is the fact that it does not stick to the most important characteristic of a Sankey diagram. The post itself comes with the definition: “Sankey diagrams are specific type of flow diagram in which the width of the arrows is shown proportionally to the flow quantity.”
So, what is the flow quantity here? I was thinking of net worth in $$$ of each player, or at least a translation of the ranking score to the width of the bands. But then Babar Azam, who ranked 4th with a score of 846 wouldn’t be shown with a band narrower than the one of E.J.G.Morgan coming in 20th with a score of 650. My guess is, that the the widths of the streams are chosen deliberately…
Where the bands merge, they overlap rather than merge to show the sum of the flow quantities. This makes for a very odd visual effect, at least in terms of Sankey diagrams.
The blog article gives away some of the math behind the curves, so called Sigmoid curves, which is interesting.
This capture taken from the embedded Tableau graph shows how the curves are made up and how the width is maintained along the routing of each curve: You do it with cricket balls 😉 … or christmas bulbs.
Other implementations of relationship diagrams use Beziers curves (which sometimes come with another downside, read here). But that’s for another time…