I recently got a request for help with a wrapping challenge. Not holiday presents, instead text wrapping in Tableau. Here’s a demo view illustrating the problem:

There’s a whole bunch of ellipses (…) where there’s too much data to display. The only options Tableau gives us in this case are to:

Change the order of the text by sorting the Product Name dimension.
Manually change the size of each pane by resizing the headers.
Use the Fit options to dynamically Fit Width, Fit Height, or Entire View.

The manual sizing is problematic because it won’t dynamically adjust to the number of marks, and in the case of views with lots of marks like this one it takes a lot of effort to figure out what size will get all the marks, never mind that the list of values is really hard to read:

And while the Fit options are great at ensuring a view with only a few marks takes up the available space when there are many marks it ends up either not displaying values or creating overlapping values depending on the settings.

Controlling X + Y

In this view the mark layout–either no pills or only discrete (blue) pills on Rows and Columns– is generating a pane for each distinct combination of header values and then Tableau’s mark stacking algorithm is laying out the text. So at this point we’re stuck and can’t do anything about what Tableau is up to. This is where we need to keep in mind one of the master Tableau concepts: everything is a scatterplot. If Tableau won’t place the marks where we want them then we can generate our own X + Y coordinates whether by adding data or creating our own calculations. This is the approach taken by the tile maps introduced to Tableau by Brittany Fong or or as a model of the solar system that I made awhile back: . More recently Ken Flerlage did a great introduction on his Beyond Show Me series of blog posts.

Therefore “all” we need to do is figure out where to place the marks. More on that in a moment, there are two more details I want to go into:

Green Panes vs. Blue Panes: Pane Sizing

Tableau’s logic around what generates a pane vs. marks in a pane is a little complicated so I’m going to keep this focused on three key elements, here are the first two:

All panes created by a given combination of pills are the same size.
(corollary to #1) If we resize one header or axis then all the other panes for that header or axis will resize as well. Tableau does this because it’s easier to visually parse (read) a view that has consistent sizing of elements.

Here’s a view with COUNTD(Product Names) on Text & Color with just discrete (blue) pills on Rows and Columns:

Somehow we need to fit 509 Product Names into the pane for Q4 2015/Office Supplies. If I resize Office Supplies to be taller then both Furniture & Technology change as well:

The same goes if I’ve got a continuous axis to place X/Y coordinates on. In this view I’ve simply put MIN(0) and MIN(1) on Columns & MIN(0) Rows and we can see a set of axes:

If I resize MIN(1) on Columns then to make it wider then all of the panes for MIN(0) and MIN(1) on Columns are resized.

So we can’t really dynamically resize panes to fit the data, all we can do is fit more or less into a pane. Therefore the desired solution can’t involve resizing panes, instead we will need to be generating more or fewer panes, and that leads to the next point around panes.

Green Panes vs. Blue Panes: Number of Panes

The third key element around green panes and blue panes is this:

a) Continuous pills generate an axis for every discrete pill header. b) Discrete pills generate a header for every value of the pill.

We can see a) in action in the continuous views above, with MIN(0) and MIN(1) on Columns we get two axes for each quarter/year combination. So to add more axes we’d need to add more continuous pills but we can’t dynamically add them, and the number of axes ultimately depends on the discrete pills anyway so discrete is the way to go.

We can see b) in the discrete views above, there’s a header for each quarter in each year. Where this gets a little more interesting (and more useful in our case) is when the data is sparse, as in this case where the Avery products are not sold in every customer Segment:

Avery 5 is only sold in one segment so there is only a single header for Consumer, whereas Avery 494 is sold in all three segments so there’s a header for each.

So how this comes together is that in reating X/Y coordinates for positioning the text in our desired view we’re going to use discrete headers that can give us just enough headers (and no more) for the task, here’s a pic of the desired view with those headers:

Packing the Marks: the Algorithm

I experimented with some different layouts and looked at the following factors:

In each pane there’s a list of 0 or more values (marks).
At least in English when we’re reading lists we tend to make them top to bottom and when more is needed we add another column to the right.
There’s a balance in readability between too many columns vs. too tall columns. When there are many columns already then adding more columns for the list makes the view harder to read; in other words, a “tall” view with fewer columns is easier to read than a “wide” view.
When the panes in a row or panes in a column have different numbers of marks it’s important to efficiently stack the marks: too much white space can make the view harder to read.
A stacking layout that is closer to a differently-sized squares is easier to read than one that ends up with differently-sized rectangles.

The algorithm I came up with is a variation on the panel chart layout I used in Waffle Charts and Unit Charts on Maps that uses table calculations. The algorithm does the following:

Calculates the index for each mark in a pane using INDEX() and the number of marks in a pane using SIZE(). These calculations are used in the following calculations.
Counts the number of mark columns needed for each pane where there’s a Max # of Mark Columns. parameter to set a “no more than” value to prevent views from getting too wide. Then a nested calculation counts the maximum number of mark columns in each column.
Once we have the number of mark columns then the algorithm computes the number of mark rows for each pane, and then gets the maximum number of mark rows for each row.
Finally the mark row position and mark column position can be computed based on the index for each mark in the pane and the available number of rows and columns.

I numbered the calculations so they can be each brought into a workout view in order with their compute using set and validated before moving to the next calc. Calcs 1 & 2 require a compute using on the dimension to be used on Text and Calcs 4 & 6 have nested compute usings, see the comments on the calcs for details.

Here’s the workout view:

One complication is that the date dimensions are on Detail with custom dates with the ATTR() aggregation on Rows. This is a method to prevent unwanted data densification.

Once the workout view is built and validated then it’s possible to duplicate the view and rearrange pills, here’s that view:

There’s still a bit of manual resizing required, in this case it’s just to have enough size in each of the panes created by the column and row position table calculations to display the text. Once that is done those headers can be hidden for the final view:

We’re not limited to a text display, for example here’s a highlight table that only took a couple more clicks:

Conclusion

Here’s a view to play with where you can adjust the Max # of Columns parameter and the number of states (which is a proxy for how many products are displayed). Click the image to open the text wrapping in pane view on Tableau Public:
The key concept to keep in mind is that when Tableau won’t plot marks where we want we can add to the data source to get the necessary X&Y coordinates via joins, blends, and/or writing calculations. Since Tableau was designed as a tool to support interactive visual analytics tasks like making giant text tables with the desired text wrapping can take more effort than we might like, however given Tableau’s flexibility we can get the job done.

One of the people I really wanted to meet at #data17 was Sarah Battersby, if you haven’t seen her blogs on mapping you’re missing out. It turns out she wanted to meet me, too!

Awesome to meet a data idol! #MeAndMyZen pic.twitter.com/nKHr8yGJxt

— Sarah Battersby (@mapsOverlord) October 10, 2017

So Sarah does really awesome stuff (can you tell I’m a fanboy yet??) and yesterday she posted about doing spiral jittering on maps:

That inspired me to dig out an old post where I’d done a waffle chart or unit chart jittering technique on a time series:

And adapt that to a map, like this one that makes me think of bunches of grapes:

Why would we need to do this?

I’m not going to go into all the details, you can go read Sarah’s fantastic Jittering post instead for an overview. Basically there are times when we have more than one “thing” for one or more point(s) on a map and we want to show them that way. For example in the map below there are 5 IDs for the selected zip code that are all plotted on top of one another. Through using jittering we can show a mark for each ID like in the jittered map above where that one visible circle is replaced by five circles.

There is nothing new under the sun

Back in 2013 Stephen Few proposed the idea of bricks on maps as a way of encoding a value, the technique I’m using here could be used to get the same result. The key difference is that instead of using the unit or waffle chart to encode the single value we’re using the unit chart to encode multiple values. Andy Cotgreave also had a nice post on some of the issues with this kind of chart.

How do we put a waffle chart on a map?

First of all we need a data source that has latitudes & longitudes that we can do calculations on. If you don’t there are a variety of ways to get one. After that we need to get mathy. Here’s my workout view:

All of the table calculations have a compute using on ID, the idea is that the compute using is set to the dimension(s) that are making the extra marks and then the dimension(s) left for partitioning (Zip Code in this case) are the ones that are creating the initial map positions.

I’ll briefly walk through each calculation.

Index uses Tableau’s INDEX() formula and uniquely identifies each mark.

Column Count gets us a count of columns that is based on forming the smallest possible square that will contain all the marks, here’s the formula:

//this formula will try to draw a square, if you want something else then just put in a fixed aggregate number like MIN(3)
IF INT(SQRT(SIZE())) == SQRT(SIZE()) THEN
 SQRT(SIZE())
ELSE
 INT(SQRT(SIZE()))+1
END

This would be a lot simpler if Tableau supported the CEILING() function for table calculations, because then it would be CEILING(SQRT(SIZE())). Vote up https://community.tableau.com/ideas/6239 if you’d like that too.

Column Offset then computes the number of positions of lateral (latitude) offset for each mark from the center of the unit chart. I’ve done some different experiments on where to position the unit chart and for me the most natural is to have the center of the unit chart be the centroid of the point.

//this gets the column position in the row
IF [Index] % [Column Count] = 0 THEN
 [Column Count]
ELSE
 [Index] % [Column Count]
END
//subtract the position of the midline to get the offset
 - ([Column Count] + 1)/2

Row Count then gets the number of rows in the unit chart, and would again benefit from a CEILING() function for table calcs (hint hint).

IF INT(SIZE()/[Column Count]) = SIZE()/[Column Count] THEN
 SIZE()/[Column Count]
ELSE
 INT(SIZE()/[Column Count]) + 1
END

Row Offset computes the number of positions of vertical (longitude) offset for each mark from the center of the unit chart:

//row position in column
IF INT([Index]/[Column Count]) = [Index]/[Column Count] THEN
 [Index]/[Column Count]
ELSE
 INT([Index]/[Column Count]) + 1
END
//subtract the midline of the row to get the offset
- ([Row Count] + 1)/2

Jitter – Waffle – Latitude then gets the latitude for the point and subtracts the row offset multiplied by a waffle jitter parameter that lets us deal with the zoom level to offset by just the right amount of latitude. This field has the Latitude geographic role assigned.

TOTAL(AVG([Latitude])) - [Row Offset] * [waffle jitter]

Jitter – Waffle – Longitude does the same for the longitude.

TOTAL(AVG([Longitude])) + [Column Offset] * [waffle jitter]

Note that I’ve only tested this so far with US latitude & longitudes, these calcs might need to be slightly different if you are east of 0 longitude or south of the equator.

Building the View

With all the calcs created in the workout sheet and verified, then we can duplicate that sheet and start dragging & dropping pills to build a view. In this case I’ve created a dual axis on to show that the centers of the unit chart are on the centroids of the zip codes (the black crosses) and at this zoom level a waffle jitter value of 0.02 works well:

For this view I’ve made the waffle jitter slightly smaller to 0.015 degrees and added white borders to the marks:

Why Stop Here?

We don’t have to use just circle marks, in this view I’ve used shape marks and different colors for each shape just to show how much information we could place on a single view.

And in Tableau v10.5 we could even do something like only show the unit chart on a map if there weren’t too many marks nearby (by using a binning or hex bin function in an LOD expression) and use the new Viz in Tooltip feature to get the detail.

Conclusion

Thanks to Alan Eldridge for the original jittering post and Sarah for adding more jittering options and inspiring this post! Here’s the unit or waffle chart jitter on maps workbook on Tableau Public.

Drawing with Numbers

Thoughts on data visualization and Tableau

Tag Archives: waffle

Text Wrapping within Tableau Panes

Controlling X + Y

Green Panes vs. Blue Panes: Pane Sizing

Green Panes vs. Blue Panes: Number of Panes

Packing the Marks: the Algorithm

Conclusion

Waffle Charts and Unit Charts on Maps

Why would we need to do this?

There is nothing new under the sun

How do we put a waffle chart on a map?

Building the View

Why Stop Here?

Conclusion