Category Archives: Unexpected Results

When 576 = 567 = 528 = 456: Counting Marks

Tableau’s data densification is like…nothing else I’ve ever used. It’s a feature that is totally brilliant when it “just works” like automatically building out a running sum on sparse data and mind-taxingly complicated when a data blend’s results go haywire because densification was accidentally triggered.

What I’ve historically taught users is to always ALWAYS look at the marks count in the status bar as a first way to detect when data densification occurs. Here’s Superstore Sales data with MONTH(Order Date) on Columns and Region and State on Rows. There are 499 marks and we can see that the data is sparse by the cells that are missing Abcs:

Screen Shot 2016-08-16 at 11.52.15 PM

If I add SUM(Sales) to the Level of Detail Shelf and set it to a Running Total Quick Table Calculation with the default Compute Using of Table (Across), so it’s addressing on Order Date, then I see 576 marks and all the Abcs are filled in. This is Tableau’s data densification at work:

Screen Shot 2016-08-16 at 11.55.19 PM
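To make the idea concrete outside of Tableau, here is a rough Python sketch of what padding out a sparse grid and then running a total across it looks like. It is only a conceptual model, not Tableau's actual implementation: the states, months, and sales figures are made up, and the rule that a padded cell stays unassigned until the first real value appears is my assumption based on the behavior described in this post.

```python
# Hypothetical sparse data: (state, month) -> sales. Not real Superstore values.
sales = {
    ("Iowa", 2): 100.0,
    ("Iowa", 4): 50.0,
    ("Ohio", 1): 20.0,
    ("Ohio", 2): 30.0,
    ("Ohio", 3): 5.0,
    ("Ohio", 4): 10.0,
}

states = sorted({state for state, _ in sales})
months = sorted({month for _, month in sales})


def densify(sales, states, months):
    """Return every (state, month) cell, using None where no record exists."""
    return {(s, m): sales.get((s, m)) for s in states for m in months}


def running_total(dense, states, months):
    """Running sum across months within each state.

    Assumption: a padded cell before the first real value stays None
    (unassigned); a padded cell after it carries the total forward.
    """
    result = {}
    for s in states:
        total = None
        for m in months:
            value = dense[(s, m)]
            if value is not None:
                total = value if total is None else total + value
            result[(s, m)] = total
    return result


dense = densify(sales, states, months)
totals = running_total(dense, states, months)

print(len(sales))                                     # 6 cells with data
print(len(dense))                                     # 8 cells after padding
print(sum(v is not None for v in totals.values()))    # 7 cells with a value
```

In this toy grid Iowa's month 3 gets filled in with the carried-forward total, while Iowa's month 1 exists in the grid but never receives a value.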

However, here are three additional views, all still using the same pill layout and Quick Table Calculations, showing three different marks counts (567, 528, and 456):

Screen Shot 2016-08-16 at 11.59.11 PM
Screen Shot 2016-08-17 at 12.00.55 AM

The marks count is changing based on a variety of factors. The different quick table calculations used (running total, difference, and percent difference) are a part of it, but the underlying behavior depends on whether a mark is densified or not, the pill arrangement, and whether or not a densified mark has been assigned a value (including Null). Prior to Tableau version 9.0 these all would have been counted in the marks count and the views would show 576 marks for each; Tableau v9.0 changed the behavior to only count the “visible” marks.

I’ll walk through the above three examples. In this one the Running Total has been moved from the Level of Detail to the Rows Shelf and there are 567 marks.

Screen Shot 2016-08-16 at 11.59.11 PM

The reason why is that those combinations of Region, State, and Month have been densified for states like Iowa that don’t have any sales in the first month(s) of the year (more on how I know that below), but those densified marks don’t have any assigned value (even Null). So they are not counted in the marks count, nor are they counted in the Special Values indicator in the lower right.

In this view using the Difference calculation there are 528 marks and the Special Values indicator shows 48 nulls (528+48 = 576). In this case the Difference calculation is using the LOOKUP() function that is returning Null for the densified values.

Screen Shot 2016-08-16 at 11.59.11 PM

Finally in this view using the % Difference calculation there are 456 marks and the Special Values indicator shows 120 nulls (456+120 = 576). In this case the % difference calculation is spitting out extra nulls due to divide by 0 results.

Screen Shot 2016-08-16 at 11.59.11 PM

The difference is due to a change made in Tableau v9.0 where the marks count now only counts “visible” marks (Tableau’s term). The definition of a “visible” mark is complicated: they are the “Yes” answers in the table below:

Screen Shot 2016-08-17 at 12.09.17 AM

Now one of the ways I’ve been used to checking for densification is selecting all the marks (either by right-clicking and choosing Select All or pressing Ctrl/Cmd+A), then hovering over a mark and either right-clicking and choosing View Data… or waiting for the tooltip to come up and using View Data. For example, here’s the select all View Data in v9.0 for the % Difference on Rows view. The yellow cells indicate where data was densified and there are 576 rows:

Screen Shot 2016-08-17 at 12.12.24 AM

However, that doesn’t work anymore in Tableau v10.0. A change was made to the Select All functionality such that Select All only gets the “visible” marks. Here’s that same view data in v10, and there are only 456 rows:

Screen Shot 2016-08-17 at 12.12.58 AM

So Select All doesn’t work the way it used to, and the marks count can change in “interesting ways” (and we haven’t gone into what things like formatting Special Values can do), so what can we do to spot densification? There are three workarounds for this, all documented in the right-most column of the table above:

  1. Select a discrete header or a range of headers, wait for the tooltip to come up, and click on the View Data icon.
  2. Right-click in the view (but not on a mark) and choose View Data…
  3. Use the Analysis->View Data… menu option.

All of these will show the densified values. Here’s an animated GIF of selecting Iowa in the Difference on Rows view, where we can see the two Null values:

2016-08-17 00_21_03

However, only one of those is actually densified. To tell exactly which, we need to add a field that actually has data. In this case I’ve added SUM(Sales) to the Level of Detail Shelf and the View Data for Iowa now shows that it’s really only January that is densified, since there’s nothing at all in the January SUM(Sales) cell:

Screen Shot 2016-08-17 at 12.27.28 AM

Conclusion

The marks count is not a reliable indicator of the volume of densification and we need to resort to various selection mechanisms and the View Data dialog to more specifically identify how much has been densified. I’m not a fan of these changes: what I’d really like Tableau to do is to add a count of densified values to the status bar and details on what was densified to the default caption and the Worksheet->Describe Sheet… Until that time, though, hopefully this post will help you keep track of what Tableau is doing!

Here’s a link to the marks count workbook in v8.3 format (so you can open it up for yourself and see the differences in different versions).

At the Level – Unlocking the Mystery Part 2: Rank Functions

Many moons ago I did a first post exploring the non-obvious logic of the most secretive of Tableau table calculation configuration options: At the Level. A few weeks ago I was inspired by a question over email to dive back in. This post explores At the Level for the five rank functions: RANK(), RANK_DENSE(), RANK_MODIFIED(), RANK_UNIQUE(), and RANK_PERCENTILE(). The rank functions add a level of indirection to the already complicated behavior of At the Level and I don’t have any particular use cases, so…

If you are like me and won’t rest until you understand every detail of Tableau’s functionality, then this post is for you. Otherwise you may find this post unhelpful and/or confusing due to extreme table calculation geekery. You have been warned.

The particular challenge with ordinal functions like INDEX(), FIRST(), and the rank functions is that we absolutely have to understand how addressing and partitioning works in Tableau, and then we tack onto that an understanding of how the calculations work, and finally we can add on how At the Level works. For the first part, I suggest you read the Part 1 post on At the Level, it goes into some detail on addressing and partitioning. To understand the rank functions here’s the Tableau manual for table calculations (scroll down to the Rank functions section). Finally, read on for how At the Level works for rank functions.

Continue reading

Version 8 Blending: Version 7 Blending Under the Hood [UPDATED]

An update: Looks like this one is a bug… Tableau guru-to-the-gurus Joe Mako noted in the comments below that this behavior doesn’t occur for strings or numbers. I’d thought I’d seen this with other data types, but I was wrong. I’ve submitted this to Tableau tech support and updated the post. I’ll do another update when I hear back from Tableau.

I’ve got at least a couple more posts in the queue about various features of Tableau version 8 blending. Here’s how to run into one undocumented feature:

  • Date dimension(s) in the primary and secondary have the same name, or a defined relationship(s) in the Relationships window.
  • The date dimension(s) from the primary is/are in the view.
  • The data will blend using those date dimension(s), regardless of whether the link icon is on or off.

Click for a demonstration!

Unexpected Results: Aliases in URL Parameters

Fellow Tableau Zen Master Andy Kriebel writes great tutorials, like this one on passing filters in a URL. I was using those instructions to build URLs to pass from one Tableau workbook to another and things were going swimmingly in trials until I got to my data, where I found not one, but two undocumented features of Tableau’s URL parameters.

Aliases in URL Parameters

When we set up a URL Action in Tableau and add fields to the action, if the field is a Tableau parameter or a discrete dimension that has an alias assigned, when generating the URL parameters Tableau will use the alias and not the original value. So, for example, if your field is an integer such as 201 with a string alias of MS4, Tableau will pass MS4 and not 201, like in the image above. If you have a mix of some aliases and some not, Tableau will use the aliases where they exist.
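As a way of picturing the behavior, here is a small Python sketch of how the value that ends up in the URL is chosen. This is not how Tableau builds URLs internally; the server address, the field name `Course`, and the helper functions are all hypothetical. Only the 201 and MS4 pairing comes from the example above.

```python
from urllib.parse import urlencode

# Alias table for a discrete field: raw value -> display alias.
# 201 -> "MS4" is the example from the post; 202 deliberately has no alias.
ALIASES = {201: "MS4"}


def url_value(raw, aliases):
    """Return the alias if one exists, otherwise the raw value as text."""
    return aliases.get(raw, str(raw))


def build_action_url(base_url, field_name, raw, aliases):
    """Build a URL whose query string carries the alias-substituted value."""
    query = urlencode({field_name: url_value(raw, aliases)})
    return f"{base_url}?{query}"


# Hypothetical target view; the field name "Course" is also made up.
base = "https://tableau.example.com/views/Target/Sheet"

print(build_action_url(base, "Course", 201, ALIASES))
# https://tableau.example.com/views/Target/Sheet?Course=MS4
print([url_value(v, ALIASES) for v in (201, 202)])
# ['MS4', '202']
```

The second print shows the mixed case: values with an alias go out as the alias, and values without one go out unchanged.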

Tableau Parameters used in URL Parameters Affect Parameters in Target Worksheet

The documentation doesn’t explicitly state that Tableau can use a Tableau parameter in a URL Parameter, but we can. And one of the interesting effects is that if the target of the URL is another Tableau workbook and there is a Tableau parameter of the same name in that workbook, then Tableau will set the value of the target’s parameter to the passed value. This is a useful feature for making parameters truly global. The one caveat is the issue above: if the parameter is using an alias then the alias is passed to the target, not the original value of the parameter.

There are three ways I’ve come up with so far to deal with this:

  1. Stop using aliases and set up the parameter or field with the desired values.
  2. Set up the target to handle the aliases.
  3. Instead of using the parameter or discrete field with the alias as the parameter, use a calculated field that just has [myParameterOrField] as the formula so it will just have the value and not any alias.

I’m using #1. This is a bit of a letdown for me. In reading up on improving performance, there are big gains to sticking with numbers and using aliases instead of strings, and having to add extra columns to the data in the case of #3 to avoid this seems to partially defeat the purpose. If you have other workarounds, let me know!
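For what it's worth, option #2 (having the target handle the aliases) boils down to a reverse lookup on the receiving side. In a Tableau workbook that would be done with a calculated field. The Python below is only an illustration of the logic, with a hypothetical URL, field name, and helper function, and it assumes the raw values are integers.

```python
from urllib.parse import urlparse, parse_qs

# Reverse of the source's alias table: alias text -> original value.
ALIAS_TO_VALUE = {"MS4": 201}


def decode_param(url, field_name, alias_to_value):
    """Read one query parameter and map an alias back to its raw value.

    Assumption: raw values are integers, so anything that is not a known
    alias is parsed with int().
    """
    values = parse_qs(urlparse(url).query).get(field_name, [])
    if not values:
        return None
    received = values[0]
    if received in alias_to_value:
        return alias_to_value[received]
    return int(received)


# Hypothetical URLs as the target might receive them.
print(decode_param("https://tableau.example.com/views/Target/Sheet?Course=MS4",
                   "Course", ALIAS_TO_VALUE))   # 201
print(decode_param("https://tableau.example.com/views/Target/Sheet?Course=202",
                   "Course", ALIAS_TO_VALUE))   # 202
```

The catch is that the target then has to know the source's alias table, which is one more thing to keep in sync.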

Unexpected Results: Rounding

When using Tableau with different data sources, it becomes obvious fairly quickly that there are differences in what functions are available in one data source vs. another. For example, MEDIAN() and COUNTD() are functions not available in MS Excel, Access, or text data sources, but are available in Tableau Data Extracts and many others. This post goes into a case where the same function is available, but is returning different results than we might expect depending on context, and introduces a workaround. Continue reading