Padding, Null/Missing Values
When We Need to Pad Data Outside Tableau
Some cases when we need to pad data “outside” Tableau, either in the data connection or underlying queries.
- When doing future projections, in order to get future dates.
- When combining two data sources that are both incomplete. An example is the basic falls view, where we want to look at fall rates and # of falls by unit over time. Some units may have no falls in the time period being reviewed, so we need to pad for the unit so that is always displayed. In addition, if there are no falls for the given time period for the unit, then Tableau won’t display headers for discrete dates for the unit, so we need to pad for the date. In addition, there might not be falls in a given unit when we don’t yet have the patient days (denominator for the fall rate), so we need to be sure to pad the date to get to the current date.
- In order to get lines to draw properly in time-series data. (see a discussion link somewhere here)
More on padding data from Joe Mako – using Custom SQL to pad data:
Most everything else in this section is about padding data in Tableau…
ZN(LOOKUP(SUM(IF [fieldname] = [value] THEN 1 ELSE 0 END),0))
Padding event information to get lines to draw
Using ZN(LOOKUP()) to pad to get an area chart to draw:
Use ZN() function to fill in zeroes (for example, to get the ref lines to show up) http://www.tableausoftware.com/support/forum/topic/filling-zero-blank-values
Working with Null/Missing Values
Post on how to create an average that includes setting missing days as 0:
Getting from an “empty” cell to data:
ZN(), IFNULL(), etc. work when there is a Null value. When there is no data, LOOKUP(agg([field]),0) returns Null when there is no data and that can be wrapped in the ZN, etc. Alternatively the Format special values text can be used.
When working with blends where the secondary datasource has no corresponding rows, can use IFNULL(ATTR([secondary field])) as a calc in the primary to determine if there’s data returned, for example the following to return 0 for no value else return a value:
IF ISNULL(ATTR([secondary datasource field])) THEN 0 ELSE SUM([secondary datasourcefield]) END
more on padding data (getting Null values so Tableau will draw lines correctly):
More on padding (at least for Tableau 6), from http://www.tableausoftware.com/support/forum/topic/trellis-plots comments by Joe Mako:
When dimensions are on Rows & Columns shelf, Tableau pads the domain. When one dimension is on the columns shelf, and one on rows shelf, tableau doesn’t pad the domain.
Having both dimensions used for addressing (right-side list box in Advanced…->Compute using) effectively causes Tableau to pad the domain.
Notes from Mar 2012 WebEx w/Joe Mako
Joe notes there’s some strangeness when there’s there are distinct groups, i.e. a one-to-many hierarchy like category/sub-category in superstore sales. Joe walked me through an example: Put Sub-category on columns, SUM(Profit) on rows, Category on LoD. Have 17 marks in a Bar chart. Change mark type to line, suddenly there are 51 marks.
Also, show missing values can do weird stuff in Tableau 7.0.
For dates, use DATEPART/DATETRUNC instead of Tableau’s options because Tableau’s options are harder to use for table calcs.
If a combination of dimensions doesn’t exist, Tableau can get weird fast with padding.
Example w/SuperStore Sales:
Put Customer on Columns, Container & Category on Rows. See blank cells, and Abcs, with 4514 marks. Create Index() calculated field, and put that on the Text shelf. Suddenly there are 15900 marks.
also, see table calc padding.twbx workbook
From Joe Mako post http://community.tableausoftware.com/message/186882#186882
– Arrangement of dimension pills on shelves, Rows, Columns, and shelf on the Marks card, eg making a crosstab
– Using a date pill for the compute using
– Show Missing Values for discrete date pills
– Your Mark type, Line and Polygon mark types will pad more with a dimension on the Level of Detail shelf, depending on your data contents.
Each combination here has different results and affects padding differently. Another factor is how is padded marks can be detected and handled either in filtering or in formulas, because the concept of “Missing” is not equal to Null. The order of operations needs to be considered.
Use discrete measures on Rows & Columns to prevent unwanted densification, post from Joe Mako: http://community.tableausoftware.com/message/223044
[loop category=”wikicontent” tag=”padding”]
[field title] – Added [field date]
[loop tag=”padding,densification,domain-completion,domain-padding,missing-values,show-empty-columns,show-empty-rows,show-missing-values,null,empty” exclude=”this” relation=”and” compare=”not” taxonomy=”category” value=”wikicontent”]
- [field title-link][field thumbnail-link]
Once again, you come to my rescue! I was doing exactly as you describe in the first paragraph – falls data – and needed to pad zeros. The Lookup/ZN function worked like a charm. Thank you!
Glad it helped!
Jonathan – thought you might be interested (and know why – I think it is something to do with the calculation precedence in Tableau?) that the padding with zeroes stops working when you also combine it with a ‘late filter’ (e.g., filtering on lookup(attr([date]),0) rather than [date]). I can share a workbook with you to show the details if you want to take a look at the underlying info.
Sorry for the late response, I just found this in the spam folder. You’re right, it does have to do with Tableau’s order of operations, here’s what happens:
1) Tableau queries the data source and returns record-level and regular aggregate results (which are sparse).
2) ATTR([date]) is computed and only returns results where there are dates in the data.
3) Tableau pads the data to fill in the missing dates.
4) Table calculations are computed, so 0s are filled in for the measures *and* LOOKUP(ATTR([date]),0) returns Null.
5) The filter using LOOKUP(ATTR([date]),0) that only picks certain dates isn’t also selecting for Null so it filters out the padded dates.
So to be clear, the data is getting padded and the filter is removing the padded data. The workaround for this is to create a table calculation that can have the values for the padded out dates, something like DATEADD(‘day’, INDEX()-1, WINDOW_MIN(MIN([Date]))) and use that instead of the LOOKUP() filter.
Does that make sense?
I just saw your post and I am running into the same problem as Johan it seems like.
I am getting data from a SQL View that has brake_event_count column which is just a “1” for every record in the table, so I can just do a sum on that column, and it returns the total number of brake events. The problem with this, is that when i organize data by week and cause, I get blanks for the date and cause where there is no brake count, and it makes sense because there is not data, so it is not even a null value.
I am using this formula to get the last 7 days starting from yesterday using the top filter with formula:
IF date([Meas Dt]) (date(today()-1))
THEN NULL ELSE [Meas Dt] END
Then I have a calculated field with this formula to assign nulls and replace with zeros:
ZN(LOOKUP(SUM([Ptcr Brkevt Cnt]),0))
My client wants yesterday to be color coded in blue and the day before yesterday to be color coded red, so I crated another calculated field called color code.
IF attr([Meas Dt]) = (date(today() -1)) Then “Yesterday”
Elseif attr([Meas Dt]) = (date(today()-2)) Then “2 Days Ago”
Else “Rest of Week” End
The problem is when I drop the color code calculated field in the Color Mark, it colors the zeros in the last two days as “rest of the week”
Is there any way for tableau to recognize the zeros that are in “yesterdays” date and “2 days ago”?
I would really appreciate some insight,
thanks a lot
The problem is that the computation ATTR([Meas Dt]) is ultimately evaluated in the data source *before* densification occurs, so it is returning nothing and then that gets padded to Null by the desnification, then in the IF statement the Null value falls into the “Rest of Week” bucket. There are a couple of common ways around this:
– use a “scaffold” source that has all the dates and all causes as the primary source in a data blend. That way you have all the date dimensionality you need and don’t need the table LOOKUP() calculation.
– use a table calculation date instead of [Meas Dt] so it will be padded out, the usual calc is something like DATEADD(‘day’,INDEX(), TOTAL(MIN([Meas Dt]))) but without more details I’m not sure if that would work exactly, you might need to have separate INDEX() and TOTAL() calcs with their own compute usings.
Can anyone help me to solve my problem??
I have bank cash flow data in tables, and I have prepared crosstab report for every month cash transactions.( It includes opening balance, gross balance and closing balance for each month).
The problem is if no transactions done in any month, nothing is displaying when I select that particular month only.
I want to get same tabular format to display with opening balance value(closing balance of previous month) and gross balance 0 value (as present month don’t have any data) and closing balance (opening balance +0) .
Tableau isn’t the greatest tool for doing this sort of view, this is a place where a tool like Excel can be more appropriate. At the least you’ll need to:
a) created a padded month with a formula like DATEADD(‘month’, INDEX()-1, WINDOW_MIN(MIN([Month]))) with the appropriate compute using setting (usually on all dimensions)
b) use that padded month as your filter instead of your existing month dimension
Beyond that I’m not sure what you might need without knowing a lot more about the structure of your data and the desired goal, and that would take more time. If you do need more help I have two suggestions: One would be to use the Tableau forums at http://community.tableau.com, the other would be to gets some HelpMeDataBlick time http://datablick.com/services/ with me or someone else from my company.