Padding, Null/Missing Values
When We Need to Pad Data Outside Tableau
Some cases when we need to pad data “outside” Tableau, either in the data connection or underlying queries.
- When doing future projections, in order to get future dates.
- When combining two data sources that are both incomplete. An example is the basic falls view, where we want to look at fall rates and # of falls by unit over time. Some units may have no falls in the time period being reviewed, so we need to pad for the unit so that is always displayed. In addition, if there are no falls for the given time period for the unit, then Tableau won’t display headers for discrete dates for the unit, so we need to pad for the date. In addition, there might not be falls in a given unit when we don’t yet have the patient days (denominator for the fall rate), so we need to be sure to pad the date to get to the current date.
- In order to get lines to draw properly in time-series data. (see a discussion link somewhere here)
More on padding data from Joe Mako – using Custom SQL to pad data:
Most everything else in this section is about padding data in Tableau…
ZN(LOOKUP(SUM(IF [fieldname] = [value] THEN 1 ELSE 0 END),0))
Padding event information to get lines to draw
Using ZN(LOOKUP()) to pad to get an area chart to draw:
Use ZN() function to fill in zeroes (for example, to get the ref lines to show up) http://www.tableausoftware.com/support/forum/topic/filling-zero-blank-values
Working with Null/Missing Values
Post on how to create an average that includes setting missing days as 0:
Getting from an “empty” cell to data:
ZN(), IFNULL(), etc. work when there is a Null value. When there is no data, LOOKUP(agg([field]),0) returns Null when there is no data and that can be wrapped in the ZN, etc. Alternatively the Format special values text can be used.
When working with blends where the secondary datasource has no corresponding rows, can use IFNULL(ATTR([secondary field])) as a calc in the primary to determine if there’s data returned, for example the following to return 0 for no value else return a value:
IF ISNULL(ATTR([secondary datasource field])) THEN 0 ELSE SUM([secondary datasourcefield]) END
more on padding data (getting Null values so Tableau will draw lines correctly):
More on padding (at least for Tableau 6), from http://www.tableausoftware.com/support/forum/topic/trellis-plots comments by Joe Mako:
When dimensions are on Rows & Columns shelf, Tableau pads the domain. When one dimension is on the columns shelf, and one on rows shelf, tableau doesn’t pad the domain.
Having both dimensions used for addressing (right-side list box in Advanced…->Compute using) effectively causes Tableau to pad the domain.
Notes from Mar 2012 WebEx w/Joe Mako
Joe notes there’s some strangeness when there’s there are distinct groups, i.e. a one-to-many hierarchy like category/sub-category in superstore sales. Joe walked me through an example: Put Sub-category on columns, SUM(Profit) on rows, Category on LoD. Have 17 marks in a Bar chart. Change mark type to line, suddenly there are 51 marks.
Also, show missing values can do weird stuff in Tableau 7.0.
For dates, use DATEPART/DATETRUNC instead of Tableau’s options because Tableau’s options are harder to use for table calcs.
If a combination of dimensions doesn’t exist, Tableau can get weird fast with padding.
Example w/SuperStore Sales:
Put Customer on Columns, Container & Category on Rows. See blank cells, and Abcs, with 4514 marks. Create Index() calculated field, and put that on the Text shelf. Suddenly there are 15900 marks.
also, see table calc padding.twbx workbook
From Joe Mako post http://community.tableausoftware.com/message/186882#186882
– Arrangement of dimension pills on shelves, Rows, Columns, and shelf on the Marks card, eg making a crosstab
– Using a date pill for the compute using
– Show Missing Values for discrete date pills
– Your Mark type, Line and Polygon mark types will pad more with a dimension on the Level of Detail shelf, depending on your data contents.
Each combination here has different results and affects padding differently. Another factor is how is padded marks can be detected and handled either in filtering or in formulas, because the concept of “Missing” is not equal to Null. The order of operations needs to be considered.
Use discrete measures on Rows & Columns to prevent unwanted densification, post from Joe Mako: http://community.tableausoftware.com/message/223044
[loop category=”wikicontent” tag=”padding”]
[field title] – Added [field date]
[loop tag=”padding,densification,domain-completion,domain-padding,missing-values,show-empty-columns,show-empty-rows,show-missing-values,null,empty” exclude=”this” relation=”and” compare=”not” taxonomy=”category” value=”wikicontent”]
- [field title-link][field thumbnail-link]