Tag Archives: aggregation

LOD Expression Remix – Finding a Dimension at a Lower Level

Last week Mark Jackson had a great post on using Tableau v9 Level of Detail expressions to find a dimension at a lower level (with an update here). In his Superstore example where there are multiple Categories in each State, the goal is to show a view of each State with the largest Category in that State based on the number of customers, like this:

2015-05-27 10_43_52-Tableau - return lower level dim from LODIn this remix post I’ll demonstrate an alternative solution that doesn’t require any string manipulation, along with going through my current process for building & verifying LOD expressions.

[Post edited 20140527 to include links to Mark’s update to his original post.]

Continue reading

How to Have Sets with Your Secondary (Data Sources)

Tableau 9.2 adds the ability to use boolean dimensions directly from secondary sources in the view and on the Filters Shelf so it simplifies this setup somewhat. Read How to Have Sets with your Secondary (9.2 Style) with Aggregated Booleans for details.

Tableau Sets can turn incredibly complicated interactions into a few mouse clicks that can reveal patterns in your data. However, if you’ve ever created a Set in a secondary data source and tried to use it across a blend in a  primary, this is what you see:

2013-10-11 00_02_28-Tableau - sets from secondary

Lots of grayed out items that can’t be touched. It’s like you’re at a party and have just met the hot-babe-of-appropriate-gender-of-your-dreams and they are totally into you, but only at the party, and not back at your place.

However, like many things in Tableau, with a little creativity you can get what you need, and there is a way where you can have your Sets anywhere you want.

If you’re not familiar with sets, then I suggest you start with Hot. Dirty. Sets. That was the title of a great session on sets by Russell Christopher and Michael Kravec at the Tableau Customer Conference in DC last month, the session video is up at http://tcc13.tableauconference.com/sessions. In their presentation Russell and Michael plumbed the depths of bad puns and went through a number of use cases. If you’re not familiar with Tableau version 8 sets, definitely watch the video. If you didn’t got to TCC, I suggest you watch this (vanilla) Tableau On-Demand training video on sets.

For this post I’ll attempt to avoid any more jokes about sets and show you how you can get jiggy with use sets from blended secondary data sources, and probably learn a bit more about data blending on the way.

In this tutorial I’ll be using the Superstore Sales and Coffee Chain data that ship with Tableau, with the Superstore Sales as the primary, Coffee Chain as the secondary. The Coffee Chain data is an Access database, and Tableau uses Microsoft JET for the driver for Access, Excel, and text files (this will change somewhat in 8.2). Microsoft JET has more limited functionality than other database connectors, so the first step is to extract that data into Tableau’s data engine. That will enable us to use the IN/OUT of sets.

Then in the Coffee Chain database we create a set for the Top N States by Sum of Sales:2013-10-10 23_50_20-Tableau - sets from secondary

And here’s a view of that Coffee Chain data, where there are only 20 states:2013-10-10 23_51_46-Tableau - sets from secondary

But when we set up a view based on Superstore Sales and switch over to the Coffee Chain data, any and all sets are taboo greyed out:2013-10-11 00_02_28-Tableau - sets from secondary

However, we can use sets in calculated fields. The In/Out of a Set returns a boolean value, and here’s a simple formula used for the In Top N CC States calculated field:

IIF([Top N CC States],"In","Out")

This is treated as a dimension in the CoffeeChain data, which can lead to all sorts of applications, including using as a filter, color, or in additional calculated fields.

Note that in production, I’d probably use 1 and 0 as values instead of “In” and “Out” and then use aliases (because integers are faster than strings), and to solve a filter issue, however for this demo I’ll stick with In and Out to make it easier to understand and explain the filter issue further down.

Set from Secondary as Filter

Here’s a starting view from Superstore Sales as our primary data source:

Screen Shot 2013-10-26 at 11.11.44 AM

Before putting the In Top N CC States on the Filter Shelf, though, in the Coffee Chain data click on State as a linking field:

Screen Shot 2013-10-26 at 10.52.50 AM

Important note: whatever dimension(s) you use for the Set must be the blending dimension(s), no more and no less, if they are not then your results will be mighty strange. I’m still trying to work out what’s happening there. If you really need some calculation that blends on a different set of dimensions, then use a duplicated data connection for that blend.

Now you can drag the In Top N CC States calculated dimension from the Coffee Chain data onto the Filters Shelf, and here’s the filter setting:

Screen Shot 2013-10-26 at 10.52.20 AM

And the view:2013-10-11 00_06_11-Tableau - sets from secondary

The worksheet is now only returning the 5 states from the primary and secondary data sources that meet the filter criteria from the set from the secondary data source. You’ve now used a Set from a secondary source!

Set from Secondary on Color

How about using the set calculation in the view? If we change the In Top CC States to All values, and put a copy of the pill on the Color Shelf then we see three different values: In, Out, and Null.

Why are there Nulls? This is one aspect of data blending that can be confusing. In the Superstore Sales data, there are 49 US states (48 contiguous states plus the District of Columbia). In the Coffee Chain data, there are 20 states. 5 of those are In the Top N CC States set (blue above), 15 of those are Out (light orange). That leaves 29 states in the Superstore Sales data have no corresponding rows in Coffee Chain. For the In Top Top N CC States set calculation, rather than assigning those 29 states to In or Out, Tableau assigns a value of Null to those states because they don’t have corresponding values, and in this case they get a dark orange color. This is the same behavior Tableau has for any linked dimension value from the secondary that doesn’t have corresponding values in the primary.

How can we help those Nulls come Out of the closet become part of the Out of the set?

This assignment of Null values is done in the primary data source, as there’s so way we can change it in the secondary (without doing some sort of padding in the secondary for all of those states). However, there is a way we can do that in the primary data source. Tableau does all the computation it can within each data source, then blends the data sources together, at which point things like calculations that refer to another data source are computed. We can create a calculation in the primary that refers to the set in the secondary, and test whether the set calculation is returning Null. Here’s the formula from Superstore Sales for the Primary In Top N CC States calculated field:

IFNULL(MIN(IIF([Sample - Coffee Chain (Access)].[Top N CC States],"In","Out")),"Out")

The inner IIF() is a row-level calculation that evaluates the In/Out of the Set, then that gets wrapped in MIN() because we’re working across the blend – Tableau requires us to aggregate all measures and dimensions used in calculated fields from other data sources. The IFNULL() then tests the result of the MIN(), and if there’s a Null from one of those 29 states that is in the primary but not the secondary, the IFNULL() Outs that one as well. Here’s a view with the calculation, where now everything is In or Out:2013-10-11 00_27_02-Tableau - sets from secondary

But you don’t have to be monogamous limit yourself to just one set. You can use combined sets from the secondary, for example in this view with CoffeeChain as the Primary I’m showing the top and bottom 5 states from the CoffeeChain data: 2013-10-11 00_35_02-Tableau - sets from secondary

Now to talk about complications and workarounds before getting into some advanced cases:

Discrete vs. Continuous Measures

It’s important to note that the Primary In Top N CC States set calculation is a measure, because it’s using an aggregate from the secondary data source. 2013-10-11 00_38_12-Tableau - sets from secondary

One effect is that Tableau won’t let us filter on discrete (blue) measures like the Primary In Top N CC States:

2014-01-30 17_59_48

My understanding is that this has to do with Tableau needing to know the domain (range of values) of measures to build the filter. (Here’s a Tableau Idea to support this, vote it up!) However, we can filter on continuous measures, so if you want to use that primary filter you can change the calc to return numbers instead of strings, like this calc called Primary In Top N CC States Continuous:

IFNULL(MIN(IIF([Sample - Coffee Chain (Access)].[Top N CC States],1,0)),0)

And this will work fine as a filter:

2014-01-30 18_07_01

Alternatively, you could use a table calc filter such as LOOKUP([Primary In Top N CC States],0) since discrete measures based on table calculations can be used as filters. I prefer the regular aggregate for performance reasons: Table calculation filters are applied after the data has been returned to Tableau, so that can lead to a lot of unnecessary traffic across the wire.

Using In/Out of Secondary Set in Primary Crosstab

In a simple view in Coffee Chain and our Top N set, we can quickly see the Sum of Sales for the In/Out of the set: 2013-10-11 00_46_08-Tableau - sets from secondary

We can even set that up from the Superstore as Primary, using the In Top N CC States set calculation from the Coffee Chain secondary, all we need to do is make sure that State is turned on as a blending field and do a little extra filtering to get rid of those pesky Null values:2013-10-11 00_51_13-Tableau - sets from secondary

 

Advanced Uses for Secondary Sets

You can mix and match Sets from Primary and Secondary sources, here are three examples:

Cohort Analysis

In this view,  we’re looking at trends for profits broken down by the performance reviews of our sales people, looking to see if there are any trends. The sales data comes from Superstore Sales, the performance reviews are coming from a secondary data source. The panes are created by the In/Out of the Top 40 Customers by Profit. The top pane shows the sales for the Top 40 customers, the bottom pane everyone else. The lines are colored by the In/Out of the Top 5 Salespeople:

Screen Shot 2014-01-30 at 6.31.38 PM

 

 

Combining Sets Across Data Sources

To create a combined set, we can use a calculated field that evaluates the sets from the different data sources, like this one that gets the intersection of the Top N States for Sales from both Superstore Sales and Coffee Chain, assuming the Superstore Sales is primary:

Screen Shot 2014-01-30 at 6.36.59 PM

 

And here it is used in a view, with the Top 5 States from each, only 3 states overlap:

Screen Shot 2014-01-30 at 6.37.32 PM

 

The Three Way

Why stop at only two sets from different data sources? In this view using Coffee Chain as primary, we’re coloring the states based on the In/Out of the Top N States by Coffee Chain Sales, the Top N States with highest # of Starbucks per capita (data from Statemaster.com), and Top N States by Superstore Sales. Three data sources, three sets:

Screen Shot 2014-01-30 at 6.45.24 PM

And, of course, you can build calculated fields as well, here’s one in Superstore:

Screen Shot 2014-01-30 at 6.46.44 PM

Screen Shot 2014-01-30 at 6.48.00 PM

 

Conclusion

Here’s a link for the Tableau Public workbook for Sets with your Secondary.

With a little creativity and a whole lot of jokes at the maturity level of the average American teenage boy, we can get more out of data blending and sets. If you’d like Tableau to have more support for secondary sets out of the box (and generally treat secondary data sources more like primary data sources), please vote for http://community.tableausoftware.com/ideas/2773.

Now, go forth and use sets in strange places any position you want in new ways!

Table Calculations

See Want to Learn Table Calculations? Here’s How for a list of posts that I recommend.

From http://www.tableausoftware.com/support/forum/topic/trellis-plots comments by Joe Mako:

1. First step in table calcs is what kind of pills you have:
a. aggregated continuous
b. aggregated discrete
c. dimension continuous
d. dimension discrete

2. What shelf these pill types are on
– each of these types has a different effect on the evaluation of a table calc depending on their arrangement.
– when you have a dimension on both the rows and columns shelf, Tableau will padd for missing dinension combinations, unless you are using a set based on these dimensions
– building your view with all dimensions on the rows shelf is a good place to start, and once results are as expected, then move pills one at a time to other shelf, and see how shelf placement effects evaluation.

3. How you set your compute using
– if you use the compute using selection from the context menu, or the drop-down in the dialog, and set it to a dimension, that dimension will be used for addressing, and all others will be used for partitioning. I call this Regular Compute Using
– when setting the compute using to something like Table/Pane/Cell, I call that Layout Compute Using, and try to avoid this because I find it to be a delicate setting (move a pill, move a dimension pill, and your evaluation will likely be effected).

– with Regular Compute Using, if all addressing values do not exist in all partitioning values, the INDEX() value for each addressing value will not be the same across partition values. This is apparent with not all combinations of dimension values exist. (In this situation, there are states that have no sales records for some months, so whit your compute using set to State, and partitioning on Month, each State will not have the same INDEX() value in each month.)

– when using a Regular Compute Using and a crosstab of the dimensions in use (one on the Rows, and the other on the Columns), and no Set based on those dimensions, Tableau will pad the data making each State have the same INDEX() value for each Month.

– So without the crosstab setup, but with the desire for Tableau to pad the data to cause the INDEX() function to evaluate as desired, you can have Tableau pad the data as desired with an Advanced Compute Using.

– When you use multiple dimensions on the Compute Using (right-side list box) for addressing, I think of it as like a pseudo-crosstab of all potential combinations of the dimension values (this is what we want in this situation). If you use a Set instead of multiple dimensions, that data will not be padded.

It is my current belief that if you are aware of what pills you have in use, what shelf they are on, how your compute using is setup, and how all these potential combinations of setups effect the evaluation of table calcs, then table calcs are straightforward.

Without doubt, I have not included all information needed to understand table calcs here, but only some key concepts that play a factor in the attached “(fixed)” worksheets.

(end of post)

Another couple of posts by Joe:
http://community.tableausoftware.com/message/175545
http://community.tableausoftware.com/message/136628#136628

Joe’s notes on 6.1 to 7.0 differences:
http://community.tableausoftware.com/thread/117103?start=0&tstart=0

LOOKUP() vs. PREVIOUS_VALUE

See here: http://community.tableausoftware.com/message/173752

Also this from email from Joe Mako:
Have you attempted the formula that you recommended at
http://community.tableausoftware.com/message/173588#173588 ?

Instead of ATTR([Status]) != PREVIOUS_VALUE(ATTR([Status]))

try ATTR([Status]) != LOOKUP(ATTR([Status]),-1)

The PREVIOUS_VALUE() function is self referential, kind of like a recursive function or a loop. The argument you pass the function is just the starting value for the partition, while the LOOKUP() function gets the value based on an offset, and an offset of -1 means previous.

In your statement, the first mark will result in False, effectively [Status] != [Status], and every other record will also result in True because it is likely [Status] != “False” or [Status] != “True”

The formula ATTR([Status]) != LOOKUP(ATTR([Status]),-1) is basically saying:
“Compare the Status of the current mark and the Status of the previous mark in the partition.”

I recommend you try some trial and error with the PREVIOUS_VALUE() function

Try some formulas like:

PREVIOUS_VALUE(0)+1
PREVIOUS_VALUE(100)-1
PREVIOUS_VALUE(0)+SUM([Sales])
PREVIOUS_VALUE("")+ATTR([Product])+", "

Ignore in Table Calculations

http://www.tableausoftware.com/fr-fr/support/forum/topic/aggregating-aggregate-histograms
http://community.tableausoftware.com/thread/114442
http://community.tableausoftware.com/thread/110683

Lots of Table Calc Links

http://www.tableausoftware.com/table-calculations
http://www.tableausoftware.com/support/knowledge-base/table-calculations
http://www.tableausoftware.com/support/knowledge-base/pareto-analysis
http://www.tableausoftware.com/support/knowledge-base/compare-two-table-calcs-60
http://www.tableausoftware.com/support/knowledge-base/nestedsorting
http://www.tableausoftware.com/support/knowledge-base/sorting-concepts-related-quick-table-calculations
http://www.tableausoftware.com/support/forum/topic/plotting-custom-median-and-connecting-marks
http://community.tableausoftware.com/thread/111337 – multiple “reference” lines on the data using table calcs and/or duplicating data
http://www.tableausoftware.com/support/forum/topic/multiple-reference-lines – using average age for male & female reference lines, interesting table calcs here
http://www.tableausoftware.com/support/forum/topic/display-dimension-text-associated-measure – older discussion from 2010 showing use of multiple table calcs trying to get a minimum of a set of values for display, great comments from Ross Bunker

http://interworks.co.uk/blog/groundwork-for-custom-table-calculations/

 

Order of calculations:

http://community.tableausoftware.com/message/139603#139603

 

The following text from http://onlinehelp.tableausoftware.com/v7.0/pro/online/en-us/calculations_tablecalculations_understanding_addressing.html

Addressing and Partitioning

The addressing fields define what part of the table you are computing along. The partitioning fields define how to group the calculation. In the example of a running sum of product sales across several years, the addressing field is the Date field while the parititioning field is the product field. When youd define the addressing for a table calculation, all the other fields are used for partitioning.
You can specify the addressing in the Table Calculation dialog box. The addressing can be relative to the table structure or a specific field. Each addressing option is described below.
Table (Across)
This option sets the adressing to compute along the entire table moving horizontally through each partition.

From Table Calcs primer PDF:
• Partitioning: the scope or grouping of the calculation. This can be the full table, a pane, a cell, a dimension or it can be customized even further for more advanced calculations.
• Addressing: the anchor or the source of each partition. It defines the root of the calculation.
Consider, for example, looking at a running sum of sales by product across several years. In our example, the running sum is partitioned by product: every product’s sale is summed over time, so the result of the calculation is a running sum of coffee sales, tea sales, etc. The addressing field is the date field. With every new date, the sales of that data are added to the sum. When you define the addressing for a table calculation, all the other fields are used for partitioning.
Table (Across)
Table (Across) is a Calculate Along option. The chart below uses Table (Across) to set the addressing to compute along the entire table, moving horizontally through each partition.
When calculation addressing is set to Table( Across), the fields that span horizontally across the table are the addressing fields (Category and Region).

 

From: http://www.tableausoftware.com/support/forum/topic/version-6-table-calcs-understanding-few-basics#comment-27646
The fields specified in the ‘Compute Using’ menu or ‘Edit Table Calculation…’ dialog are termed the addressing fields. All of the dimensions on a sheet that are not addressing fields are partitioning fields. This corresponds to the ‘computed along’ (addressing) and ‘for each’ (partitioning) components of the description in the table calculation dialog. One other nice thing is that in the final v6, the tooltip for a table calculation will actually contain the description also, so you don’t have to open the dialog to understand what it is doing.

From http://www.tableausoftware.com/support/forum/topic/version-6-table-calcs-understanding-few-basics#comment-27736
‘Restarting every’ moves the given field and those above it in the Advanced… dialog to be partitioning. Similarly ‘At the level’ moves fields after it in the list to be partitioning, though there is a subtle difference in how that partitioning is done (its not done on value, but rather on postition within the partition, i’ll give more details on that in the tutorial).

One might ask why you would put something in the ordering list in Advanced… only to partition on it by setting ‘Restarting every’. The answer lies in the sorting behavior. For example, if you want to know the top products for each market, you can’t simply sort products by sales (which sorts based on sales for all markets). Instead, you sort Market,Product by sales. Then, when you partition on Market, the products within each of those Market partitions are still sorted by the sales within that market. Play around with it to see the difference.

My notes:
Addressing is the compute along/over/by.
Partitioning is the “grouping” or “scope” of the calculation – to some degree defined by the “context of the calculation, since All dimensions not part of addressing are used for partitioning. Also includes dimensions (and discrete pills) on Level of Detail shelf (and Color shelf too?)
‘Restarting every’ moves the given field and those above it in the Advanced… dialog to be partitioning.
Example of running sum of sales by product across several years:
product is partition, each new product creates a new returned value of sum of sales
date is address, sum of sales for each new date is added to the returned value for that partition
Where I’ve been confused (I think) is that there are three things in table calcs, not two: partitioning, addressing, and what is being calculated

More notes from convo with Joe on 3/2012

I don’t want to misrepresent Joe’s awesome knowledge here, if there are any mistakes in these notes from our conversation they are almost certainly mine.

Table (Across), etc. do a visual sort base on the layout of the view. The results from Table (Across), etc. can sometimes be the same as what you can get by using Advanced Compute Using, they aren’t always. For example, you can have a complex sort set up in the view and Table (Across), etc. will work just fine, but you can’t duplicate that sort using Advanced Compute Using because Advanced Compute Using creates a set of the fields in the Compute Using and then sorts that set along whatever is set in the Order Along.

This can be confusing because the English language description Tableau gives when you hover over the pill in the view can be the same for a Table (Across) and an Advanced Compute Using(), but the results will be different.

~~~Example using Superstore Sales:
Put Container on Rows, Category on Color, and Sales on Columns. Create an INDEX() function and see what it does try to duplicate that using an Advanced Compute Using

Show Me can be used to “freeze” a setup, typically by Table (Across) or Table (Down). Can be used to get to a view that you can’t get w/Advanced…Compute Using. “Table (Down) is magic dust with Show Me.” ~~~Need an example here, probably one with a funky INDEX().

Roll Your Own Index

IF ATTR([Field]) = LOOKUP(ATTR([Field],-1) THEN
 PREVIOUS_VALUE(0)
ELSE
 PREVIOUS_VALUE(0)+1
END

This creates a rank that shows ties, or can be used to ignore something in a table calc.

~~~Example:
Rows: Roll Your Own Index (set Compute Using to Container), Container (sorted on Sum(Sales) Asc) Color Shelf: Category

This allows the index to be computed on container but ignore category for sort and partitioning. This would be a nice feature.

Padding

Using dimensions on both Rows and Columns shelves causes Tableau to pad, while dimensions only on Rows doesn’t cause Tableau to pad.

~~~Put Container on Rows, Category on Cols, see what INDEX() does Put Container on Rows, Category on Rows, see what INDEX() does – will see different marks used now Compute Using set to Across then Down vs. Down.

next example: Put Customer, Order Date on Rows. Index() on Text Index on Customer – ranks them Index on Order Date – causes padding, Tableau takes a really long time to return data. If you use MONTH(Order Date) it doesn’t take as long.

the key issues w/padding are:
Aggregate vs. dimension
continuous vs. discrete
datepart vs. datetrunc

Some details w/display can get lost when using Primary/Secondary sources. example: duplicate Superstore sales, when Customer from both is in the Customer from secondary does not show ATTR(), i.e. doesn’t show that it’s an aggregate. (thought here is to outline the pills)

How Joe makes table calcs work

1. put every dimension on Rows shelf
2. Measure Names as Columns, Measure Values as text 3. Do addressing on the right-most pill

This way, can turn on totals and subtotals to check what they return vis-a-vis the table calcs. Also can see the partitioning and padding. Using a full-blown cross tab makes it more confusing. Also allows filters using table calcs to be tested since all dimensions are available on the Rows shelf.

Filtering Notes

Table calc filters can only see what’s on Columns and Rows shelves, and/or see their own instance on the Level of Detail, so if needed fields for the calc aren’t on Columns or Rows you can put an instance of the table calc on the Level of detail shelf.

Another advantage to using Joe’s method for making table calcs work is that table calc filters can be tested as well.

Aggregated/continuous pills can go on Filter shelf Aggregated/discrete pills can’t go on Filter shelf Table calc/continuous or discrete pills can go on Filter shelf

Example of this
Create SUM(1) calculated field, call it “Test Agg”. As Continuous, it can go on Filter shelf. As Discrete, can’t go on filter shelf. Create another calculated field LOOKUP([Test Agg],0). As Continuous or Discrete, it can go on the Filter shelf.

(end of Joe notes)

More on At the Level and Restarting Every (Alex Kerin question leading to Ross Bunker post)

http://community.tableausoftware.com/message/178239#178239

 

Table calc – result only needed for one row in partition:

From http://www.clearlyandsimply.com/clearly_and_simply/2011/01/another-look-at-site-catchment-analysis-with-tableau-6-part-3.html

IF (FIRST()==0) THEN
   WINDOW_SUM(SUM([Population]), 0, IIF(FIRST()==0, LAST(), 0)) 
END

Don’t try this with TOTAL(), it can mess things up

Table calc – result needed for all rows in partition (same answer for each case)

From http://www.clearlyandsimply.com/clearly_and_simply/2011/01/another-look-at-site-catchment-analysis-with-tableau-6-part-3.html

PREVIOUS_VALUE(WINDOW_SUM(SUM([foo]), 0, IIF(FIRST()==0, LAST(), 0)))

Probably shouldn’t try this with TOTAL() either
TOTAL() hits DB while WINDOW_SUM() does not, would probably affect order of filters? TOTAL() causes any enclosed aggregate functions to be evaluated over each partition – but I’m not really sure why “Table Across” behaves differently to
WINDOW_XXX() functions.

Uncheck Ignore in Table Calculations so discrete field can be used in a table calc:
http://community.tableausoftware.com/message/173434#173434

Idea for Table Calcs:
Duplicate a dimension and put the copy on the LoD shelf so calcs can be along that, but summed overall.
http://community.tableausoftware.com/message/174254#174254

Index (roll your own)
From: http://www.tableausoftware.com/support/forum/topic/ignoring-order-along-advanced-table-calc#comment-44787
Based on: http://www.tableausoftware.com/support/forum/topic/top-x-shows-more-x-values-when-using-color-shelf

IF LOOKUP(MIN([Product]),-1)=MIN([Product]) THEN 
   PREVIOUS_VALUE(0)
ELSE
   PREVIOUS_VALUE(0)+1
END

Discussion of this and partitioning (Ross Bunker)
http://community.tableausoftware.com/message/183504#183504

Using PREVIOUS_VALUE to combine multiple rows into one
Example here is to make a list of names like Jane, John, Joe, etc. from multiple rows.
http://community.tableausoftware.com/thread/118259

TODAY() and NOW() generate Internal Expression Error inside table calc
As of Tableau v7, the following formula generates an “Internal Expression Error: Function ‘TODAY’ is not defined in the current context” error.
WINDOW_SUM(IF ATTR([Order Date]) >= TODAY()-29 THEN WINDOW_SUM(SUM([Sales])) END)
The same happens when NOW() is used instead of TODAY(). The solution is to make a calculated field with NOW() or TODAY(), and then use the calculated field in the function.

 

Given an index to another row, a formula to get an offset to that row from the current row, to use in a LOOKUP() statement:

FIRST()+[the index] -1

Improving table calc partitioning & addressing
http://community.tableausoftware.com/ideas/1189

[loop category=”wikicontent” tag=”tc,tablecalcs,table-calculations”]
[field title] – Added [field date]
[content]
[/loop]

Related posts:

    [loop tag=”densification,address,addressing,tc,table-calcs,table-calculations,aggregation,index,padding,partition,partitioning,PREVIOUS_VALUE(),rank,ranking,at-the-level” exclude=”this” relation=”and” compare=”not” taxonomy=”category” value=”wikicontent”]

  • [field title-link][field thumbnail-link]

[/loop]

Preparing Data

Key bits on migrating from Excel

http://kb.tableausoftware.com/articles/knowledgebase/preparing-excel-files-analysis
http://kb.tableausoftware.com/articles/knowledgebase/addin-reshaping-data-excel

anonymizing data for Tableau

http://community.tableausoftware.com/thread/118843

DATE() and STR() functions behave differently depending on the data source (Excel/text/Access vs. extract)

http://community.tableausoftware.com/message/179624

DATEPARSE() for converting different date formats

DATEPARSE-ing Tips for Successful Speed Dating with Tableau 8.1

 

Joining Vs. Blending

http://www.tableausoftware.com/support/knowledge-base/join-vs-relationship-60
http://www.tableausoftware.com/support/knowledge-base/summary-and-detail-60

When one data source has subcategory & measures, and another has category & subcategory, can use Primary Groups to effectively create the category within the first data source. KB has article on this, search for “creating primary groups from a secondary source”.

Data blending and getting aggregations right:

Workbook with three options (extract, control data relationships for blend, custom SQL) by Jeff Mills:
http://community.tableausoftware.com/message/183888#183888
http://community.tableausoftware.com/thread/118319

Creating a Tableau Data Source .tds for connecting to just the database and not the table by Craig Bloodworth:
http://www.theinformationlab.co.uk/2012/11/26/connect-to-a-database-without-specifying-a-table-video/

Outer joining multiple Excel tables
https://www.interworks.com/blogs/tmccullough/2013/06/26/outer-joining-multiple-excel-tables-tableau-8

Custom datasource config files:
http://community.tableausoftware.com/message/227764

[loop category=”wikicontent” tag=”preparing,reshaping”]
[field title] – Added [field date]
[content]
[/loop]

Related posts:

    [loop tag=”preparing,reshaping,anonymizing,blinding,reshape,reshaper” exclude=”this” relation=”and” compare=”not” taxonomy=”category” value=”wikicontent”]

  • [field title-link][field thumbnail-link]

[/loop]

Partitioning by Table Calculations

Here’s a issue on the Tableau forums that shows up at least once per month or so: someone has a data set that they want to compute a measure over, and the measure requires table calculations. Then they want to use the results of those calculations as a discrete value for partitioning other calculations. For example, given a set of student grades, compute each student’s percentile, then show the average of each percentile in the same view. Or a similar idea, only the percentile is based on sum of sales, and you want to show the top 10% and bottom 10% of sales. Continue reading