Bins

Easy bin building:

– select a measure
– click on Show Me!
– pick the histogram

Tableau tries to create roughly 10 bins, adds a dimension for the bin to the dimensions list
can change bin size

Alternate for Bins (allows for reference lines on a histogram):

– Create a calculated field such as: INT([Sales]/100)*100-IIF([Sales]<0,100,0)
– Put Number of Records on Columns, calculated field on rows.
– Then can have calculated fields like: TOTAL(AVG([Sales])), TOTAL(MEDIAN([Sales])), etc.
– Will need to set Compute using to the calculated field.

Example here: https://www.tableausoftware.com/support/forum/topic/reference-line-over-distribution-graph

Some stuff on bins & blends (might need this to pull off getting the percentile)
http://reports4u.co.uk/2011/08/17/tableau-create-bins-from-a-measure-calculated-field-pt-2/
http://www.tableausoftware.com/support/forum/topic/creating-bin-dimension-aggregated-calculated-field
http://www.tableausoftware.com/support/forum/topic/aggregating-aggregate-histograms

From Joe Mako:

BYO Bin calc:

INT([Value]/[Bin Size])*[Bin Size]-IIF([Value]<0,[Bin Size],0)

Manually choosing the number of bins/binning a dimension by a measure:

http://kb.tableausoftware.com/articles/knowledgebase/numberofbins

[loop category=”wikicontent” tag=”bins,bin,histogram,histograms”]
[field title] – Added [field date]
[content]
[/loop]

Related posts:

    [loop tag=”bins,bin,histogram,histograms” exclude=”this” relation=”and” compare=”not” taxonomy=”category” value=”wikicontent”]

  • [field title-link][field thumbnail-link]

[/loop]

9 thoughts on “Bins

  1. Lari

    That’s fantastic! Is there any way that you’re aware of to tie in that BYO Bin calc to a range label for the bins? (i.e., to make the axis say 0-4, 5-9, 10-14 instead of 0, 5, 10) If you use a parameter to let the reader update the bin definitions, manually editing the alias of the bin label into ranges doesn’t work with new bin definitions.

    Reply
  2. Jonathan Drummey Post author

    Hi Lari,

    You could do something like this to make the label dynamic:

    STR([BYO Bin]) + “-” + STR([BYO Bin]+[Bin Size]-1)

    Reply
  3. DK

    Hi,

    Am looking for a solution related to bins but not sure if it is anywhere related to what is mentioned above. My requirement is to create a Stacked column graph showing Sales and no of customers per deal size cohort. The deal size is 0-3k, 3-10k, 10-40k, 40-100k and 100+

    Any help on above is appreciated.

    Reply
    1. Jonathan Drummey Post author

      Hi DK,

      It depends on whether the deal size is a record-level value or aggregate. If it’s record level then the calc would be something like

      IF [dealsize] > 100000 THEN
          '100+'
      ELSEIF [dealsize] > 40000 THEN
          '40-100k'
      [skipping...]
      ELSEIF [dealsize] >=0 THEN
          '0-3k'
      ELSE
          'no deal size established'
      END

      If the deal size is an aggregate measure (let’s say by customer) then you’d replace [dealsize] in the above with something like {FIXED [CustomerID] : SUM([dealsize])}.

      Jonathan

      Reply
  4. RM

    Hi Jonathan, my colleague and I are both pretty new to Tableau, and we are hoping you might be able to point us in the right direction with an issue we’re dealing with.

    Is it possible to set a minimum lower limit for computing a category/set/bin/group (basically any grouped data) in a visualization? We need this to be at the page level, so no filtering/drill-down on the page falls below that minimum size. We’d also like for that process of creating bins or hierarchies in data to roll up until we’ve met a minimum group size and maybe just call the smallest group “Other”. Do you know if this kind of constraint is possible in Tableau? We work with protected health data and don’t want to have bins small enough it could be possible to identify someone in a small bar/bin/set/hierarchy, etc. My colleague thinks we might have to do this in the database, but I’m wondering if this could be possible with an LOD?

    Thanks so much for your thoughts on this. Any ideas/links would be much appreciated.

    Rachel

    Reply
    1. Jonathan Drummey Post author

      Hi Rachel,

      There are two different use cases here:

      • Don’t show any X that have/are less than N
      • Bin any X that are less than N into Other

      I can provide pretty straightforward answers for the variations on the first use case, the second would require a little more information from you. The way I think of the first use case is “don’t show any X that have/are less than N”, and below I’ve written about the two most common examples I see and I put them together in a Tableau Public workbook: Tableau Public

      Don’t show any X that have/are less than N

      In this case I’ll do this with a pill on the Filters Shelf, either a regular aggregate pill or a table calculation. The The basic idea is that the viz is at a given level of detail (the vizLOD) and then we want to filter out results at a certain level of detail, here are the two most common examples, I put them together in a Tableau Public workbook: Tableau Public

      vizLOD Does not Include the Filter Criteria Dimension(s) – Using a Regular Aggregate

      The use case here with Superstore is wanting to display customers while filtering out any customers with less than 3 orders. In a healthcare setting this might be a view looking at primary diagnoses and patients where we’d want to filter out any diagnosis with

      In this view the vizLOD is Customer ID, and given that the way we can count Orders is using the regular aggregate COUNTD([Order ID]). We can’t use SUM([Number of Records]) in this case because the grain of the Superstore data is at the order item level so SUM([Number of Records]) would be incorrectly counting order items and not orders. So given that all we need to do is put COUNTD([Order ID]) on the Filters shelf with a minimum of 3.

      I did a demo of this in the “Remove Customers in Segments” sheet in the linked workbook.

      vizLOD Includes the Filter Criteria Dimension(s) – Using a Table Calculation

      In this use case with Superstore we’re wanting to display Segments and Customers while filtering out any Segments with less than 3 customers. In healthcare this could be a view with a survey segment and blinded per-patient info where even after blinding if there are less than three patients we don’t want to display that survey segment.

      In this view the vizLOD is Segment and Customer ID so we can’t use a regular aggregate to count customers because we need the count to happen at the level of Segment. The fastest calculation to use in this case case is SIZE(), this is a table calculation that returns the number of marks/addresses/rows in the partition. So with a Compute Using on Customer ID it’s returning the number of customers for each Segment, then we can put that on the Filters shelf with a minimum of 3.

      You can see an example of this in the “Remove Customers for Orders” sheet in the attached.

      Bin any X that are less than N into Other

      The second major use case is where we’re trying to roll up X values into a minimum bin size and display that as Other. There are multiple ways to interpret this set of requirements that lead to different techniques to get to the desired view based on the combination of the vizLOD, the criteria, and how/where the N-size is determined.

      I’ll use an example of Practices, Physicians, and Patients to write out the questions, if you can give me specifics I’d be glad to put together an example when I can.

      1) The first question for this use case is what are you trying to display? For example are you showing just Practices, Practices and Physicians, or Practices & Physicians & Patients?

      2) The second question is what are the criteria? For example is it to only show Practices with >N Physicians, Practices with >N Patients, Physicians with >N Patients, etc.?

      3) There are two ways we can interpret the N-size for how the smaller groups are bundled into Other.

      a) Is the N-size fixed at a given (lower) level? For example any Physician with <=N patients should be rolled up into Other.

      b) Is there a roll-up of the lower level until some N-size is met? This is usually defined as a percentage or percentile. For example the minimum N is 5th %ile, so any Physician with below 5th %ile of patients should be rolled up into other. Depending on the answers to 1) and 2) this can be the most complicated to solve.

      Reply
  5. RM

    Hi Jonathan, Thank you so much for this very detailed response! I think I’ve almost got it.

    The first Use Case: “Don’t show any X that have/are less than N” nails it. I think that should cover my basic worry was that for example: We have patients with claims over a certain amount, and that bucket gets to be less than the HIPAA limit. Or we have a diagnosis that less than a certain number of patients has been given, and the bucket gets too small again. As long as SEGMENT (in this case whatever category we are talking about be it demographics, disease profiles, risk scores, diagnoses, average monthly claims, etc. is so small that it would be possible to identify a patient, we have to mask it from view.

    Quick question: Is it possible to limit a Tableau a Filter so that it’s filtering by that filter but only the workbook creater/editors can see that filter? In the case of your example, we’d have to hide ‘size()’ so someone couldn’t remove it from the view.

    In the case of the second use case, I first thought I was going to say “either’, but the more I think about it, I think the case is really B. We need to roll up the lower level until some N-size for patients is met.

    How would you suggest this?

    The key is that we can never go less than a certain count of N patients. In our case, For example, we can’t go below X number of patients with a certain diagnosis, X number of patients with total claims above a threshold, X number of male patients, older patients, etc. In the case of an employer, they might want to break down their medical claims across insurance plans by various demographics and conditions, but they would be restricted from displaying that info if the number of patients were low enough that someone could figure out which employees they were.

    I believe we have to stick with Count, because sometimes the % would get below that count. Statistically speaking the ability to roll up to the percentile is intriguing, though. I can think of other uses for that with modeling.

    Thanks so much for all your help! We really appreciate it.

    Rachel

    Reply
    1. Jonathan Drummey Post author

      Hi Rachel,

      Sorry for the delay, your reply somehow ended up in my spam filter.

      re: the Quick question: If the Filter is on the Filters Shelf and no Quick Filter is displayed then only users who have edit permissions would be able to change it. So regular Tableau Server users and Tableau Reader users wouldn’t see the filter at all.

      As for choosing B that’s the tricky one because you’d need to create a running count to set the threshold and that can only be done with table calculations in Tableau, and/but doing that would require that whatever dimension is being counted (like the patient ID) to be on the level of detail of the viz, which means that all other calculations need to be table calculations and doing things like counts gets a lot more complicated. It’s do-able, but tricky.

      Jonathan

      Reply
  6. Pingback: Bin There, Done That: A Discussion of Bins in Tableau | Wirandle | Wirandle By ? L Á Y ? M Í

Please add your thoughts and perspectives