#data14 – Start Your Blenders! | Drawing with Numbers

I’m going to plug some sessions for the 2014 Tableau Conference. If you want to promote yourself, please add a comment below!

Getting your blend on has never been easier

At the 2014 Tableau Conference there’s a whole track’s worth of sessions on data blending by some fabulous folks. My comments follow each listing.

Mix It Up: Data Blending Basics by Alex Woodcock of Tableau. Beginner, Wednesday 10:45am-1:00pm and Thursday, 10:45am-1:00pm. If you’ve never blended before, this is the class for you.
What’s In Your Blender by Charles Schaefer and Kelly Hotta of Tableau. Advanced, Tuesday, 11:15am-12:15pm. Tips and tricks for data blending.
Jedi Calculation Techniques by Bethany Lyons and Alan Eldridge of Tableau. Jedi, Tuesday, 11:15am-1:30pm, also Wednesday, 3:30-6pm. Covers when blending might be used, among lots of other non-blending topics in the 2hr session.
Become a Mix Master with Data Blending by Bethany Lyons of Tableau. Jedi, Tuesday, 2:30-3:30pm and Wednesday, 10:45-11:45am. Bethany gave this presentation at the London conference; it covers how blending works in more detail.
Mix and Match Your Data: Advanced Data Blending by Alex Woodcock of Tableau. Advanced, Tuesday, 2:30pm-5:00pm and Wednesday, 3:30-6pm. 2hr training to bootstrap yourself from basic to more advanced knowledge of data blending.
Flowing with Tableau by Joe Mako (Tableau guru to the gurus). Jedi, Wednesday, 12-1pm. See how Joe approaches Tableau and conceives of the solutions that he does. He gave a similar talk in California this summer.
Extreme Data Blending by Jonathan Drummey (yours truly). Jedi, Wednesday, 3:30-4:30pm. See below.

I’m energized about all of these sessions, especially Joe Mako’s. It’s not so much tips and tricks as how to “think Tableau” and work with the software. I’ve used the metaphor of a structured poem before: when writing something like a sonnet we have certain conventions to follow, and as long as we do we can have lovely results. The same goes for Tableau in how we structure the data and use the different features and functions in the software.

My own session on Extreme Data Blending mashes up South Park and Frozen in a deep dive into how data blending works. I’m excited to share what I’ve learned, especially that every single odd, strange, or seemingly broken result of data blending actually has a logic and reasoning behind it that can be understood, explained, and even made use of. If you’re new to data blending and want to attend my session, I suggest you go to one of the other sessions to get grounded in blending behaviors. If you’ve been using blending already, I promise you’ll learn something new, though if you’ve read all of my posts on data blending then some of the use cases will be familiar. If you’re already a Tableau Jedi, you’ll like this session because I’ve purposely created it to start out with a review of known territory, and then we’re going Jedi++.

A few other sessions and meetups that I’d like to plug are:

First timers and conference newbies – Emily Kund and Matt Francis (they host the one and only Tableau Wannabe Podcast, totally worth a listen) are hosting a conference orientation session Monday at 4pm, before the welcome reception. They then repeat that First Timers’ Field Guide session Tuesday at 11:15am.
Tableau Community Meetup – Wednesday, 12:30-2pm in Community Alley. Here’s your chance to meet, in real life, Tracy, Patrick, and Jordan, who run the Tableau forums, along with assorted other forum helpers.
Meet the Tableau Zen Masters – Besides my session, the one place you can definitely find me (ok, besides stalking Neil deGrasse Tyson and Hans Rosling for selfies) is here on Wednesday at 6pm, though I’m not sure yet where “here” will be.
Women in Data Meetup – Jenn Day and Anya A’Hearn are hosting this meetup on Tuesday at 12:45pm in the University Room at the Sheraton. I think it’s fantastic that Jenn & Anya are hosting this and that Tableau is supporting the meetup. See #womenindata for more on the topic. We all need to find our tribes; maybe this is yours!

See you in Seattle!

13 thoughts on “#data14 – Start Your Blenders!”

Matt Lutton August 29, 2014 at 2:29 pm

This is fantastic – thanks for sharing Jonathan!

Goodwill Education Initiatives will be presenting at 10:30am on Wednesday 9/10/14 on the topic of “How Goodwill is Using Tableau to Rethink Data in Education”:

http://tcc14.tableauconference.com/schedule/wednesday#session-786

This session is not intended as a learning opportunity for technical Tableau tips and tricks, but rather as a “Customer Story”—an overview of what we’ve done well, and how we’ve progressed since adopting the tool. There will be some demo dashboards shown, so users in education may be interested in seeing the types of dashboards we have been creating for various types of end-users.

Matt Lutton August 29, 2014 at 2:31 pm

My mistake… that should be 10:45-11:45am on September 10th.

Xan Gregg August 31, 2014 at 3:58 pm

Hi Jonathan, Your first link has the wrong domain name for the conference site. And the link on your name is dead: http://drawingwithnumbers.artisart.org/author/JonathanDrummey/, so I wasn’t sure how else to notify you. Delete this comment when corrected.

Jonathan Drummey (Post author) September 25, 2014 at 6:12 am

Hi, I’m just catching up here post-conference. I updated the link to tc14. I can’t find which link on my name you were referring to. Can you point me at it?

Andy Peters September 15, 2014 at 12:53 pm

Thanks so much for your session at the conference! It’s helping me debug a long-standing blending problem even as I type!

Jonathan Drummey (Post author) September 25, 2014 at 5:57 am

You’re welcome!

Elizabeth Wallace September 29, 2014 at 1:08 pm

Hi Jonathan…. I enjoyed your session at the Tableau conference (Extreme Data Blending). I see a video on their site, but was wondering if you could post the slides somewhere accessible (or email them to me). They would be a great reference!

Jonathan Drummey (Post author) October 1, 2014 at 5:45 am

Hi Elizabeth, I’ll be posting them (and my workbook) in the next week.

Bill Lyons February 20, 2015 at 8:12 pm

I just watched your “Extreme Data Blending” video from TC14. Excellent work! Glad I could keep backing it up because it was like drinking from a fire hose, especially at the end!

At approx. 1:03:30 into the video, you got a question that I could not hear, but the answer seems to be related to performance with a blend. You said it depends on the “granularity of the dimensions you are blending on.” You said that if it were 5 values, it would be “really fast,” and if it were 500,000 values it would be “really slow.”

Can you clarify this for me? I understood “granularity” to indicate the number of fields that are being “linked” between the data connections, or the number of dimensions in the view. Is that the same as your use of the term “values?” I guess the big question, which came up in a Forum post today, is whether the size of the data set is a significant consideration for whether a self-blend will perform well. I had heard before that duplicating a data connection does not duplicate the data. But in your presentation, it appears that at the final level, there may be a larger data set.

Thank you so much for your help!

bl

Jonathan Drummey (Post author) February 21, 2015 at 8:02 am

Hi Bill,

“Granularity” is a bit of a fuzzy concept (at least in my mind) because it encompasses the fields, the relationships between the fields, and the relative sparseness of the domains (sets of values) between the fields. The more accurate term in this case is “cardinality”, which refers to the number of elements in a set, in this case the # of distinct values of the linking dimension(s). So if the cardinality of the blend is 5 values, the blend should be fast, but if the cardinality were 500K values then Tableau has to do a lot more blending operations and that will be slow. This is separate from the “size of the data set”, which I assume to mean the number of records. Since the data blend occurs *after* aggregation, the performance questions have more to do with the granularity of the blend (i.e. the linking dimensions and their cardinality) vis-a-vis the granularity of the view (i.e. the dimensions in the view and their cardinality), along with the complexity of any aggregations and calculated fields. For example, I might have a massive database but only be showing the total purchased per region for a month; in that case the granularity of the view is the # of regions. A tuned billion-row database blending on 5 values could very well be faster than a 10K-row Excel file blending on all 10K values.

Does that answer your question?

Jonathan
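
To make the cardinality point concrete, here is a minimal Python sketch of the idea that a blend happens after aggregation. It is a simplified model and not how Tableau is implemented: the `aggregate` and `blend` helpers, the field names, and the sample data are all hypothetical, and the blend is assumed to behave like a left join from the primary source on the linking dimension.

```python
from collections import defaultdict


def aggregate(rows, key, measure):
    """Sum `measure` for each distinct value of `key` in one pass over the rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[measure]
    return dict(totals)


def blend(primary_rows, secondary_rows, link, primary_measure, secondary_measure):
    """Aggregate each source to the linking dimension, then left-join the results."""
    primary = aggregate(primary_rows, link, primary_measure)
    secondary = aggregate(secondary_rows, link, secondary_measure)
    # The join does one lookup per distinct linking value in the primary,
    # no matter how many raw rows were aggregated to get there.
    return {value: (total, secondary.get(value)) for value, total in primary.items()}


# Hypothetical data: 100,000 sales rows, but only 5 distinct regions.
regions = ["East", "West", "North", "South", "Central"]
sales = [{"region": regions[i % 5], "amount": 1.0} for i in range(100_000)]
# The secondary source has no row for "Central", so that region gets None.
targets = [{"region": r, "target": 25_000.0} for r in regions if r != "Central"]

blended = blend(sales, targets, "region", "amount", "target")
print(len(sales), "rows ->", len(blended), "blended values")
print(blended["East"], blended["Central"])
```

In this toy model the join step only ever sees 5 keys, even though 100,000 rows went into the aggregation. Swap `region` for a linking field with 500K distinct values and the join itself has 500K entries to match, which is the part that gets slow.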

Bill Lyons February 21, 2015 at 6:21 pm

Considering your renowned expertise, seeing you say “granularity is a fuzzy concept” is very reassuring to me! Nice to know that you are human, and that it is challenging even for experts!

Your response does help considerably. I’m understanding more, little by little.

If a particular problem can be solved by either a table calculation (or combination thereof) or a self-blend, do you have any data, or even a gut feel, indicating which is likely to perform better, and what factors might affect it?

Thank you again for your help, time and patience!

bl

Jonathan Drummey (Post author) March 6, 2015 at 7:31 am

Hi Bill,

Like many things in Tableau, the answer is, “it depends”. Major factors include:

– speed of the data source(s)
– speed of your Tableau Server
– speed of your network
– how many query results there are
– how many marks you ultimately need
– complexity of the calculations
– whether there are string manipulations
– complexity of the data blend
– cardinality of the blend, i.e. the number of distinct blended results

Table calcs can often be faster because there are fewer queries to the data source(s), but if you’ve got a view where there need to be 100K query results for a table calc to operate over and you’ve got a slow network, then moving those calcs into a blended source might well be faster. Or if you need to do some complicated string manipulation, then the highest Tableau performance might be achieved by offloading all of those calcs into your data source. Like I said, it depends!

Bill Lyons March 6, 2015 at 12:51 pm

I understand. Thank you for your time and knowledge.

