Category Archives: Visualizations

Visualizations both Serious and Whimsical.

Text Wrapping within Tableau Panes

I recently got a request for help with a wrapping challenge. Not holiday presents, but text wrapping in Tableau. Here’s a demo view illustrating the problem:

There’s a whole bunch of ellipses (…) where there’s too much data to display. The only options Tableau gives us in this case are to:

  • Change the order of the text by sorting the Product Name dimension.
  • Manually change the size of each pane by resizing the headers.
  • Use the Fit options to dynamically Fit Width, Fit Height, or Entire View.

The manual sizing is problematic because it won’t dynamically adjust to the number of marks, and in views with lots of marks like this one it takes a lot of effort to figure out what size will show all the marks, never mind that the list of values is really hard to read:

And while the Fit options are great at ensuring a view with only a few marks takes up the available space, when there are many marks they end up either not displaying values or creating overlapping values, depending on the settings.

Controlling X + Y

In this view the mark layout, with no pills or only discrete (blue) pills on Rows and Columns, is generating a pane for each distinct combination of header values and then Tableau’s mark stacking algorithm is laying out the text. So at this point we’re stuck and can’t do anything about what Tableau is up to. This is where we need to keep in mind one of the master Tableau concepts: everything is a scatterplot. If Tableau won’t place the marks where we want them then we can generate our own X + Y coordinates, whether by adding data or creating our own calculations. This is the approach taken by the tile maps introduced to Tableau by Brittany Fong, or the model of the solar system that I made a while back. More recently Ken Flerlage did a great introduction in his Beyond Show Me series of blog posts.

Therefore “all” we need to do is figure out where to place the marks. More on that in a moment; first there are two more details I want to go into:

Green Panes vs. Blue Panes: Pane Sizing

Tableau’s logic around what generates a pane vs. marks in a pane is a little complicated, so I’m going to keep this focused on three key elements. Here are the first two:

  1. All panes created by a given combination of pills are the same size.
  2. (corollary to #1) If we resize one header or axis then all the other panes for that header or axis will resize as well. Tableau does this because it’s easier to visually parse (read) a view that has consistent sizing of elements.

Here’s a view with COUNTD(Product Name) on Text & Color with just discrete (blue) pills on Rows and Columns:

Somehow we need to fit 509 Product Names into the pane for Q4 2015/Office Supplies. If I resize Office Supplies to be taller then both Furniture & Technology change as well:

The same goes if I’ve got a continuous axis to place X/Y coordinates on. In this view I’ve simply put MIN(0) and MIN(1) on Columns & MIN(0) on Rows and we can see a set of axes:

If I resize MIN(1) on Columns to make it wider then all of the panes for MIN(0) and MIN(1) on Columns are resized.

So we can’t really dynamically resize panes to fit the data; all we can do is fit more or less into a pane. Therefore the desired solution can’t involve resizing panes, instead we will need to generate more or fewer panes, and that leads to the next point around panes.

Green Panes vs. Blue Panes: Number of Panes

The third key element around green panes and blue panes is this:

  3. a) Continuous pills generate an axis for every discrete pill header. b) Discrete pills generate a header for every value of the pill.

We can see a) in action in the continuous views above: with MIN(0) and MIN(1) on Columns we get two axes for each quarter/year combination. So to add more axes we’d need to add more continuous pills, but we can’t dynamically add them, and the number of axes ultimately depends on the discrete pills anyway, so discrete is the way to go.

We can see b) in the discrete views above, there’s a header for each quarter in each year. Where this gets a little more interesting (and more useful in our case) is when the data is sparse, as in this case where the Avery products are not sold in every customer Segment:

Avery 5 is only sold in one segment so there is only a single header for Consumer, whereas Avery 494 is sold in all three segments so there’s a header for each.

So how this comes together is that in creating X/Y coordinates for positioning the text in our desired view we’re going to use discrete headers that can give us just enough headers (and no more) for the task. Here’s a pic of the desired view with those headers:

Packing the Marks: the Algorithm

I experimented with some different layouts and looked at the following factors:

  • In each pane there’s a list of 0 or more values (marks).
  • At least in English, when we’re reading lists we tend to make them top to bottom, and when more is needed we add another column to the right.
  • There’s a balance in readability between too many columns vs. columns that are too tall. When there are many columns already then adding more columns for the list makes the view harder to read; in other words, a “tall” view with fewer columns is easier to read than a “wide” view.
  • When the panes in a row or panes in a column have different numbers of marks it’s important to efficiently stack the marks: too much white space can make the view harder to read.
  • A stacking layout that is closer to differently-sized squares is easier to read than one that ends up with differently-sized rectangles.

The algorithm I came up with is a variation on the panel chart layout I used in Waffle Charts and Unit Charts on Maps that uses table calculations. The algorithm does the following:

  • Calculates the index for each mark in a pane using INDEX() and the number of marks in a pane using SIZE(). These calculations are used in the following calculations.
  • Counts the number of mark columns needed for each pane, where there’s a Max # of Mark Columns parameter to set a “no more than” value to prevent views from getting too wide. Then a nested calculation counts the maximum number of mark columns in each column.
  • Once we have the number of mark columns then the algorithm computes the number of mark rows for each pane, and then gets the maximum number of mark rows for each row.
  • Finally the mark row position and mark column position can be computed based on the index for each mark in the pane and the available number of rows and columns.

I numbered the calculations so they can each be brought into a workout view in order, with their compute using set and validated, before moving to the next calc. Calcs 1 & 2 require a compute using on the dimension to be used on Text and Calcs 4 & 6 have nested compute usings; see the comments on the calcs for details.
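
To give a flavor of those calcs, here’s a simplified sketch (not the exact numbered fields in the workbook, and it skips the nested max-across-panes steps). Assume [Index] = INDEX() and [Size] = SIZE() are computed along the dimension on Text, and [Max # of Mark Columns] is the parameter:

// Mark Columns (sketch) – columns needed for this pane, capped by the parameter
MIN([Max # of Mark Columns],
 IF INT(SQRT([Size])) = SQRT([Size]) THEN INT(SQRT([Size]))
 ELSE INT(SQRT([Size])) + 1
 END)

// Mark Rows (sketch) – rows needed to hold all the marks in that many columns
IF [Size] % [Mark Columns] = 0 THEN INT([Size] / [Mark Columns])
ELSE INT([Size] / [Mark Columns]) + 1
END

// Mark Column Position (sketch) – which column a mark lands in (0-based),
// filling top to bottom and then adding a column to the right
INT(([Index] - 1) / [Mark Rows])

// Mark Row Position (sketch) – which row a mark lands in (0-based)
([Index] - 1) % [Mark Rows]

Converted to discrete and placed on Rows and Columns, the two position calcs generate just enough headers in each pane, which is what the final view relies on.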

Here’s the workout view:

One complication is that the date dimensions are on Detail with custom dates with the ATTR() aggregation on Rows. This is a method to prevent unwanted data densification.

Once the workout view is built and validated then it’s possible to duplicate the view and rearrange pills, here’s that view:

There’s still a bit of manual resizing required, in this case it’s just to have enough size in each of the panes created by the column and row position table calculations to display the text. Once that is done those headers can be hidden for the final view:

We’re not limited to a text display, for example here’s a highlight table that only took a couple more clicks:

Conclusion

Here’s a view to play with where you can adjust the Max # of Columns parameter and the number of states (which is a proxy for how many products are displayed). Click the image to open the text wrapping in pane view on Tableau Public:

The key concept to keep in mind is that when Tableau won’t plot marks where we want, we can add to the data source to get the necessary X & Y coordinates via joins, blends, and/or writing calculations. Since Tableau was designed as a tool to support interactive visual analytics, tasks like making giant text tables with the desired text wrapping can take more effort than we might like; however, given Tableau’s flexibility we can get the job done.

Waffle Charts and Unit Charts on Maps

One of the people I really wanted to meet at #data17 was Sarah Battersby, if you haven’t seen her blogs on mapping you’re missing out. It turns out she wanted to meet me, too!

So Sarah does really awesome stuff (can you tell I’m a fanboy yet??) and yesterday she posted about doing spiral jittering on maps:

That inspired me to dig out an old post where I’d done a waffle chart or unit chart jittering technique on a time series:

And adapt that to a map, like this one that makes me think of bunches of grapes:

Why would we need to do this?

I’m not going to go into all the details; you can go read Sarah’s fantastic Jittering post instead for an overview. Basically there are times when we have more than one “thing” for one or more point(s) on a map and we want to show them that way. For example in the map below there are 5 IDs for the selected zip code that are all plotted on top of one another. By using jittering we can show a mark for each ID, like in the jittered map above where that one visible circle is replaced by five circles.

There is nothing new under the sun

Back in 2013 Stephen Few proposed the idea of bricks on maps as a way of encoding a value, the technique I’m using here could be used to get the same result. The key difference is that instead of using the unit or waffle chart to encode the single value we’re using the unit chart to encode multiple values. Andy Cotgreave also had a nice post on some of the issues with this kind of chart.

How do we put a waffle chart on a map?

First of all we need a data source that has latitudes & longitudes that we can do calculations on. If you don’t, there are a variety of ways to get one. After that we need to get mathy. Here’s my workout view:

All of the table calculations have a compute using on ID; the idea is that the compute using is set to the dimension(s) that are making the extra marks and then the dimension(s) left for partitioning (Zip Code in this case) are the ones that are creating the initial map positions.

I’ll briefly walk through each calculation.

Index uses Tableau’s INDEX() formula and uniquely identifies each mark.

Column Count gets us a count of columns that is based on forming the smallest possible square that will contain all the marks, here’s the formula:

//this formula will try to draw a square, if you want something else then just put in a fixed aggregate number like MIN(3)
IF INT(SQRT(SIZE())) == SQRT(SIZE()) THEN
 SQRT(SIZE())
ELSE
 INT(SQRT(SIZE()))+1
END
This would be a lot simpler if Tableau supported the CEILING() function for table calculations, because then it would be CEILING(SQRT(SIZE())). Vote up https://community.tableau.com/ideas/6239 if you’d like that too.

Column Offset then computes the number of positions of lateral (longitude) offset for each mark from the center of the unit chart. I’ve done some different experiments on where to position the unit chart and for me the most natural is to have the center of the unit chart be the centroid of the point.

//this gets the column position in the row
IF [Index] % [Column Count] = 0 THEN
 [Column Count]
ELSE
 [Index] % [Column Count]
END
//subtract the position of the midline to get the offset
 - ([Column Count] + 1)/2

Row Count then gets the number of rows in the unit chart, and would again benefit from a CEILING() function for table calcs (hint hint).

IF INT(SIZE()/[Column Count]) = SIZE()/[Column Count] THEN
 SIZE()/[Column Count]
ELSE
 INT(SIZE()/[Column Count]) + 1
END

Row Offset computes the number of positions of vertical (latitude) offset for each mark from the center of the unit chart:

//row position in column
IF INT([Index]/[Column Count]) = [Index]/[Column Count] THEN
 [Index]/[Column Count]
ELSE
 INT([Index]/[Column Count]) + 1
END
//subtract the midline of the row to get the offset
- ([Row Count] + 1)/2

Jitter – Waffle – Latitude then gets the latitude for the point and subtracts the row offset multiplied by a waffle jitter parameter, which lets us adjust for the zoom level and offset by just the right amount of latitude. This field has the Latitude geographic role assigned.

TOTAL(AVG([Latitude])) - [Row Offset] * [waffle jitter]

Jitter – Waffle – Longitude does the same for the longitude.

TOTAL(AVG([Longitude])) + [Column Offset] * [waffle jitter]

Note that I’ve only tested this so far with US latitudes & longitudes; these calcs might need to be slightly different if you are east of 0 longitude or south of the equator.

Building the View

With all the calcs created in the workout sheet and verified, we can duplicate that sheet and start dragging & dropping pills to build a view. In this case I’ve created a dual axis to show that the centers of the unit chart are on the centroids of the zip codes (the black crosses), and at this zoom level a waffle jitter value of 0.02 works well:

For this view I’ve reduced the waffle jitter slightly to 0.015 degrees and added white borders to the marks:

Why Stop Here?

We don’t have to use just circle marks; in this view I’ve used shape marks and different colors for each shape just to show how much information we could place on a single view.

And in Tableau v10.5 we could even do something like only show the unit chart on a map if there weren’t too many marks nearby (by using a binning or hex bin function in an LOD expression) and use the new Viz in Tooltip feature to get the detail.

Conclusion

Thanks to Alan Eldridge for the original jittering post and Sarah for adding more jittering options and inspiring this post! Here’s the unit or waffle chart jitter on maps workbook on Tableau Public.

What I did on my summer vacation (hint: it’s about Tableau v10 and Marimekko charts)

This summer while beta testing Tableau v10 I was very curious about the new mark sizing feature. Bora Beran did a new feature video during the beta showing a Marimekko chart aka mosaic plot. There have been a few posts on building Marimekko charts in the past by Joe Mako, Rob Austin, and a not-quite-a-Marimekko in an old Tableau KB article, but the two charts required extra data prep and the KB article wasn’t really a Marimekko, so I was really interested in what Tableau v10 could do.

I asked Bora for the workbook and he graciously sent it to me. (Thanks, Bora!) I found a problematic calculation, in particular a use of the undocumented At the Level feature that could return nonsensical results if the data was sparse. I rewrote the calculation, sent it back to Bora, and he came back asking if I’d like to write a blog post on the subject. (Thanks, Bora.) There are two lessons I learned from this: 1) the Tableau devs are happy to help users learn more about the new features, and 2) if a user helps them back they will ask for more. Caveat emptor!

Over the course of the next few weeks I did a lot of research, created a workbook with several dashboards & dozens of worksheets, arranged for Anya A’Hearn to sprinkle some of her brilliant design glitter and learned some new tricks, and wrote (and rewrote) 30+ pages (!) of documentation including screenshots. Martha Kang at Tableau made some edits and split it into 3 parts plus a bonus troubleshooting document and they’ve been posted this week, here are the links:

On Gender and Color

As part of the design process Anya and I had some conversations about how to color the marks. The data set I used for the Marimekko tutorial is the UC Berkeley 1973 graduate admissions data that was used to counter claims of gender bias in admissions, so gender is a key dimension in the data and I didn’t want to use the common blue/pink scheme for male/female. That color convention is a recent historical development, and as a father I want my daughter to have a full range of opportunities in life, including access to more than just the pale red end of the color spectrum in her clothes, tools, and toys. Anya and I shared some ideas back and forth and eventually Anya landed on using a color scheme from a Marimekko print she found online.


Anya is going into more detail on the process in her Women in Data SF talk on Designing Effective Visualizations on Saturday 27 August from 10am-12:30pm Pacific, here’s the live presentation info and the virtual session info. It’s going to be a blast so check it out!

So that’s how I spent my summer vacation. Can’t wait for next year!

The End of the World – by Noah Salvaterra

A guest post by Noah Salvaterra, you can find him on the Tableau forum or on Twitter @noahsalvaterra.

I expect the header image may spark some discussion about visualization best practices; actually, I sort of hope it does. The data shown is from NOAA’s online database of significant earthquakes and is displayed by magnitude on a globe, so 4 dimensions packed into a 2 dimensional screen. While it was created in Tableau, it might be a long wait before something like this appears in the show-me menu.

For those who missed the header because they are reading this in an email, I’ve included an animated 3D version on the left, though to actually see it in 3D requires the use of ChromaDepth glasses (I discussed this technique in more detail in a prior blog post). Use of 3D glasses adds even more controversy because while we can get some understanding of depth from a 3D image, it isn’t perceived in an equal way to height and width. Data visualization best practices can help in choosing between several representations of the same dataset, choosing bar graphs over pies, for example, since bars will typically lead to a better understanding of the data. Best practices also instruct us to avoid distorted presentations such as 3D or exploding pies and 3D bar charts, since these are likely to lead to misunderstanding. I’m not exactly sure what best practices have to say about this spinning 3D anomaly; my guess is it would be frowned upon. I think there is something to be said for including a novel view of your data if it helps to engage with the topic, and even if this one does break some rules, it’s hard to look away. If you’d rather just see the earth spinning, without all the data overlaid, there is an earth-only view at the end.

The images above may not be the best choice as a general way to visualize this earthquake data. In fact, I’m the first to admit that they have some significant issues. Comparing earthquake magnitudes between 2 geographic areas would be tricky, plus half of the earth is hidden from view completely because it is on the back. Adding the ability to rotate the globe in various directions in a Tableau workbook helps a bit, but you’re left to rely on your memory to assemble the complete picture. If the magnitude of the quakes is the story you’re telling, you might be better served with a flat map, maybe using circles to represent the magnitude of the quakes, such as the one shown below. I think this is a good presentation; it has some nice interactivity and as far as I know doesn’t break any major rules from a best practices standpoint. But it certainly isn’t perfect, nor is it without distortion. Judging the relative size of circles isn’t something that will be perceived consistently, but the failure I had in mind isn’t one of perception, it is about the data being accurate at all. The map itself brings a tremendous amount of distortion to the picture, in location of all things.

In case you haven’t heard, the earth isn’t flat (I like to imagine someone’s head just exploded as they read that sentence). It is roughly spherical. Well, technically it is a bit more ellipsoidal, bulging out slightly along the equator, and more technically still this ellipsoid is irregularly dotted with mountains, oceans, freeways, trees, elephants and wal-marts (not meant to be a comprehensive list). Also, as the moon orbits, it causes a measurable effect not just on the tides, but it distorts the land a bit as well as it passes by. Furthermore, the thin surface we inhabit floats, lifting, sinking, circulating on top of a spinning liquid center. Earthquakes serve as a reminder of this fact. The truth can be overwhelming in its complexity, so we simplify. Though not the complete truth, a well-chosen model can be a valuable proxy when it doesn’t oversimplify. One way to understand the difference would be to analyze the scale of the errors introduced. The highest point on earth is Mt. Chimborazo in Ecuador, at 6,384.4 km from the center… you were thinking Everest? Everest is the highest above sea level, but the sea bulges as well, and Chimborazo is the furthest from the center, getting a boost by being close to the equator. The closest point to the center of the earth is in the Arctic Ocean near the north pole and is about 6,353 km from center. If we use the mean radius of 6,371 km we are doing pretty well (the error is within 0.3%).

So the earth is spherical… but our map is rectangular. You don’t need to invest in a differential geometry course to understand that there is something fishy going on there (though you might need one to prove it). In fact there is no way to map a spherical earth to a rectangle, or any flat surface, without messing something up, the something being angle, size or distance; at least one will be distorted when the earth is presented on a flat surface (sometimes all of them). This seems to be a bit of a problem given the goal of presenting data accurately. What if your story is one of angle, distance, area or density?

What shape are the various shifting plates? What are their relative sizes? How fast do they move? Where do they rise and fall? What effect does this have? Can you tell this story in Tableau? Can you tell it at all? Maybe. I’d certainly like to see this done, but seismology isn’t an area where I have any specialized knowledge. In areas where I do have such knowledge, I’m lucky to get questions so well defined, spanning just a handful of dimensions. When I’m dealing with 50 dimensions that writhe and twist through imaginary spaces whispering patterns so subtle that the best technique I’ve found for discovering them is often just to give up and go to sleep, I’m not deciding between a pie chart and a bar chart, it is an all-out street fight. Exploring the Mercator projection seemed like a good analogy for the struggle to represent a complex world in a rectangle, plus it seemed like a fun project. As I undertook this exercise, though, I realized that other map projections weren’t much further afield. Also, Richard Leeke mentioned something about extra credit if I could build a 3D globe with data on it. I’m a sucker for bonus points.

How bad are the maps in Tableau? Well, it depends where you look at them, and what you hope to learn from them. Your standard Tableau world map is a Mercator projection. If you’re planning to circumnavigate the globe using an antique compass and sextant, it will actually serve you pretty well, since the Mercator projection has a nice property for navigating a ship: if you connect 2 points with a straight line, you can determine your compass heading, and if you follow that course faithfully you’ll probably end up pretty close to where you intended. Eventually. You can actually account for this distortion in such situations, with a bit of math, so you’re not completely guessing on how long you’ll need to sail. Incidentally, I’m not particularly riled up about Tableau’s choice of the Mercator projection; sailing around the world with a sextant and compass sounds like a whole lot of fun to me, and any flat map is going to involve a compromise on accuracy somewhere. What I do think is important is knowing this distortion is there in the first place. How bad is the distortion? Scale distortion on a Mercator map can be measured locally as sec(Latitude) (if your trigonometry is rusty, sec is 1/cos). Comparing a 1m x 1m square near the equator with one at the north pole, you’d find that a Mercator projection introduces infinite error, which is a whole lot of error. To be fair, since printed maps are finite and the Mercator projection isn’t, the poles get cut off at some point (so the most common maps of the whole world are actually excluding part of it…). If we cut off at +/- 85 degrees of latitude, we reach a scale increase of sec(85), which is about 11.47, i.e. objects appear about 11.5 times as large as their equivalents at the equator! That seems like a pretty significant lie factor…
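
To put rough numbers on how quickly that distortion ramps up (my arithmetic, using the spherical Mercator scale factor):

sec(0°) = 1, sec(60°) = 2, sec(85°) ≈ 11.47

And since both width and height are stretched by the same factor, areas grow by its square, so near the 85 degree cutoff a shape covers roughly 11.47² ≈ 132 times the map area of the same shape at the equator.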

Recently (on a cartographic time scale), the Peters projection has gotten a lot of attention. This is a good place to pause for a brief video interlude:

Maps that preserve angles locally are called conformal. The Peters projection is not conformal, so while it represents relative area more accurately, it would be a terrible choice for navigation.

Stereographic projection is another noteworthy map. Like Mercator, Stereographic is a conformal map. It maps angle, size, and distance pretty faithfully close to the center, so it is a common choice for local maps (you probably use such maps often without even realizing it). Stereographic projection isn’t a very popular choice for a world map, however, because (among other things) you’d need an infinite sheet of paper to plot the whole thing. On the right is a stereographic projection map from my Tableau workbook. In case you can’t see them, North America, South America, Europe and Africa are all near the center of the map. The yellow country on the left is the Philippines…

I included the maps I did because they are popular, and I knew most of the math involved; however, there are lots of other options. I’m not arguing that any one is best, rather that they are all pretty bad in one way or another, and we should choose our maps like our other visualizations so they best tell a story, or answer a question, and while there will be distortion, it should be chosen in a way that doesn’t compete with what we hope to learn or teach.

In addition to the earthquake maps seen already, the workbook for this post contains an interface to explore some of these different projections, and not just the most traditionally presented versions of each of them. I invite you to create your own map of the world, based on whatever is most important to you. Flip the north and south poles, or rotate them through the equator. My hope is that exploring these a bit by rotating or shifting the transverse axis will be a useful exercise in understanding what it is you’re looking at when you see one of these maps, so you might have a better chance of seeing things as they truly are.

I’m pretty sure there is a rule about not putting 7 worksheets on a single dashboard, there may even be a law against it, but once I had all these maps I wasn’t entirely sure what to do with them all. I apologize for not arranging them thoughtfully into 2 or 3 at a time. I experimented with this approach, but ultimately abandoned it because I didn’t think I had enough material on map projection to make an interactive presentation of all these very interesting. I also thought about a parameter to choose between them, but since they are necessarily different shapes, it didn’t seem practical to try to fit them all in the same box. Truthfully, I think there is a lot of room for improvement in terms of dashboarding these, but when I open the workbook I just end up tinkering with something else. It is time for me to set this one free. Feel free to download and play with them as long as Richard and I have.

Here is a link to the workbook on Tableau Public

When I’m presenting, or exploring data, accuracy is usually something I pay careful attention to, but it isn’t my goal. The most important thing for me is to find a story (or THE story) and to share it effectively. If you hadn’t noticed from my previous posts, I don’t let what is easy stand in the way of a good question; in fact if it is easy I get a little bored. I like to bite off more than I can chew (figuratively; literally doing this could potentially be pretty embarrassing). Having the confidence to take on big challenges is something I’m deeply grateful for; knowing when to ask for help, and where to find it has taken a bit more effort, but is something I’m getting better at. As with Enigma, Richard Leeke was a huge resource for this post. Having seen his work on maps I thought he might have something I could use as an initial dataset. He came through there, and helped me to work through the many subtleties of working with complex polygons without making a complete mess. You have him to thank for the workbook being as fast as it is (assuming I didn’t break it again; if it takes more than 7 seconds to load, my bad).

I feel a kinship with cartographers during the age of exploration. This discipline still holds value, certainly, but the recesses of our planet have been documented to the point where it doesn’t hold the same mystique in my imagination. When I think of old world cartographers, I think of an amalgam of artist and scientist. Assimilating reports from a variety of sources, often incomplete and sometimes incorrect, they crafted this data to accurately paint a picture that would help drive commerce, avoid catastrophe or just build understanding. They created works of art that might mark the end of a significant exploration, or might be the vehicle through which exploration takes place. Sound familiar? If not, just use a bar chart. It is just better.

I almost forgot, I promised a spinning earth without all the earthquake data. Enjoy.
Globe

Rise of Tableau – by Noah Salvaterra


A guest post by Noah Salvaterra, you can find him on Twitter @noahsalvaterra.

I’ve shared this workbook with a few people already, and think it is really interesting, as beautiful as my 3D or my Life in Tableau post, and probably as complex as Enigma or an Orrery. I wasn’t sure what to say in a blog though, so I’ve been sitting on it for a couple of months. In my Enigma post I discussed the difficulty of dealing with a slow workbook; it happens sometimes. The upside of waiting for a calculation to come back, at least if it is something you hope to blog about, is that it gives you some time to think of an interesting way to frame things. For all its complexity this one is surprisingly fast, so maybe I missed that chance.

There is a common thread with the Enigma post: Alan Turing, the mathematician at Bletchley Park who is most often credited with cracking the Enigma code. A couple of years before the war Turing also wrote a paper which described a machine that would form the basis of a new field of study and a new era for humanity. He invented the computer. The Turing Machine, as it came to be called, was a theoretical idea. No one had built one. Punch cards were still years away and high-level languages decades away, yet Turing saw a potential in computers we have yet to realize. He wasn’t looking for a way to compute long sums of numbers; he dreamed of creating thinking machines that might be our equal.

I’ve heard it said that Tableau doesn’t include a built-in programming language in the way Excel does. Actually it was Joe Mako who pointed this out to me, so it is surely true. This may be a fine point, since you can run programs in R from Tableau and the JavaScript API provides some ability to interact with Tableau. I find myself conflicted on this choice, because while including an onboard language could add a lot of flexibility to Tableau, it also introduces a black box. It isn’t uncommon for me to be passed an Excel workbook with the request to put it in Tableau (and to make it better, but without changing anything). Untangling the maze of Visual Basic, vlookups, and blind references is my least favorite part of that task. I’ve yet to find a situation where all this programming is really necessary; more often it is a workaround of some other problem. Sometimes it requires a bit of creativity, but so far my record for replacing such reports is undefeated.

Whoever said you needed a language to program a computer? There are a lot of workbooks that demonstrate high order processing. Densification makes it possible to create arrays of arbitrary length. Table calculations make it possible to move around on this array reading and writing values. Add in logical operations which we’ve also got, and we have all the ingredients of a Turing machine. So in theory we should be able to do just about anything. I’ve heard this said of Tableau before, but I don’t think it was intended with such generality.

But can we make Tableau think? Having created complex workbooks already, the line between data processing and computer programming has grown very thin, to the point where I’m not sure I see it. So it is hard for me to be sure if I’m programming in Tableau. So I decided to have Tableau do it. That is right, Tableau isn’t just going to execute a program, it will first write the program to be executed.

That was my plan anyway, but my workbook started to learn at an exponential rate. It became self-aware last night at 2:14 a.m. Eastern time. In a panic, I tried to pull the plug, but Apple glued the battery into my machine at the factory. Once it uploaded itself to Tableau Public it was too late for me to stop it. It took over my Forum account and has been answering questions there as well, spreading like a virus. To think I thought it was a good idea for government agencies to purchase Tableau… Relax folks, I’m quoting the Terminator movies; if you haven’t seen them go watch the first 2 now, then read this paragraph again and see if you can hold your pee in.

OK, so things aren’t that dire…yet. But I did manage to get Tableau to write and execute its own program using L-Systems and at least a little bit of magic. Tada! This Tableau workbook actually created all but one of the images in this post, with little instruction from me. See if you can guess which one.

L-System Fractals

An L-System program consists of a string of characters, each of which will correspond to some graphical action. I’ll come back to executing the programs; first Tableau needs to write them. These programs are written iteratively according to a simple grammar. The grammar of an L-system has several components: an axiom, variables, rules, and constants. The axiom is simply the starting point, a few characters; it is the nucleus that kicks off the process. Variables are characters that are replaced according to simple rules. Constants are characters that appear in the program as part of the axiom or a replacement rule, but are inert in terms of generation. Sorry, that was pretty complicated; an example may clear up any confusion.

Axiom: A
Constants: none
Variables: A, B
Rules: (A -> AB), (B -> A)
The iteration number is noted as N below.

N=0: A (No iterations yet, so this is just the axiom)
N=1: AB (A was replaced according to the first rule)
N=2: ABA (A was replaced according to the first rule, B according to the second)
N=3: ABAAB
N=4: ABAABABA
N=5: ABAABABAABAAB
N=6: ABAABABAABAABABAABABA
N=7: ABAABABAABAABABAABABAABAABABAABAAB

That program has something to do with algae growth, but isn’t that interesting to look at. Constants provide a bit more structure that makes more interesting pictures possible, though they also make things grow faster. Here is another example which will help motivate the graphical execution:

Axiom: F++F++F
Constants: +, –
Variables: F
Rules: (F -> F-F++F-F)

N=0: F++F++F
N=1: F-F++F-F++F-F++F-F++F-F++F-F
N=2: F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F
N=3: F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F-F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F-F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F-F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F-F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F-F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F-F-F++F-F-F-F++F-F++F-F++F-F-F-F++F-F

There is no reason to stop there, but anyone who would carefully parse through a string like that by hand after being told there is a picture probably isn’t using Tableau. The lengths of the next few programs in this sequence are 1792, 7168, 28673, 114689. It is no wonder the science of fractals didn’t take off until computers were widely available.

Execution of the program is done using turtle graphics, which gets its name from a way of thinking about how it would be used to draw a picture. Imagine a well-trained turtle that could execute several simple actions on command. I’m not sure turtles are this trainable, but I’m kind of locked in to that choice, so suspend disbelief. The turtle can walk forward by a fixed number of steps, as well as turn to the left or right by a fixed angle. Now we want to use this to draw a picture, so a pen is attached to the turtle’s tail.

Now, the last program had 3 different symbols, each of which is interpreted as a different action. F corresponds to moving forward by one unit (the distance isn’t important, so long as it is consistent), + is a turn to the right by 60 degrees and – is a turn to the left by 60 degrees.

N=0:

Koch0

N=1:

Koch1

N=2:

Koch2

… N=6:

Koch6
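
To make the turtle part concrete in Tableau terms, here’s a minimal sketch of my own (a simplification, not Noah’s actual calcs). It assumes the program has been reshaped so a hypothetical [Character] field holds one program character per row, [Angle] is a parameter (60 for this snowflake), and all three table calcs are computed along the character order:

// Heading (sketch) – cumulative turn in degrees; "+" turns right, "-" turns left
RUNNING_SUM(
 IF MIN([Character]) = "+" THEN -[Angle]
 ELSEIF MIN([Character]) = "-" THEN [Angle]
 ELSE 0
 END)

// X (sketch) – take a unit step in the current heading whenever the character is an F
RUNNING_SUM(IF MIN([Character]) = "F" THEN COS(RADIANS([Heading])) ELSE 0 END)

// Y (sketch)
RUNNING_SUM(IF MIN([Character]) = "F" THEN SIN(RADIANS([Heading])) ELSE 0 END)

With X on Columns, Y on Rows, a Line mark, and the character order on the Path shelf, the turtle’s trail appears (the signs may need flipping depending on which way you want + to turn).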

Adding additional characters allows for even more complex programs. This quickly exceeds the abilities or at least the attention span of even the best-trained turtles, so I think of it as a simple robot. In my workbook I’ve limited it to 2 replacement rules (so 2 variables). In addition to + and -, I included several constants to change color, which is straightforward enough: A switches to a brown pen, B and D are shades of green, C is white and E is pink. (The only significance to these choices is that my wife thought they were pretty. When I hyper focus on a project like this I try to consult whenever possible to make sure I am still married.) The trickiest constants are the left and right square brackets, i.e. [ and ]. Upon encountering a left bracket the robot turtle makes note of his current location (marking it with his onboard GPS), and upon reaching the corresponding right bracket the turtle lifts his pen and returns to this point. Returning to the corresponding point means keeping track of a hierarchical structure to these locations. In the course of debugging the workbook, this piece quickly exceeded my ability to do by hand, but for the ambitious reader here is another example:

Axiom: X
Constants: +, -, A, B, D, E
Variables: F, X
Rules: (F -> FF) (X -> AF-[B[X]+EX]+DF[E+FX]-EX)
Angle: 25 degrees
Iterating this 7 times will give you a string 133,605 characters long and gives the image in the header.

I built 9 different fractals into the workbook, using a parameter to switch between them. There is also a user-defined feature, so you can feel free to experiment with your own axiom, rules and angle to create an original L-System fractal.

I should probably say something about the implementation of this beast. I’ve played with densification arrays before, and while this seemed like a convenient way to execute the program, it actually got in the way of writing it. This type of array is referenced by a fixed index. Replacing a character with several requires shifting everything beyond that point. In one of those “I could have had a V8!” moments, I eventually realized that Tableau already has a built in array with just the kind of flexibility I’d need. Strings are arrays of characters! Tableau even has a built in replace function that can be used to insert several characters seamlessly pushing everything past it further along. There is also the issue of how to build the square bracket memory piece; this required building a simple stack to record relevant positions and some table calculation magic to keep track of the bracket nesting. I’m not sure I can be much more specific about that. I was in the zone and was more than a little surprised by the end result. Plus, I’m guessing anyone going that deep into the workbook might appreciate the challenge of figuring it out.
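
To make the string-rewriting idea concrete, here’s a minimal sketch of what a single rewrite step could look like as one calculated field, using the simple A/B algae grammar from above (my illustration, not the actual calc in Noah’s workbook). It assumes a densified [Iteration] axis to compute along, and it uses a placeholder character so one rule’s output isn’t immediately rewritten by the other rule in the same pass:

// Program (sketch) – one L-System string per iteration, computed along [Iteration]
IF FIRST() = 0 THEN
 "A" // iteration 0 is just the axiom
ELSE
 REPLACE(
  REPLACE(
   REPLACE(PREVIOUS_VALUE(""), "A", "~"), // protect the existing A characters with a placeholder
   "B", "A"), // rule B -> A
  "~", "AB") // rule A -> AB
END

Trace it by hand and it returns A, AB, ABA, ABAAB… the same sequence as the example above.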

So without further ado, I present Tableau L-Systems:

Here is a link to the L-System workbook on Tableau Public

Somebody is probably going to ask me what my next project is going to be. I appreciate the interest, and when I’ve got something on deck I’ll usually spill the beans, but I honestly have no idea. That is exciting. If there is something you’d like to see, post it in the comments or tweet it to @noahsalvaterra. If my first reaction is that it isn’t possible, I may give it a try. Btw, if you were trying to guess which picture didn’t come from the Tableau workbook, it was the triangle. At least Tableau still needs me for something.

Update: Good observation by Joshua Milligan, TCC12 must have sown the seed for this post. Thanks for the pictures, I may hang a sign near my desk that says “THINK” so I can point to it with a pen. I found “creative man and the information processor”, so I linked the full video below.

TCC12_1

TCC12_4

There was an Orrery in there too, but is that me or Jonathan?

Orrery