At the Level – Unlocking the Mystery Part 2: Rank Functions

Many moons ago I did a first post exploring the non-obvious logic of the most secretive of Tableau table calculation configuration options: At the Level. A few weeks ago I was inspired by a question over email to dive back in, this post explores At the Level for the five rank functions: RANK(), RANK_DENSE(), RANK_MODIFIED(), RANK_UNIQUE(), and RANK_PERCENTILE(). The rank functions add a level of indirection to the already complicated behavior of At the Level and I don’t have any particular use cases, so…

If you are like me and won’t rest until you understand every detail of Tableau’s functionality, then this post is for you. Otherwise you may find this post unhelpful and/or confusing due to extreme table calculation geekery. You have been warned.

The particular challenge with ordinal functions like INDEX(), FIRST(), and the rank functions is that we absolutely have to understand how addressing and partitioning works in Tableau, and then we tack onto that an understanding of how the calculations work, and finally we can add on how At the Level works. For the first part, I suggest you read the Part 1 post on At the Level, it goes into some detail on addressing and partitioning. To understand the rank functions here’s the Tableau manual for table calculations (scroll down to the Rank functions section). Finally, read on for how At the Level works for rank functions.

Counting from Nothing – A Double Remix (or, Partitioning via Table Calculations v2)

Over on the Tableau forums Alexander Mou answered a thread on generating a count from sparse data, and the solution he came up with is found in his blog post Dynamic Histogram Over Time. In this post I’m diving into some details of what Alexander did, coming up with a couple of alternative remixes of that solution, and describing a couple of different ways to effectively partition a table calculation via another table calculation. Read on for details!

At the Level – Unlocking the Mystery Part 1: Ordinal Calcs

There was a Tableau forums thread on At the Level awhile back where Matthew Lutton asked for an alternative explanation of this somewhat puzzling table calculation configuration option, and I’d promised I’d take a swing at it. Plus, I’ve been deep into book writing about shaping data for Tableau, and a taking a break to write about obscure table calc options sounds like fun! (Yes, I’m wired differently.)

Read on for a refresher on addressing and partitioning and my current understanding of uses of At the Level for ordinal table calculations such as INDEX() and SIZE(). Part 2 will cover LOOKUP(), and Part 3 will cover WINDOW_SUM(), RUNNING_SUM(), and R scripts. If you’re new to table calcs, read through at least the Beginning set of links in  Want to Learn Table Calculations. Thanks to Alex Kerin, Richard Leeke, Dimitri Blyumin, Joe Mako, and Ross Bunker for their Tableau forum posts that have informed what you’re about to read.

Orrery Remix – by Noah Salvaterra

When I first saw Jonathan Drummey’s Orrery post I sat there staring at it for a few minutes. Wow, right? So when the idea occurred to me to remix this classic I spent a few days wondering if I was serious. Once I managed to push my mouth closed I wondered if Tableau was the right tool for this job. Tableau, I thought, is just a tool for working with and visualizing existing data.

The Orrery is computation driven; the position of the planets are computed on the fly in Tableau and the data source driving this piece of art is almost an afterthought. It could literally have been built on Superstore Sales, you just need a place to hang the results of computations. In the case of Jonathan’s Orrery, this structure comes in the form of a SQL query that generates a hugely redundant data set. In all it has 51,840 rows. There are 18 planets (counting the sun and various moons), 360 ticks for the frames in the animation and 8 wedges for the slices that make up the pie charts used to draw each planet showing day and night (best use of pie charts ever!), 18 x 360 x 8 = 51,840.

I’m two blog posts deep into domain padding and using Tableau as a drawing engine, so I was reasonably sure the Orrery could be replicated with fewer rows in the source data, and that doing so could add flexibility in the number of ticks and the number of pie wedges. Improving the structure slightly didn’t seem significant enough on its own. I decided to give it a go if only for practice, and as an excuse to dive into understanding the original, but set two requirements that would need to be met prior to declaring it blog-worthy: First, Jonathan would have to be onboard, which he was. Second, it would need to bring something new to the table. I think I succeeded in finding an interesting point of view, but it isn’t exactly a new one…

I’m going to put the earth back in the middle!

My Orrery is built on two datasets. One that holds a single copy of the solar system data, and another for the intricate clockwork mechanism that makes it all tick. I connect these via data blending in Tableau. This blend could easily have been avoided, but it seemed like a more natural way to find the data. I figure this setup would make it easy to apply to other solar systems without duplicating effort. The planet data has 18 rows, taken from any of the 2,880 identical copies in Jonathan’s workbook (I used wedge 1 of tick 1, not that it matters), the mechanical structure has 72 rows. Since blending is a left join, I’d call this 72 rows total, but if you say 90, I guess I wouldn’t argue.

Now the domain padding isn’t that exciting. Well, sometimes it is, but I discussed it already in Creating data, multi-step recurrence relations, fractals and 3D imaging… without leaving Tableau. Nobody asked any questions on that yet, so I’ll assume it was all totally straightforward.

Since recreating the Orrery I’ve discussed my approach with Jonathan, and learned he considered using domain padding for the original, but ruled it out because of performance issues. Tableau 8.0 and 8.1 included some performance improvements for Table Calculations, so it is likely this approach only recently became feasible. Both techniques have strengths and weaknesses, it isn’t clear to me that one is better than the other.

There is one small point I can claim as a genuine improvement, that is changing the Days/Tick parameter from an integer to a float. Large values are needed to get satisfying movement from the slow moving outer planets, but some of the fast moving moons require less than one earth day per frame to see a smooth orbit.

Before I get any angry emails, I should mention that the planets don’t travel in perfect circles (or even ellipses), and that the solar system is also moving around the center of the galaxy and that it is all expanding from some infinitely dense singularity, which exploded for some reason. I’m not a physicist. This is more of a classroom model, a shiny one made mostly of brass. My interest is more on perspective than perfect accuracy. If I had a beta of Tableau 8.2, I might have shot for ellipses; in Windows, meh… close enough.

Putting the earth in the middle is something I hadn’t seen a computer animation of. I knew the math could get complicated, which is why Copernicus and so many after preferred centering on the sun, but Tableau actually makes it pretty easy to change perspective. In fact, why just look at an earth centered solar system? What would it look like if we centered it on Venus or Mars? It turns out it is a lot crazier than I had imagined! Another surprise (though in retrospect it shouldn’t have been) is that from the perspective of slow moving outer planets the solar system model looks very similar to the sun centered view. I suppose this might develop over a longer time span than any I tried.

I started by taking perspective of standard heliocentric model, which is simplest math, both empirically and because I had Jonathan’s Orrery for reference. I then used a lookup table calculation so that the heliocentric position of the earth was visible to each of the other planets. Then, in what amounts to vector addition, I offset each planet by exactly the negative of the position of the earth. This puts the earth at the origin, and since an equal distance displaces each planet, the relative position is exactly the same in each frame. The history, on the other hand, traces out a very different path. A heuristic way to think of this, in terms of the clockwork orrery, is to pick the whole thing up by the earth, or some other planet, and holding that planet fixed while the mechanism, base and all continue spinning away as normal. Please don’t attempt this with your antique orrery, you will break it.

Trying to see things from two perspectives at the same time reminds me of optical illusions where you can see two possible pictures. With practice I can usually switch between perspectives quickly, but I can’t seem to see the old lady and the young lady at the same time. Why is that?

Since the trails give a historical record that is based on the center of the view, I thought it might be helpful to provide some landmark to compare it to another perspective. To accomplish this, I repeated this process of computing the position of a chosen planet to build a window that would dynamically remain centered. My original intent was that this window would follow the now moving sun, though again I found it interesting to approach with more generality, so the window can be centered on any planet or moon in the view. In any frame of the animation, the position of planets within these boxes is identical between the two models. Here is a video of simultaneous geocentric and heliocentric models:

I considered taking a shocking stand and arguing that the earth really is the center of the solar system, but the reality is that velocity is relative; for speed you need a fixed frame of reference. Imagine waking up on a train and seeing another train out the window that is travelling in the opposite direction. Both trains may be moving in opposite directions at 30 miles per hour, or just one train moving 60 miles per hour with the other standing stationary (or infinitely many other possibilities). To be absolutely certain of what is happening, the best approach is to look out of the window on other side. A stationary platform will clarify that your train is at rest and you’re about to miss your stop. When riding on a train, the earth is a fixed frame of reference from which we can discuss motion in an absolute context. The context determines the frame of reference, and generally this choice is so obvious you wouldn’t even realize you’re making it. When you consider planets, however, there is no handy universally agreeable point from which to gauge motion (at least not since Copernicus).

Often, regarding earth as a fixed point is preferable. Imagine trying to build a house or navigate to work while accounting for the spinning orbit of a planet that travels around the sun at 67,000 miles per hour. Most of us are unlikely to get further off this planet than an airplane ride, and planetary motion isn’t particularly relevant at that scale. Even taking a trip to the moon, my guess is that an accurate model of the solar system would be tremendous overkill. Any trip we take from our planet will begin with the same velocity and direction as the earth, and be subjected to the same gravitational forces for some time afterwards, so why does it even matter? Well, I think there is a deeper lesson in this story.

There is a famous quote of George Box, “Essentially, every model is wrong, but some are useful.” Finding out fundamental beliefs are wrong can be upsetting, or at least disorienting. Looking at the Venus centered solar system gives me some idea what it must have felt like for the first who pondered centering things on the sun. It also makes it easier to understand those who thought the idea was laughable.

Mathematical models are an important tool in science, including data science, but most models aren’t regarded as “the truth” until they are taken well out of their original context. I believe that as humans we are wired to search for the truths underlying our universe. I am anyway and there seems to be a strong historical case. But accepting mathematical models as such isn’t necessarily truth-ier than what came before. I do think the decrease in stake burnings is progress though.

The lesson here is one of complexity. I’m as guilty as the next guy of adding complexity to achieve a desired result, maybe more so. Something I’ve learned several times across several disciplines is that when you are adding layers to keep things working it is a good time to step back and check your assumptions. Changing your paradigm may result in something simpler, and while simplicity and truth are different things, they are both worth chasing.

So here is Orrery 2.0. Dynamic animation doesn’t work in Tableau public (currently) so please download for the full experience. Have fun experimenting with different centers and see some of the crazy orbits that evolve. Hopefully someone will enjoy this half as much as I did the original and a seed will be planted to push this even further. I can’t wait: