Row level security often presents challenges, if your security data is not in the same data source as your main data. In such cases, solutions tend to leverage one of four techniques:
- Data prep outside of Tableau – often involves cross products that can make data sets unacceptably large.
- Data blending — just about the only in-Tableau option prior to Tableau 10.0.
- Cross data source joins — available in Tableau as of Tableau 10.0.
This post introduces a 5th option using a cross data source filter. see below for how to set it up!
Both data blending and cross data source joins employ a federated join — that is, Tableau loads the data from both data sources into temp tables and then performs the join using its own data engine. The problem with federated joins is that the data is fetched before the all filters are applied to the join conditions. This can result in many, many rows being fetched from the main data, only to be discarded because the current user does not have access to them. The performance impact can be substantial. Therefore if we could somehow avoid the need for a federated join, row level security would be much more efficient.
In Tableau 10.0, along with cross data source joins, we now have the ability to configure cross data source filters. Cross data source filters are a powerful new tool, because they allow filters to be applied across data sources without the need for federated joins. By configuring data relationships between two data sources, a filter in one data source can be applied to the counterpart field in another data source. Even though the data relationships feature is used to define “counterpart fields”, cross data source filters do not actually employ data blending.
It turns out that cross data source filters can be leveraged for efficient row level security, due to an obscure behavior of the feature. Let’s look at how it works.
Setting up the Cross Data Source Filter Solution
In this Superstore example, let’s say we have a security model that gives users access to the data for specific states. The model looks like this. (“John Doe” will be my pretend username for this exercise. Substitute with your Tableau username if you’re following along.)
To apply this security model to Superstore, create a workbook with both data sources loaded separately.
In the Security data source, configure a data source filter to exclude everything not relevant to the current user.
Now create a blank sheet in the workbook. Using the Security data source as the primary, bring the security field (“State” in this example) onto the Filters shelf. Configure it to filter on “All”.
Finally, configure the filter to apply to “All Using Related Data Source”. The filter is now a cross data source filter. Note the unique icon next to the field name.
This sheet only exists to set up our cross data source filter. Before we move on to the main view, edit the Data Relationships and make sure that State in the Security data source is related to State in the Superstore data source.
Now create a new sheet. This will be our main view. Using Superstore as the main data source, drag State onto the Rows shelf, and Sales onto the columns shelf.
As soon as we started to build a view with the Superstore data source, the State filter was applied from the Security data source, because it is a cross data source filter. Also, notice that the only states shown in the view are the ones that “John Doe” has access to!
The Secret of Cross Data Source Filters
At first this might not make a lot of sense. The State filter that we set up was configured to allow “All”. So why does it limit the states in our view? It turns out that there is a subtle nuance to the behavior of a cross data source filter:
The list of available values (the domain) comes from the data source used for the filter field.
What does this mean, exactly? Basically, if we filter State in the Superstore data source by using State from the Security data source as a cross data source filter, Tableau immediately assumes that Superstore’s State should only include values that are present in Security’s State, and therefore filter out any values that aren’t. Even if the cross data source filter is configured to allow “All”, Tableau will still make sure that the only values of State in Superstore are those that match values present in Security (the domain of the State field).
There’s a rationale to this, I think. If you turned on “Show Filter” on the cross data source filter, it would only show you values from the Security data source. It would be awfully confusing if, when you select “All” in the quick filter, suddenly Superstore shows states that aren’t even listed in the quick filter. So, I can see why Tableau limits the domain of Superstore’s State to be the same as Security’s State.
This works in our favor. By applying the USERNAME() check as a data source filter on the Security data source, we essentially limited the domain of Security to only those states that the user should have access to. That domain, in turn, implicitly filtered the Superstore data. And it did so without using a (relatively slower) federated join.
Row Level Security Involving Multiple Fields
This technique can be extended to more than one field, depending on how you want the fields to be evaluated. We’ll use the above example for the following exercises.
AND Relationship (“User can see data that matches Field 1 and Field 2″)
In this scenario, you will need to create a calculated field that merges the fields into a single “hash value”. This calculated field should be built in both data sources (don’t forget to relate them in the Data Relationships) and then used as the cross data source filter.
OR Relationship (“User can see data that matches Field 1 or Field 2″)
In this scenario, you can simply configure both fields in the Data Relationships, then apply them as separate cross data source filters. WARNING: This does not allow for wildcard matches. Both fields in Superstore must still contain valid data, and each one must match something in the Security data source. See next scenario.
If you require a wildcard match — for example where one or more fields in a security row are left blank/null to indicate that any possible value or (All) is acceptable — this technique will not work for you. Cross data source filters apply against specific values; a wildcard is equivalent to “match this field unless it’s NULL, then anything matches”. To enable wildcard match behavior, you must use a cross data source join with a specialized, calculated join criteria.
Prior to Tableau 10.1, you must employ a cross data source join and apply the calculation as a cross data source filter. This negates the advantage of the cross data source filter technique. If practical, consider refactoring your security model to avoid the use of wildcard matches. (In our Superstore example, for instance, it is preferable to create a row with every possible State rather than a single row that is meant to convey “all states”).
As of Tableau 10.2 you can specify ad hoc calculations as join criteria and build out a view. To see an example check out Yuriy Fal’s post here.
Things to Watch Out For
In some circumstances, Level of Detail (LOD) Expressions will generate an error in Tableau.
This issue seems to be dependent on the type of data source connector that is used. Text files (Excel and CSV) and TDE extracts don’t have any problem with LOD Expressions; Vertica and Snowflake have also shown no issues at the time of this writing. On the other hand, Oracle and Exasol connectors seem to have problems using LOD Expressions in conjunction with cross data source filters. The error seems to occur regardless of the type of LOD Expression used, or the fields that are included/excluded.
Finally, keep in mind that this behavior may not have been “by design”. Although there appears to be a rationale for limiting the domain of the main data source to match the domain of the cross data source filter, the issue with LOD Expressions suggests that some aspects of Tableau’s behavior may not have been the result of explicit design decisions. Consequently, the behavior of cross data source filters could change in future versions, and this technique may not continue to behave in a manner that supports row level security.
Using cross data source filters to implement row level security can substantially increase the performance of your workbooks, because they function without using federated joins. If your security model supports the forms of filtering that are possible with this technique, and your data sources do not have problems with any LOD Expressions that your views employ, this is an elegant alternative to data blending or cross data source joins.
A round of thanks goes to everyone who contributed insights about this technique when it was first mentioned on the Tableau Community forum. And a special thanks to Jonathan Drummey for hosting this presentation of the technique on his blog.
Happy viz-ing! –Jamieson