Dynamically Group “Long Tail” Data in Tableau
Recently I received a question about what to do with a “long tail” of data in Tableau. For example: show me the top 5, top 10, or top 20, and then group all the remaining small values into one category called “Other.” A group in Tableau can easily be created to accomplish this, but it’s not dynamic. It requires you as the analyst to know which values are in the top 5 (or whatever is being requested), and it assumes those top 5 won’t change. This is subpar. We want this to be dynamic.
The way to accomplish this is by creating a Set, a Parameter, and a Calculation and using them together.
Look at the following example. We’ve got a data set of people who threatened police officers with weapons and were killed as a result, with the officers acting in perceived self-defense. It contains all the police-caused fatalities and the weapons the assailants (and subsequent victims) were using.
Naturally our eyes will look at the top 5 or so, and we intuitively do not assign much significance to the lower end of the distribution. I can group the top 6 together, but if, for example, there is a rise in next year’s attacks with the weapon of choice being “scissors”, my top 6 group just wouldn’t cut it anymore.
We want to see the top X, where X is chosen by the user, and then group the rest of the weapons (i.e. the “long tail” of data) into an “Others” category.
We want the end result to look like this:
Here’s how we accomplished this:
- Right click on “Weapons Used” and select “Create” -> “Set” in the pop-up menu.
- Select the Top tab and choose the “By Field” radio button.
- Rather than hardcode a number, use the drop-down menu to create a set that shows the “Top N Weapons”. Name the set “Top Weapons”.
- You will see the new set appear on the left-hand side under Sets. Go ahead and right-click the set and select “Create Parameter.”
- Name the parameter “Top N Weapons”
- Change the data type to “Integer” and set the current value to 5.
- Set the range of values to span from 1 to 25 and click OK.
- Now combine all of these into a calculated field: right-click on a blank spot in the left pane and create a calculated field.
- Name it “Most Used Weapons” and enter the formula: IF [Top Weapons] THEN [Armed] ELSE "Others" END
- Drag the newly created calculated field to Rows and “Number of Records” to Columns. We now have the chart we want, with our long tail data dynamically grouped by a user-selected “Top N” value.
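If it helps to see the logic outside of Tableau, here is a minimal Python sketch of the same idea: count each value, keep the N most frequent, and fold everything else into “Others”. This is only an illustration, not how Tableau does it internally. The function name `group_long_tail` and the sample counts are made up for this example.

```python
from collections import Counter

def group_long_tail(values, top_n, other_label="Others"):
    """Count values, keeping the top_n most frequent and
    folding every remaining value into other_label."""
    counts = Counter(values)
    # The "set": members are the top_n most frequent values.
    top = {value for value, _ in counts.most_common(top_n)}
    grouped = {}
    for value, count in counts.items():
        # Same shape as the calculated field: in the set -> keep, else -> "Others".
        key = value if value in top else other_label
        grouped[key] = grouped.get(key, 0) + count
    return grouped

# Made-up sample data, for illustration only.
weapons = ["gun"] * 6 + ["knife"] * 4 + ["vehicle"] * 3 + ["scissors"] + ["bat"]

# top_n plays the role of the "Top N Weapons" parameter.
print(group_long_tail(weapons, top_n=2))
print(group_long_tail(weapons, top_n=3))
```

Changing `top_n` here is the equivalent of moving the parameter control in the dashboard: the membership of the top group is recomputed each time rather than fixed in advance.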
Here are some other uses for dynamic sets:
- Look at your top marketing channels and group everything else beyond the top.
- Maybe you have a large number of products you sell, but only have a few high revenue makers. Group the rest to compare the little items to your big hitters.
- Compare top referring domains or sources and group the long tail.
- Look at the top customer complaint reasons and group the one-offs.
- Look at the top hospital readmission reasons and combine the tail.
- Why is student enrollment down at our university?
What have you used or wanted to use dynamic grouping for? Let me know in the comments below.