Dynamically Group “Long Tail” Data in Tableau

Recently I received a request on what to do with a “long tail” of data in Tableau. For example, show me the top 5, top 10, or top 20, and then group all the remaining small values into one category called “Others.” A group in Tableau can easily be created to accomplish this, but it’s not dynamic. It requires you as the analyst to know which values make up the top 5 (or whatever is being requested), and it assumes those top 5 won’t change. This is subpar. We want this to be dynamic.

The way to accomplish this is by creating a Set, a Parameter, and a Calculation and using them together.

Look at the following example. We’ve got a data set of people who threatened police officers with weapons and were killed as a result, with the officers acting in perceived self-defense. It covers all the police-caused fatalities and the weapons the assailants (and subsequent victims) were using.

Naturally our eyes will look at the top 5 or so, but we intuitively do not assign much significance to the lower end of the distribution. I can group the top 6 together, but if, for example, there is a rise in next year’s attacks with the weapon of choice being “scissors”, my top 6 group just wouldn’t cut it anymore.

We want to see the top X, where X is chosen by the user, and then group the rest of the weapons (i.e. the “long tail” of data) into an “Others” category.

We want the end result to look like this:

Here’s how we accomplished this:

  • Right-click on “Weapons Used” and select “Create” -> “Set” in the pop-up menu.

  • Select the Top tab and choose the “By Field” radio button.
  • Rather than hardcode a number, use the drop-down menu so the set shows the “Top N Weapons”. Name the set “Top Weapons”.

  • You will see the new set appear on the left-hand side under Sets. Right-click the set and select “Create Parameter.”
  • Name the parameter “Top N Weapons”
  • Change the data type to “Integer” and set the current value to 5.
  • Set the range of values to span from 1 to 25 and click OK.

  • Now we combine all of these into a calculated field. Right-click a blank spot in the left pane to create one.
  • Our formula is going to be ‘IF [Top Weapons] = TRUE THEN [Armed] ELSE "Others" END’. Name the field “Most Used Weapons”.

  • Drag our newly created calculated field to Rows and “Number of Records” to Columns. We now have the chart we want, with our long tail data dynamically grouped by a user-selected “Top N” parameter.
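
If it helps to see the logic outside of Tableau, here is a rough Python sketch of what the set, parameter, and calculated field do together. It is only an illustration, not what Tableau runs internally: the function name `group_long_tail`, the sample weapon list, and the way ties are broken are all assumptions made up for this example.

```python
from collections import Counter


def group_long_tail(values, top_n, other_label="Others"):
    """Keep the top_n most frequent values; relabel the rest as other_label."""
    counts = Counter(values)
    # Plays the role of the "Top Weapons" set: membership depends on top_n.
    # Assumption: ties are broken by first appearance (Counter.most_common).
    top = {value for value, _ in counts.most_common(top_n)}
    # Plays the role of the calculated field:
    # IF [Top Weapons] THEN [Armed] ELSE "Others" END
    return [v if v in top else other_label for v in values]


# Made-up sample data, not taken from the article's data set.
weapons = ["gun"] * 6 + ["knife"] * 3 + ["vehicle"] * 2 + ["scissors", "bat"]

# top_n plays the role of the "Top N Weapons" parameter.
grouped = Counter(group_long_tail(weapons, top_n=2))
print(grouped)
```

Changing `top_n` here is the equivalent of moving the parameter control: the top list is recomputed from the data each time, so nothing is hardcoded.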

Here are some other uses for dynamic sets:

  • Look at your top marketing channels and group everything else beyond the top.
  • Maybe you have a large number of products you sell, but only have a few high revenue makers. Group the rest to compare the little items to your big hitters.
  • Compare top referring domains or sources and group the long tail.
  • Look at the top customer complaint reasons and group the one-offs.
  • Look at the top hospital readmission reasons and combine the tail.
  • Look at the top reasons student enrollment is down at your university and group the rest.

What have you used or wanted to use dynamic grouping for? Let me know in the comments below.