What is Clustering and Why is it Useful?
I’ve been an avid user of Tableau for years. These days I can hardly solve a quantitative problem without using Tableau to visually explore my data and iterate through ideas and hypotheses. But some problems require heavier lifting than a simple viz can handle.
Tableau has recently begun adding more statistical tools that provide powerful ways of visualizing and exploring data. Clustering is one of the newest features in Tableau 10, and puts advanced statistics into your hands with just a few clicks.
Clustering lets you easily identify statistically similar groups. In plain English: you tell Tableau which attributes to consider, and it works out the similarities and creates lookalike groups. You can then drill into those groups for more detail or compare how each group behaves relative to the others.
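If you’re curious what “determine similarities” can look like under the hood, here is a minimal sketch of k-means, the family of algorithm that Tableau’s documentation says its clustering is based on. This is not Tableau’s implementation. The customer numbers (logins and support calls per month) and the function name `kmeans` are made up for illustration, and the sketch assumes the attributes are already on comparable scales.

```python
import math
import random

def kmeans(points, k, iterations=20, seed=0):
    """Group points into k clusters by repeatedly assigning each point
    to its nearest centroid, then moving each centroid to its cluster mean."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return labels, centroids

# Hypothetical customers: (logins per month, support calls per month).
points = [(1, 0), (2, 1), (1, 1), (20, 8), (22, 9), (21, 7)]
labels, centroids = kmeans(points, k=2)

for point, label in zip(points, labels):
    print(point, "-> cluster", label)
```

The first three customers (rarely log in, rarely call) end up in one group and the last three (heavy users) in the other. Which group gets which cluster number depends on the random starting points, so only the grouping matters, not the label.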
Getting better insights faster enables us to take more action. Being able to take action that makes an impact makes you a hero; it makes you the person with all the answers. That’s an awesome feeling, and it’s what Tableau enables us to achieve. The ability to find hidden insights with Tableau’s easy drag-and-drop functionality is a major step toward getting to action faster.
Here are some example vizzes of how people have used clustering to create segments and find insights they couldn’t get to easily:
Marketing pro Chris Penn used the clustering tool to find insights about his own blog that were obscured by traditional visualization methods. Namely, he drilled into which social media post topics drove new users, drew large numbers of reshares, or were stagnant.
Chris Wood gives an insightful interactive analysis of at-risk youth in the Washington, D.C., school district, and also explains how he used clustering to do so.
Here are other potential use cases for clustering:
Customer Segmentation: Say you have a group of customers that logs in very infrequently, never calls support, started with low monthly recurring revenue, but spent tons on upgrades over time. That’s an odd group with tremendous organic growth and low costs, even though initial revenues were low. Clustering can find groups like this.
Market research: How do we determine different groups in the market and create products and marketing messages that resonate with those people? For example, a bank found a group of entrepreneurs who were using equity from their homes via a second mortgage to fund their startups. Knowing that led to a whole new line of products for the bank that resonated much more strongly with that group.
Customer Surveys: What clusters crop up among satisfied customers, and what clusters crop up among unsatisfied customers? Are the unsatisfied customers also utilizing your excellent support services?
Matching or Recommendation Algorithms, like Netflix: For example, based on movies that have a Strong Female Protagonist, Witty Humor, and British Actors, we recommend all movies based on every Jane Austen book ever.
Telecom: Position the cell towers so that all customers receive optimal signal strength based on addresses, usage patterns, roaming, subscriptions, peak times, traffic patterns and roads, etc.
Scheduling: Say you’re a police chief trying to make the most of your officers’ time on a limited budget. You need to schedule patrols at peak times in the areas where crime is most likely, again based on any number of factors, like time of day, weather, income and education levels, past crime events, types of crime, known gang locations, etc.
I personally use clustering all the time in my daily analytics work, and I find it unrivaled at telling a story about groups of data. Stay tuned for part 2, where I cover how to create your own cluster charts.