Creating an Insightful Cluster Chart
As we discussed in Part 1 of our clustering series, the ability to segment data into useful groups or bins is as important as ranking and identifying your top and bottom values. It’s a must for any data analyst, and clustering takes that ability to a whole new level. You don’t need to write code or be a trained statistician to use it.
Clustering excels at revealing the relationships within data visually. For instance, we might wonder: “How do these 6 things interact, and what results do they produce?” What if we wanted to bring in Measures, not just Dimensions? For example, purchase patterns (Sales), the amount we actually make (Profit), and return or discount patterns (Discount, Returns).
Clustering lets us add this information. It helps us move beyond simple segments to more advanced ones that incorporate data on behavior patterns and actions (Measures) as well as attribute information like Region or Marketing Channel (Dimensions).
Let’s jump in and create a cluster chart from the Superstore Dataset that shows the relationship between sales and profits and highlights other fields such as marketing channel or product category. We start with a view with these fields pulled out:
- Click Show Me at the top right and choose the Scatter Plot option to get this into a more useful format. You’ll see that Marketing Channel and Region are on the Shapes and Color shelves, respectively.
- Set the view to Entire View from the drop-down menu at the top.
- Now, let’s add several more Dimensions. Add Product Category, Customer Segment, and Product Subcategory to the Detail shelf.
- Click on the Shapes card to set each of the marks to Filled from the drop-down menu labeled “Select Shape Palette.” Choose Assign Palette and click OK.
- Now, click on the Analytics tab at the top left, above your Dimensions.
- Click Cluster and drag it out. Be sure to place it on top of the Cluster box that appears.
- Notice that 2 clusters are generated automatically from the data.
- Let’s play with the number of potential clusters. Change the number from Automatic to 5. You should now see each cluster in a different color.
- Go over to the top right, where the highlighter shows the different clusters. Click each one in succession to highlight that segment on the scatter plot. Are you seeing any interesting groups, such as one with high sales but low profit?
- Click on the down arrow on each pill that you put on the Detail shelf and select “Show Highlighter.”
- These should appear on the right-hand side. Click through them to see whether any insights emerge. For example, under the Marketing Channel highlighter, choosing “SEO” or “Social Media” reveals some interesting patterns, and choosing “Google Adwords” reveals an interesting outlier.
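If you’re curious what the Cluster step in the walkthrough above is doing behind the scenes, here is a rough sketch in Python. Tableau’s clustering is based on k-means, but this is not Tableau’s actual implementation: the rows are made-up numbers (not Superstore data), the min-max scaling and the farthest-point starting centers are simplifying assumptions, and every function name here is hypothetical.

```python
def min_max_scale(values):
    """Rescale numbers to the 0-1 range so Sales does not drown out Profit."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid dividing by zero for a constant column
    return [(v - lo) / span for v in values]


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_point_centers(points, k):
    """Assumed starting rule: begin at the first point, then repeatedly add
    the point that is farthest from every center chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        )
    return centers


def k_means(points, k, max_iterations=100):
    """Basic k-means: returns one cluster index per point."""
    centers = farthest_point_centers(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each mark joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:  # nothing moved, so we are done
            break
        labels = new_labels
        # Update step: each center moves to the average of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


# Invented (sales, profit) pairs for illustration only.
rows = [
    (120, 30), (150, 35), (130, 28),     # low sales, modest profit
    (900, -50), (950, -80), (880, -40),  # high sales, low profit
    (870, 300), (920, 320), (890, 310),  # high sales, high profit
]
sales = min_max_scale([r[0] for r in rows])
profit = min_max_scale([r[1] for r in rows])
points = list(zip(sales, profit))

labels = k_means(points, 3)
for (s, p), label in zip(rows, labels):
    print(f"sales={s:>4} profit={p:>4} -> cluster {label}")
```

Changing the 3 in `k_means(points, 3)` plays the same role as switching the number of clusters from Automatic to a fixed value in the Cluster box. With these invented rows, the high-sales, low-profit marks land in a cluster of their own, which is the kind of group the highlighter makes easy to spot.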
Now that we’ve created this advanced cluster chart, stay tuned for Part 3, where I cover how to interpret, explain, and visually fine-tune cluster charts.