How do you build a comprehensive view of a topic on social media? Jordan Breslauer would say you let a machine learning tool scan the social sphere and add information as conversations evolve, with help from humans in the loop.
Ginette Methot: I’m Ginette,
Curtis Seare: and I’m Curtis,
Ginette: and you are listening to Data Crunch,
Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world.
Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company.
Ginette: Many of you want to gain a deeper understanding of data science and machine learning, and Brilliant.org is a great place to dig deeper into these topics. Their classes help you understand algorithms, machine learning concepts, computer science basics, probability, computer memory, and many other important concepts in data science and machine learning. The nice thing about Brilliant.org is that you can learn in bite-sized pieces at your own pace. Their courses have storytelling, code-writing, and interactive challenges, which makes them entertaining, challenging, and educational.
Sign up for free and start learning by going to Brilliant.org slash Data Crunch, and also the first 200 people that go to that link will get 20% off the annual premium subscription.
Let’s get into our conversation with Jordan Breslauer, senior director of data analytics and customer success at Social Standards.
Jordan: My name is Jordan Breslauer. I’m the senior director of data analytics and customer success at Social Standards. I’ve always been a data geek as it pertains to sports. I think of Moneyball when I was younger; I always wanted to be kind of the next Billy Beane. When I started working for sports franchises right after high school and in my early college days, I just realized that that type of work culture wasn’t for me, but I was so, so into trying to answer questions with data that had no previously clear answer, you know? I loved answering subjective questions like, what makes the best player, or how do I know who the best player is? And I thought what was always fun was to try and bring some sort of structured subjectivity to those sorts of questions through using data. And that’s really what got me passionate about data in the first place.
But then I just started to apply it to a number of different business questions that I always thought were quite interesting, which have a great deal of subjectivity. And that led me to Nielsen originally, where the main question that I was answering on a day-to-day basis was, what makes a great ad? Uh, what I found, though, is that advertising, especially as it pertains to TV, is really what brands were moving away from, and a lot of the real consumer analytics that people were looking for were trying to understand people in their natural environment, particularly on social media. And I hadn’t seen any company that had done it well. Uh, and I happened to meet Social Standards during my time at Nielsen and was truly just blown away with this ability to essentially take a large input of conversations that people were having, or that were happening, I should say, and bring some sort of structure to them to actually be able to analyze them and understand what people were talking about as it pertained to different types of topics. And so I think that’s really what brought me here: the fascination with this huge amount of data behind the ways that people were talking on social, and the fact that it had some structure to it, which actually allowed for real analytics to be put behind it.
Curtis: It’s a hard thing to do though, right? You know, to answer this question of how do we extract real value or real insight from social media, and you’d mentioned that historically, or up to this point, companies that are trying to do that missed the mark. What’s the standard for how things are going in that industry?
Jordan: Yeah, that’s a great question, and I dealt with some of it in traditional social listening around measuring different events like the Super Bowl, which I was heavily involved with during my time at Nielsen. I just found that a lot of traditional social listening is really based upon the construction of what are called Boolean queries. And these rely heavily on known knowns. So even the best Boolean query out there can’t provide you a comprehensive view of the market or products or categories, due to the impossibility of including every single variation of a particular topic. Think about just how many ways there are to say a brand such as Budweiser, including all the misspellings, in English alone. As soon as you create a Boolean query, it’s pretty much outdated because of the number of new brands and products and terms that crop up all the time. I mean, an analyst could literally make a living just by maintaining queries that they have already built, all while suffering the paranoia that comes with not knowing whether their findings are a product of the data or of the query itself.
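To make the brittleness of a fixed term list concrete, here is a minimal Python sketch of a hand-maintained Boolean (OR) query. The term list, the posts, and the function name `matches_query` are all hypothetical and only illustrate the "known knowns" problem described above; this is not Social Standards’ or any social listening vendor’s actual query syntax.

```python
# Hypothetical hand-maintained Boolean query: match any of these known spellings.
KNOWN_TERMS = ["budweiser", "budwiser", "#budweiser"]

def matches_query(post: str, terms=KNOWN_TERMS) -> bool:
    """Return True if the post contains any known term (a simple OR query)."""
    text = post.lower()
    return any(term in text for term in terms)

# Made-up posts for illustration.
posts = [
    "Cracking open a Budweiser for the game",
    "budwiser and wings tonight",
    "nothing beats a cold budwieser",   # misspelling that is not in the list
    "grabbing a Bud heavy after work",  # nickname that is not in the list
]

matched = [p for p in posts if matches_query(p)]
missed = [p for p in posts if not matches_query(p)]

print("matched:", matched)
print("missed:", missed)
```

The query only finds what its author already thought to list, so every new misspelling or nickname is silently dropped until someone edits `KNOWN_TERMS`.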
And so as a result, traditional social listening has really been able to only reliably deliver on helping clients with crisis management and event-based or campaign tracking, which occur over short periods of time and have a clear list of terms to track. It has not been able to deliver on a key premise, which is real consumer analytics. And that’s where we come in. Essentially what we’re doing is we’re scanning the social sphere, adding new brands, products, and terms as social conversations evolve, and with these new items added literally every day, we remove the burden of managing queries. But more importantly, by taking this data and structuring it by market vertical, similar to how a Nielsen, an IRI, or an NPD structures sales data, we are able to truly understand the types of ways that people talk about a specific topic and, more importantly, how it is differentiated from other topics within the marketplace.
I mean, to make the analogy to sales data, imagine a post on social media being akin to a receipt. Social listening gives you access to individual receipts, but then asks you to analyze them yourself. For a topic that has thousands of posts, that can lead to subjective interpretations as opposed to objective measurement. I mean, humans have a tendency to make assumptions off small bits of information. We’re all about objective measurement here, so in examining a topic, we look at every post and we treat them like receipts by structuring everything upfront into their market verticals. Then, much like a Nielsen does, we provide a summary of the data, which allows you to understand the full breadth of a topic, whether it be a brand, a product, or even a benefit or concern like sustainability, within the vertical that matters to you, whether that be food and beverage, or beverage alcohol, or beauty.
And so as with sales data, having a receipt for one brand without being able to compare it to the rest of the market would provide false insight. The same is true of social data. You need as close to a comprehensive view as possible in order to comparatively measure anything. And to kind of finish that thought, I mean, let’s think about the meaning of a word linguistically. The meaning of a word all comes down to how it’s used in a sentence, such as what are the things that are spoken around it. Words define themselves through context, which is one of the reasons that children are able to learn how to speak from hearing others communicate and experiencing the world around them. Applying that same logic as it pertains to brands and products, we can help to contextually define what brands and products mean in the consumer’s mind by examining what they’re spoken about alongside and how this differs from what you’d expect to see amongst other social media topics within the context of a vertical.
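One simple way to picture "how this differs from what you’d expect" is an over-index ratio: how often a term appears alongside a brand, divided by how often it appears across the whole vertical. The Python sketch below assumes each post has already been reduced to the set of topics it mentions; the data, the topic names, and the functions `term_shares` and `over_index` are hypothetical, and the interview does not say which statistic Social Standards actually uses.

```python
from collections import Counter

def term_shares(posts):
    """Fraction of posts in which each term appears at least once."""
    counts = Counter()
    for post in posts:
        counts.update(set(post))  # count each term once per post
    return {term: n / len(posts) for term, n in counts.items()}

def over_index(brand_posts, vertical_posts):
    """Ratio of a term's share in brand posts to its share across the vertical."""
    brand = term_shares(brand_posts)
    vertical = term_shares(vertical_posts)
    return {t: brand[t] / vertical[t] for t in brand if t in vertical}

# Toy data: each post is the set of topics it mentions (all made up).
brand_posts = [
    {"brand_a", "moisturizing"},
    {"brand_a", "moisturizing", "vegan"},
    {"brand_a", "lipstick"},
    {"brand_a", "moisturizing"},
]
other_posts = [
    {"brand_b", "lipstick"},
    {"brand_b", "lipstick", "vegan"},
    {"brand_c", "moisturizing"},
    {"brand_c", "lipstick"},
]
vertical_posts = brand_posts + other_posts

scores = over_index(brand_posts, vertical_posts)
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {score:.2f}")
```

A score above 1 means the term is mentioned with the brand more than the vertical as a whole would predict, and a score below 1 means less. The comparison only works because the rest of the vertical is in the denominator, which is the point of needing a comprehensive view.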
Curtis: So, let’s dig into that a little bit. How do you even approach that? I mean, you’ve described how the problem works and what the approach is. How do you train algorithms and these kinds of things to actually understand all the different variants of a term, right, and understand the context of a term?
Jordan: Yeah. It all comes down to defining the universe linguistically. So certain topics, let’s say lipstick, they have a beauty connotation, and so once you start training the system to learn all the different topics that specifically pertain to beauty, you can help to say whether a conversation is actually a beauty conversation or a conversation about something completely different. A great example is a brand like Elf Cosmetics. I mean, imagine how hard it is to try and make sure that you’re just getting posts about Elf. I mean, if you were to just put out the word “elf,” that could be a post about elves on the shelf around Christmas time, or people dressing up in cosplay to look like elves, or people talking about Lord of the Rings characters that are elves. So in order to truly be able to know that it’s a real Elf Cosmetics or beauty conversation, you need to make sure that it’s not just the word “elf” that is used, but that there are other topics that pertain to beauty mentioned around it.
And once you define those, the system can become smarter and smarter and define itself over time. Now there are of course topics that are less ambiguous on social, right? Um, and as it pertains to something like Elf Cosmetics, you could have a specific hashtag that literally just says #elfbeauty or #elflipstick, and those help us to make sure that it’s very clear that a specific topic is being indicated. But a specific topic is not just said in one way. There are so many different ways that a topic can be said, and our linguists are constantly updating the database to make sure that’s being shown, and a machine is helping them by indicating different topics that are coming up more and more often on social media that they should keep their eyes on to help to update and maintain the database.
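The "elf" example can be sketched as a small rule: an unambiguous hashtag counts on its own, while the bare word only counts when beauty terms appear around it. In the Python below, the context vocabulary `BEAUTY_CONTEXT` and the function names are hypothetical, and a real system would presumably learn and maintain a far larger vocabulary with linguists in the loop rather than hard-coding one.

```python
import re

# Hypothetical context vocabulary for the beauty vertical.
BEAUTY_CONTEXT = {"lipstick", "primer", "concealer", "mascara", "makeup", "foundation"}
# Hashtags treated as unambiguous on their own (the two examples from the interview).
UNAMBIGUOUS_TAGS = {"#elfbeauty", "#elflipstick"}

def tokens(post: str) -> set:
    """Lowercase the post and split it into words and hashtags."""
    return set(re.findall(r"#?\w+", post.lower()))

def is_elf_cosmetics(post: str) -> bool:
    """Decide whether a post is about the cosmetics brand rather than other elves."""
    words = tokens(post)
    if words & UNAMBIGUOUS_TAGS:
        return True
    # The bare word "elf" only counts when beauty terms appear around it.
    return "elf" in words and bool(words & BEAUTY_CONTEXT)

# Made-up posts for illustration.
for post in [
    "Loving my new elf primer and concealer",
    "Our elf on the shelf fell off the mantel again",
    "Dressed as a Lord of the Rings elf for cosplay",
    "Haul time #elflipstick",
]:
    print(is_elf_cosmetics(post), "-", post)
```

The same structure suggests where the human-in-the-loop work sits: terms that start co-occurring with known beauty topics are candidates for a linguist to review and add to the vocabulary.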
Curtis: One of the big problems with artificial intelligence, machine learning, is this having enough data that’s labeled correctly that you can actually learn something from and keeping things up to date. Are you guys doing, it sounds like you have some sort of system where the machine learning system works with the humans to consistently update what’s going on and sort of directs the humans to the cases where it needs a little bit of help.
Jordan: Exactly. I mean it’s all about human in the loop, as we like to say. This does not work without human input. I mean, I think a lot of people like to rely upon machines these days and have everything be automated. But the truth is, when it comes to something as complex as language, it needs human input in order to define itself properly. Of course. I mean, if you think about the analogy back to a sales database, right? Sales and brands, when they’re in a store and you scan them, they all have a UPC code. And those UPC codes indicate a lot of information about the brand that is simply factual. I mean, think about the purchase of something like Diet Coke Vanilla. Diet Coke Vanilla has a UPC code which contains several pieces of information about the product, which roll up into different parts of the Nielsen sales hierarchy.
So Diet Coke Vanilla rolls up into Diet Coke, which rolls up into Coca-Cola as a brand, which rolls up into Coca-Cola as an organization. And Diet Coke is also a diet soda, which rolls up into the soda category. So that hierarchical structure allows users to pull sales data on different relationships, like what percentage of diet soda sales come from Diet Coke Vanilla? Or what percentage of Coca-Cola’s sales come from Diet Coke Vanilla? We’re doing that same exact thing except with social data, assigning essentially a UPC code to each social topic, and the team of linguists working with the machine helps to define the different ways that somebody could talk about or describe a specific topic on social. But as you can imagine, given the intricacies of language, this is much, much more complex than UPC codes, which are much more factually defined. And so our team of linguists is working constantly to update the database, both with new topics but also refining the existing ones. And so a mention of Diet Coke Vanilla within our database not only indicates Diet Coke Vanilla, but also implicitly indicates Coca-Cola, the Coca-Cola organization, diet soda, and soda as a whole.
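The roll-up described here can be sketched as a small graph of parent links. In the Python below, the `PARENTS` table follows the Diet Coke Vanilla example from the interview, while the Diet Pepsi entry, the mention counts, and the function names `roll_up` and `mention_share` are made up for illustration. It is a sketch of the idea, not the actual Nielsen or Social Standards data model.

```python
# Hypothetical parent links; a topic can roll up along more than one path.
PARENTS = {
    "Diet Coke Vanilla": ["Diet Coke"],
    "Diet Coke": ["Coca-Cola (brand)", "Diet Soda"],
    "Coca-Cola (brand)": ["Coca-Cola (organization)"],
    "Diet Pepsi": ["Diet Soda"],
    "Diet Soda": ["Soda"],
}

def roll_up(topic: str) -> set:
    """Return the topic plus every ancestor it implicitly indicates."""
    seen = set()
    stack = [topic]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen

def mention_share(mentions: dict, child: str, ancestor: str) -> float:
    """Share of an ancestor's rolled-up mentions that come from one child topic."""
    total = sum(n for topic, n in mentions.items() if ancestor in roll_up(topic))
    return mentions.get(child, 0) / total if total else 0.0

# Made-up counts of posts, keyed by the most specific topic each post mentions.
mentions = {"Diet Coke Vanilla": 20, "Diet Coke": 60, "Diet Pepsi": 20}

print(sorted(roll_up("Diet Coke Vanilla")))
print(mention_share(mentions, "Diet Coke Vanilla", "Diet Soda"))
print(mention_share(mentions, "Diet Coke Vanilla", "Diet Coke"))
```

Because every mention carries its ancestors with it, questions like "what share of diet soda conversation is about Diet Coke Vanilla?" become a filter and a division rather than a new query.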
Curtis: So once that’s detected, it is then mapped to this hierarchy where you can do comparisons and these kinds of things, is that right?
Jordan: Exactly. I mean, it really comes down to understanding those relationships in the way in which people talk and the way in which people think. That has huge implications on trend tracking, understanding the differences between different consumer cohorts and really getting down into understanding how brands can leverage their unique strengths, but also understand their weaknesses relative to the competition.
Curtis: Got it. That’s awesome. And maybe let’s talk a little bit as well about, uh, you know, how you take this data or how a brand would take this data. We’ve talked a little bit about it, right? But can you give me, maybe give us a concrete example or maybe something interesting that’s popped up in your data where a client learned something interesting and was able to then take some sort of action that, that delivered business value.
Jordan: Yeah, absolutely. So there’s a brand we work with that was seeing a decline in social volumes over time, and from a sales perspective as well, the social was kind of aligning with that. Their sales were beginning to lag over the course of the last year. And so one of the things, uh, they wanted to look into was, are the things that are being talked about alongside their brand more significantly than other brands on social increasing, or essentially on trend? Or are they declining and maybe not on trend anymore? And so when they did this, they noticed that one of their primary ingredients had actually been one of the fastest declining within all of their category. And so that piece of information allowed them to understand that, “wow, one of the things that is really core to our perception amongst consumers has started to fall off.
And so how could we start to pivot, maybe consider product reformulation to help to better cater to the current consumer, the consumer that is trending, or, I should say, the desires of consumers that are more on trend.” And so what they started to do here is they began to investigate what are some ingredients that deliver on the same value proposition. For example, maybe they also offer moisturizing benefits in the consumer’s mind or also offer anti-inflammatory benefits in the consumer’s mind. And then from there, look at, amongst the ingredients that deliver on those premises, which ones are actually trending, because that could be the beginning of helping to replace this core ingredient as a part of their product formulation and pivot to another ingredient that is more on trend.
Curtis: Yeah, that’s really interesting. So even down to what kinds of products you build, how you engage on social, what you talk about, you can map all of that through the system.
Jordan: Exactly. It’s, it’s a full service system in that way. It’s not just understanding how that product should be reformulated, but then eventually how we can speak to it, how we can market it, both on social but also in our integrated marketing strategy overall. So it’s really using social to drive consumer analytics. Social is merely the input. The output is much broader and much greater. I mean we have huge application as well in the diligence space, both in the sourcing and also in the diligence process. Um, because what different private equity firms are trying to do is they’re trying to identify brands that are currently on trend. And so if I identify one of the biggest trends in beauty, let’s say self care, maybe if I believe self care is going to continue into the future and I’m seeing its users sticking around and talking about it month after month and it’s growing like crazy in social conversation volume, I’m going to want to look for the brands out there that have self care most integral to their conversations, ’cause these are the brands that, if self care continues to grow, all else equal, should benefit more than the average brand out there. And this is a really interesting way that we can use relationships, even in social data, to make these real important business decisions as it pertains to finding new brands to acquire or finding the next big competitor in my space.
Curtis: These two use cases, you have brand strategy that helps them even make new products and then engage in social saying certain things and doing certain topics. And then you also have this diligence use case. Did you guys start with that in mind that you said, “Hey, we could do this and it would be useful for these things?” Or were they kind of emergent as you were doing the project?
Jordan: That’s a great question. I would say the prior was the more obvious one. I think the latter appeared more naturally and I think it came out of just having different conversations and really just not being afraid to show our product to anybody. I think we really believed that we were on the verge of something special and we constantly thought of different applications on a day-to-day basis, and continuing to get that validation from others in the industry really helped to drive home just all the different types of places where we could fit. Nevertheless, we’re certainly trying to very much focus on those specific use cases right now and, um, you know, let it grow naturally. Let the rest of it, um, happen as people continue to explore the tool and find all the different applications of it. You know, one thing that also has really been interesting to me is people have thought of using the tool to, um, try and figure out what are the types of keywords that we should be buying: look at what is most significant to a topic’s conversation and then consider, “Are those other types of words that maybe we could consider buying when it comes to search to try and cater to those sorts of consumers? ’Cause if moisturizing connects to anti-inflammatory, maybe I want to make sure I have those two keywords owned as it pertains to my brand in the search space.”
Curtis: Sure. You mentioned you’re mostly on the analytics side. What are some of the challenges and difficulties there and you know, trying to process this data and . . .
Jordan: Yeah, I think that, you know, we’re not industry experts. We’re experts of our data. So I think the product truly comes to life when you work closely with clients and you get their input as they’re the experts in the business. We’re the experts of the data. Collectively we’re more effective than just us individually. Sometimes going in and, and just finding what is most relevant in the data. Having that color from the client, having that constant communication back and forth to really say, “Hmm, this data point is because we did X sort of launch. And it’s really interesting to see the movement in the conversations this way as a result.” That sort of color is really, really important to help bring the data to life. I think one thing that’s really continuously surprised me pleasantly is I know it may come as a surprise, but I’m not a beauty expert. Yet, when I go into meetings and I talk about our beauty data, I appear to be an expert because the data is so well structured and so easy to read once you become an expert in it, that you can intelligently speak to any sort of topic just by using the consumer as your tool to speak to these things and to explain to others what is being seen in the industry.
Curtis: That’s really interesting stuff. If there’s anything else you wanted to bring up or talk about or you know, let people know how they can reach you guys?
Jordan: Uh, I would say, um, you know, please reach out through our website, socialstandards.com, and check out all the materials we have. I think I’ve really dug into a lot on what we do in terms of the structuring side and how it pertains maybe to how people do consumer analytics. But one thing I didn’t get to mention was, given we built out all the different ways that people talk on social, we actually went and built an influencer tool on the back of it, which is quite unique. Uh, most influencer tools look at the influencer and what they talk about in order to understand whether people should be working with that influencer. But what we did is we flipped the model and we said, let’s look at influencers’ audiences and analyze them in order to understand whether you should be working with that specific influencer. Really flipping the model and looking at influencers by analyzing their audiences. So if I want to target a specific audience, like 25-to-29-year-olds that are into self care and into moisturizer, then I’m going to search for influencers who have that sort of audience, uh, over-indexing relative to others.
Ginette: A huge thank you to Jordan Breslauer for being on the show, and if you’d like to read the transcript or see any of our attributions, go to datacrunchcorp.com/podcast. See you next time.
Photo by NeONBRAND on Unsplash
“Loopster” Kevin MacLeod (incompetech.com)
Licensed under Creative Commons: By Attribution 3.0 License