Machine Learning with Max Sklar

What does it look like to shape a social media platform? Max Sklar can speak to that, having worked on machine learning at one for nearly its entire existence. In today’s episode, he talks about what’s worked and what hasn’t at Foursquare.

Max Sklar: So Foursquare’s a very positive platform, but the Japanese language is off the charts: you have 97% positive tips. Whereas the Russian language has the lowest; it’s only 85% positive. Across different languages, different cities and towns and geographies, we just see different patterns.

Ginette: I’m Ginette,

Curtis: and I’m Curtis,

Ginette: and you are listening to Data Crunch,

Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world.

Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company.

Ginette: “If you’re interested in learning more about algorithms and machine learning, like we talk about on our show today, you should check out Brilliant—it’s a great place to dig deeper. Their classes help you understand algorithms, machine learning concepts, computer science basics, and many other important concepts in data science and machine learning.

“The nice thing about Brilliant is that you can learn in bite-sized pieces at your own pace, and with a bit of consistent effort, you can tackle some really tough subjects. Their courses have storytelling, code-writing, and interactive challenges, which makes them entertaining, challenging, and educational.

“Sign up for free and start learning by going to Brilliant.org slash Data Crunch, and also the first 200 people that go to that link will get 20% off the annual premium subscription.”

Now onto today’s show. Today we chat with Max Sklar, previously a machine learning engineer and currently the Engineering and Innovation Labs Advisor at Foursquare. He also hosts The Local Maximum podcast, a show about machine learning and data science concepts used to analyze news, current events, and various situations.

Max: Foursquare has changed a lot in the, it’s hard to believe, nine years since I joined back in 2011. For the first few years, it was entirely a consumer application company. It had one app that was called Foursquare. You could check in and tell your friends where you are. And then there was also a recommendation city guide aspect of it, which is now called Foursquare City Guide, and that’s the area where I worked. One of the areas where I had the most long-term success, which I define as putting out a product that people liked, that a lot of people use, and that has held up years later, because that’s pretty rare, there’s a lot of rare things about that product, is the venue ratings. And so that’s the algorithm for determining the 1–10 rating on a venue in Foursquare, whether it’s a restaurant or a bar or anything like that.

And so I kind of dove into this maybe my first and second year at Foursquare. And it felt like a lot of responsibility because it was like, “well, I’m new here, and I’m trying to come up with a global score.” And I don’t know if it was taken as seriously as it should have been, because a lot of people at the time were focused on personalized recommendations, which we did a lot of too, and which are also interesting, but we realized very quickly that we needed a global score that was just Foursquare’s opinion on how good something is. And we tried all sorts of different configurations in terms of how to make it work. And one of the main ingredients that I’m most proud of, because of how it’s held up, is how we built sentiment analysis on Foursquare Tips, which are two or three sentences.

And so that’s just a fascinating case study, because what we ended up doing was, Foursquare has a like and a dislike button, and also a “meh” button. So three buttons that you could push to rate a place. But using those alone didn’t give us good enough signal for the rating, because it was just too easy to like something; we get so many likes on there. But we could take those ratings and pair them up with people who also left text, who also left a tip. And that gave us an internal training set that we could use to train a machine learning algorithm that does sentiment analysis on the tips, which we needed because the pre-trained ones that were already out there were not working on our two- or three-sentence tips. And what was great about this was we got data for so many languages, I think 97 different languages.
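A minimal sketch of that labeling step, assuming hypothetical field names (user IDs, venue IDs) rather than Foursquare’s actual schema: each tip inherits the like/meh/dislike rating the same user gave the same venue, and tips without an explicit rating are simply left out of the training set.

```python
def build_training_set(ratings, tips):
    """Pair explicit like/meh/dislike ratings with tip text.

    ratings: list of (user_id, venue_id, label), label in {"like", "meh", "dislike"}
    tips:    list of (user_id, venue_id, text)
    Returns (text, label) pairs for tips whose author also rated the venue.
    """
    rating_by_pair = {(u, v): label for u, v, label in ratings}
    return [(text, rating_by_pair[(u, v)])
            for u, v, text in tips
            if (u, v) in rating_by_pair]

# Toy usage: the third tip has no matching rating, so it stays out.
ratings = [("u1", "cafe", "like"), ("u2", "cafe", "dislike")]
tips = [("u1", "cafe", "Great espresso, friendly staff"),
        ("u2", "cafe", "Slow service and cold food"),
        ("u3", "cafe", "Try the scones")]
print(build_training_set(ratings, tips))
```

The point of the join is that the button press supplies the label and the tip supplies the text, so no one has to hand-annotate sentiment in any of the 97 languages.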

And it gets retrained every week. It’s been retrained every week for like eight years or something like that, maybe six years, because we redid this whole thing in 2014 with my colleague Stephanie Yang; we have a post about it. I think it’s called “Getting to the Perfect 10,” on the Foursquare engineering blog. Doing sentiment analysis on people’s text ended up working pretty well. We combine that with some other signals, but that was the real breakthrough. And once we got the test and training sets, the algorithm was just kind of extra on top of that: an elastic net logistic regression with a four-gram model, essentially, plus some language detection on the backend. So some pretty standard pieces in there, but the way it all came together worked really well.
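The pieces named here (four-gram features, logistic regression, an elastic net penalty) can be illustrated with a small from-scratch sketch. This is a toy, not Foursquare’s implementation; the learning rate and penalty strengths are made-up values.

```python
import math
import re
from collections import defaultdict

def ngrams(text, n_max=4):
    """All 1- through 4-grams of a lowercased tip."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(words[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(words) - n + 1)]

def train(examples, epochs=300, lr=0.5, l1=0.001, l2=0.001):
    """examples: (text, label) pairs, label 1 = positive, 0 = negative."""
    w = defaultdict(float)
    for _ in range(epochs):
        for text, y in examples:
            feats = ngrams(text)
            z = sum(w[f] for f in feats)
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid
            for f in feats:                  # gradient step on log-loss
                w[f] += lr * (y - p)
        for f in list(w):                    # elastic-net shrinkage:
            w[f] *= 1.0 - lr * l2            #   L2 weight decay
            if abs(w[f]) <= lr * l1:         #   L1 soft-threshold sets
                del w[f]                     #   small weights to exactly zero
            else:
                w[f] -= math.copysign(lr * l1, w[f])
    return w

def predict(w, text):
    """Probability that a tip is positive under the learned weights."""
    z = sum(w.get(f, 0.0) for f in ngrams(text))
    return 1.0 / (1.0 + math.exp(-z))
```

The soft-threshold step in the shrinkage loop is what pushes uninformative n-grams to exactly zero weight, which is the sparsity he describes wanting for a neutral word like “spaghetti.”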

Curtis: So what does it take to be a 10 on Foursquare? ‘Cause I’m assuming you’re doing some NLP on the two or three sentences you have to work with. Are you looking for really positive language or certain words, or just in general, how does that work?

Max: It does look at phrases. It’s actually harder to trick the algorithm than you think. Another reason why tips work is that Foursquare has a pretty good system of getting rid of spam tips. That’s why tips work so much better than likes and dislikes: the spam system filters them out pretty efficiently. So if you want your place to be rated in the nines, what you need is a lot of positive tips, people writing good things about it, and then likes help, and people actually going to your place helps, which, these days, a few people might have to temper their expectations about for a few months, but ah . . .

Curtis: Do you guys have to account for that? I mean, in a data set like this, this is sort of an unheard-of event, right? Are you going to account for what’s happening with COVID?

Max: All this stuff is normalized. And, by the way, a lot of these algorithms have not been touched for years, and they’ve held up pretty well, which is, again, very surprising. In fact, the sentiment analysis has gotten better. I’ve never in my life, before or since, written code that actually got better over time, but it’s because when people added more data to Foursquare, more tips, more languages started working, and it learned more words and stuff like that. So, I mean, that’s how machine learning should work, right? But I’m so excited that it actually happened in this one case. A lot of this stuff is normalized by your area. So visits are normalized: we’re not going to compare how much something in New York City is visited with how much something in a rural town is visited.

So it’s normalized to area, and the same thing with how positive something is. Foursquare’s a very positive platform, but the Japanese language is off the charts: you have 97% positive tips. Whereas the Russian language has the lowest; it’s only 85% positive, or, to flip it around, it goes from 3% negative to 15% negative. Across different languages, different cities and towns and geographies, we just see different patterns. So we try to normalize by that. And I presume that normalization is helping a little bit with the pandemic data, or the changes that have gone on over the last couple months.
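As a hedged illustration of that kind of baseline normalization (the baselines below come from the percentages he quotes; the scoring function itself is an assumption, not Foursquare’s actual method), you can score a venue by how far its tips sit above or below the typical positivity for their language:

```python
def normalized_positivity(venue_tips, baselines):
    """Average positivity of a venue's tips relative to each tip's
    language baseline. venue_tips: list of (language, is_positive);
    baselines: language -> expected share of positive tips."""
    diffs = [(1.0 if positive else 0.0) - baselines[lang]
             for lang, positive in venue_tips]
    return sum(diffs) / len(diffs)

# Baselines from the episode: Japanese tips run 97% positive, Russian 85%.
baselines = {"ja": 0.97, "ru": 0.85}

# The same raw 90% positivity reads differently against each baseline.
ja_venue = [("ja", True)] * 9 + [("ja", False)]
ru_venue = [("ru", True)] * 9 + [("ru", False)]
```

So a venue with 90% positive tips scores below average in Japanese, where 97% is typical, but above average in Russian, where 85% is. The same idea applies to normalizing visit counts by city versus rural area.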

Curtis: Yeah. Even with that swing, I think that’s one good sign for humanity, right? I mean, the majority of all tips are positive as opposed to negative, you know? That’s good.

Max: Yeah. Yeah. I don’t know if it says much about humanity. I think it says more about the platform, where originally it was called a “tip,” not a “review,” because it was: this is something that you should do while you’re there. Like, here’s the best thing to order, or here’s something to check out, or talk to my friend who works at the bar, something like that. So it kind of trained people to be a little more positive.

Curtis: Got it. Okay. So you’re sort of nudging humanity towards the positive side of things and uh,

Max: Okay. We can, let’s say that. We can say that.

Curtis: Right. Okay. I mean, it’s interesting that this has gotten better over time. I’m used to hearing about algorithm drift and that kind of thing, where things get worse over time. Was there something you did in the code, or is the problem set here just constant enough that you don’t get data elements that maybe the algorithm doesn’t understand, or something like this, to cause a drift?

Max: So I don’t claim that the ratings have gotten better over time. I’m pretty sure there’s some drift in the ratings themselves, but what has gotten better is the sentiment analysis algorithm. And I think the reason is that the data set keeps growing, and the algorithm is not too complicated. That does help us a little bit, because I feel like there’s less that can go wrong. ‘Cause it’s just, you know, a four-gram model with logistic regression, and, you know, elastic net means that a lot of words are not given a sentiment value. So if you look at four-grams, that’s lists of four words, right? And most lists of four words should not have a score. So ideally, I would like to have a word like “spaghetti” be neutral, but a word like “delicious” be positive.

But if you look at the actual data, it doesn’t actually pan out that way all the time, but that’s kind of how it works. In fact, “spaghetti” should be, well, it’s usually neutral, but because I’m using elastic net, I want it to be exactly zero, and that doesn’t always pan out. But I feel like it’s because it’s simple enough, and the data set only increases. We’re constantly getting new tips from Foursquare users. Maybe not as much as we would like, because, you know . . . people are often surprised that Foursquare is still alive and kicking; it is, we still get paid, but it’s maybe not at the forefront of people’s minds right now. But I just think that it could be a mixture of its simplicity and the fact that we’re getting data from the source that we’re going to apply the algorithm to.

Curtis: That’s good. Yeah. I mean, complexity is one of the biggest enemies of good data science, so that is very interesting. Now, you also mentioned some success in building inference into your advertising product, which I wanted to touch on and see what your approach was there and what you saw.

Max: Yeah. So that’s Foursquare’s attribution product. I’m sure a lot of people are not familiar with what attribution is or what ad attribution hopes to achieve. I certainly wasn’t familiar with it when I jumped into it. Foursquare has this panel of where people go, and advertisers want to know, “Okay, I put out this ad, it doesn’t matter where I put the ad. I put the ad somewhere, and I told people to go to my store.” I usually use Starbucks as an example because everybody knows Starbucks. Everybody likes Starbucks. I don’t think that Starbucks is a client of Foursquare; that could be wrong. But Starbucks puts out an ad to try to convince people to go to Starbucks. And then they notice that people who saw the ad are going to Starbucks. But the problem is that they also targeted people who had a propensity to go to Starbucks to begin with.

So that raises the question: did I just happen to target people who were already walking in the door and claim credit for it, or did I actually cause those people to go to Starbucks who wouldn’t otherwise go? And that changes the whole equation on whether I’m getting my return on investment on the ad or not. So that’s what attribution is there to measure. And it is a very difficult problem, fraught with pitfalls, and, I think, in the industry, fraught with a lot of numbers that you can’t trust very well. And so what we ended up doing was to say, “All right, we’re going to build a model that predicts how likely each individual is to go to Starbucks on a given day. And we’re going to base that model on everything we know about them: their age, their gender, whether they’ve been there before, all of that stuff. Then we’re going to see how that changes if they’ve seen the ad or not.”

So, you know, there are a couple of ways to do this. You can actually make exposure to the ad a parameter in the model and see where that parameter ends up. Or you could make a separate model for people who did not see the ad, and then have a sense of, “Okay, how many do we expect to go?” And then, given that this group saw the ad, compare how many we expected to go versus how many actually went, and we get a measurement that way. But it took many months of testing and trial and error and debugging, making sure that these models didn’t have, you know, problems, making sure that the data set was representative of what we were measuring. Because even if you’re off by a tiny bit, it throws off the whole calibration. So that product was a little tough, because a lot of the clients didn’t understand what we were doing.
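The second approach he describes, fitting a visit model on unexposed users and then comparing expected versus actual visits among the exposed group, can be sketched as follows. The toy control model and its numbers are invented for illustration; they are not Foursquare's actual model.

```python
def measured_lift(exposed_group, visit_prob):
    """Incremental visits attributed to the ad.

    exposed_group: list of (features, visited_bool) for users who saw the ad.
    visit_prob: a model fit on *unexposed* users, mapping features to the
    probability of visiting with no ad. Returns actual minus expected visits;
    a negative result is the "negative lift" case discussed below.
    """
    expected = sum(visit_prob(features) for features, _ in exposed_group)
    actual = sum(1 for _, visited in exposed_group if visited)
    return actual - expected

# Toy control model: regulars visit 60% of days, everyone else 10%.
def toy_model(features):
    return 0.6 if features["regular"] else 0.1

exposed = [({"regular": True}, True), ({"regular": True}, True),
           ({"regular": False}, True), ({"regular": False}, False)]
# Expected = 0.6 + 0.6 + 0.1 + 0.1 = 1.4 visits; actual = 3.
print(measured_lift(exposed, toy_model))  # roughly 1.6 incremental visits
```

This is where the calibration point bites: if the control model overestimates baseline propensity by even a little across millions of users, the measured lift can swing from positive to negative.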

It was hard to tell. I feel like maybe the communication channels with the clients included people that didn’t understand what we were doing. Whereas I hope that if we got together one-on-one with the people who are writing the checks for these ads and their analytics teams, we’d be able to figure this all out. But, you know, it goes through sales, it goes through this and that. It ended up working pretty well. People were excited about it. But the big ending to that came when Foursquare bought this company Placed back in 2019, early last year, which feels like 10 years ago. It feels like ancient history now, but Foursquare bought Placed. The CEO of Foursquare, David Shim, was actually the CEO of Placed. And so that was a company that only did attribution. And when they merged the two companies, they kind of figured, “Okay, how do you do it, and how do we do it?”

It was like, “Oh, we actually do something pretty similar.” So that kind of helped the merger. Yes, it was validating for me. I actually wasn’t at Foursquare at the time; I was somewhere else for like six months. And I was like, “well, I don’t know what’s going to happen at Foursquare.” Then this happens. And it was sad for the people who worked on the product, ‘cause then you have to do the whole merging-of-products thing, which is not always, well, maybe some people find it fun, but I’m pretty happy I didn’t have to do it. But, uh,

Curtis: So if you were to do that again, I mean, that was a big project and lots of things involved there. What would you do differently to get to a result faster or tackle it in a more efficient way?

Max: When it comes to the actual learning algorithm, keeping it as simple as possible is what I would do. I feel like maybe Foursquare’s data pipelines are pretty complicated. I don’t know if there’s a place that does it any better, but a lot of my time at Foursquare, in terms of our technology, is spent in the data pipelines, debugging and figuring certain things out. Maybe that’s just not my thing. So that took up a lot of time. There were a lot of moving pieces, that’s all I can say. There were a lot of moving pieces, and if I could find a way to get a result sooner, I would, but so many things come up with this that are crazy. One of them is negative lift: that’s when we get a measurement that says that ads are actually turning people away from your store.

And, you know, people don’t believe that that’s true. I actually think that sometimes it is true, but I think most of the time it was, you know, “Hey, 30% of the ads, we’re seeing something like that, just don’t work at all. So if that’s the case, then you have some that go a little bit below zero and a little bit above.” And so it was very tough to explain to people, like, why are 15% of them below zero? There was one that was very negative, and it was because it was a haircut place showing people ads to get another haircut when they had just gotten a haircut, and they measured it for, like, three weeks. So of course they didn’t get a haircut. So things like that come up that are just crazy and that are never one-size-fits-all.

Curtis: It’s interesting. You know, sometimes the data shows something people don’t believe is true, but there’s something behind that, right? Usually, unless it’s a bug. I mean, obviously you have to sanity check what you’re doing and make sure that your pipelines are working and that kind of thing. But, ah,

Max: Exactly. Oftentimes there’s some explanation, but you do have to use kind of the scientific method to say, “Okay, what are all the possibilities here?” versus just believing one thing and then basing all your decisions on that one thing. Because, in this case, there are so many moving pieces.

Curtis: I think that’s a hole a lot of people fall into, you know. Challenging people’s assumptions is a difficult place to be, which is why data science is so hard. I mean, it’s hard technically, but it’s also hard just communicating and persuading and all this, so.

Max: Yeah, absolutely. That’s been my experience.

Curtis: Yup. I think a lot of people, you know, they go into data science ’cause they’re like, “yeah, I can solve interesting problems.” And then you hit this wall of, like, no one believes what you’re telling them, or the politics or whatever. There’s a lot of that that, unfortunately, you have to deal with. So that’s great. And you told me you are now on the Labs team, which sounds really cool. And before this whole COVID thing hit, you were working on some interesting things, which hopefully will come back online here once everyone can go back out.

Max: The Labs team, so, that’s why I came back to Foursquare. This is the brainchild of Dennis Crowley, who is the founder and former CEO of Foursquare. And basically the idea is that we work on long-standing ideas and concepts that we think would make city life better, that would kind of make people in the tech industry turn their heads and go, “Hmm. Oh, wow. That’s interesting,” and also showcase Foursquare’s technology. And what I like about it is it’s a very different way to work than most engineers are used to, but it’s sort of my preferred way to work, which is to have a small group of people, knock around ideas, build stuff, put it out, see if it works, try things again and again, and sort of juggle a lot of different ideas and a lot of different threads.

And then see what happens over a long period of time. It’s rare, because companies are very hesitant to invest in that kind of team when they don’t know what the end result is going to be. And again, the benefit here is that we have the founder who wants to do it, who has seen through experience that this yields good results over time. We’re working on something called Marsbot Audio. We might call it Marsbot for AirPods now, to emphasize that, hey, a lot of people are walking around with headphones and AirPods and things like this. And we worked for months to get it so that as you walk by a certain place, we can trigger an event. We want to trigger a sound as you walk by a certain place. And so we have a few kind of standard Foursquare things in there.

Like, we point out, “Oh, this is the highest-rated ice cream shop in the village that you’re walking by right now.” You know, things like that. And then we’re also allowing people to leave sounds, leave MP3s, that they can record on their phone. And I’m also making some content packs, where we’re trying to build some city tours and things like that, but I’m imagining people doing kind of sound effects, and I don’t know what people can do with it. But the core technology is showing, “Hey, Foursquare equals location, and so we know how to do this really well.” But then there’s also the aspect of, like, “Hey, we’re going to try to build something that’s going to bring content to you in a new way, that’s going to raise eyebrows, that’s not just what everybody else is doing.”
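The core trigger, playing a sound when the user comes within some radius of a venue, can be sketched with a plain haversine distance check. The venue list, radius, and field layout here are hypothetical, not Marsbot’s actual design:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def triggered_sounds(user_lat, user_lon, venues, radius_m=75):
    """venues: list of (name, lat, lon, sound). Returns the sound for
    every venue within radius_m of the user's current position."""
    return [sound for name, lat, lon, sound in venues
            if haversine_m(user_lat, user_lon, lat, lon) <= radius_m]

# Toy usage: the user is about 40 m from the ice cream shop and
# roughly 11 km from the bar, so only the shop's sound fires.
venues = [("Village Ice Cream", 40.7281, -74.0000, "highest_rated.mp3"),
          ("Uptown Bar", 40.8281, -74.0000, "far_away.mp3")]
print(triggered_sounds(40.7281, -74.0005, venues))
```

A real headphones-first app would layer debouncing, battery-aware location sampling, and user settings on top, but the geofence check itself is this simple.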

Curtis: Yeah. Now is it, is it just sound, are you doing video as well? That’s locally based or what’s the thought there?

Max: Yeah, it’s just sound, because it is meant to be a headphones-first app. So it’s sort of like an app that is only in your ears. Now, there’s an app as well on your phone where you can change the settings and stuff. You could say, hey, I don’t want to hear anything, or I want to hear everything, or I only want to hear stuff that my friends leave, things like that. But we’re trying to make it something where you never have to take your phone out.

Ginette: Thank you, Max Sklar, for being on our podcast. If you’d like to see this episode’s transcript or attributions, head to datacrunchcorp.com/podcast.

Attributions

Music

“Loopster” Kevin MacLeod (incompetech.com)

Licensed under Creative Commons: By Attribution 3.0 License

http://creativecommons.org/licenses/by/3.0/