Joe Kleinhenz talks about his journey from starting out in data all the way to becoming a leader in one of the largest insurance organizations in the United States. We’ll learn about the importance of staying on top of technology, how to win the hearts and minds of nontechnical folks, the pros and cons of centralized versus decentralized teams, how to hold effective conversations with stakeholders, and how to go from individual contributor to leader.
Joe Kleinhenz: One of the more critical skills you can bring to the table is the ability to break down complex ideas into ones that translate for nontechnical folks.
Ginette Methot: I’m Ginette.
Curtis Seare: And I’m Curtis.
Ginette: And you are listening to Data Crunch, a podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company. One of the biggest challenges companies have in getting value from their data is finding the right talent. Good talent is scarce, and building a top-tier team is hard if not impossible for some companies. If you are having this challenge, try out our analytics-as-a-service offering: we bring a fully equipped data science team to bear on your projects, on demand and with no long-term contract constraints. If you want to start seeing success for your data science efforts quickly and economically, head over to datacrunchcorp.com for more details.
Today we’ll be hearing about Joe Kleinhenz’s journey from starting out in data all the way to becoming a leader in one of the largest insurance organizations in the United States. We’ll learn about the importance of staying on top of technology, how to win the hearts and minds of nontechnical folks, the pros and cons of centralized versus decentralized teams, how to hold effective conversations with stakeholders, and how to go from individual contributor to leader. There’s lots to unpack in this episode, so let’s get to it.
Curtis: If we could just start out just by talking about what got you interested in data in the first place, where your journey started, and we can go from there.
Joe: I actually first started thinking about using math to predict future outcomes when I was a teenager. I read a book by Asimov called Foundation, and the whole premise of the book series was using mathematics to predict the future. It was all science fiction stuff at that point, but that’s certainly what first got me interested in it.
Curtis: So it was a, it was a work of fiction that got you interested.
Joe: Yeah, that captured my imagination. At that point I didn’t even know, you know, that data science was a thing. As I got on my path into technology, within IT, I was doing business consulting for a while and got into data warehousing, and this was in the late nineties. From there, I ended up in a part of GE Financial that was doing a lot of direct marketing, and they had a group called database marketing, which was essentially the precursor to data scientists. They had predictive modelers, statisticians essentially, in there that were using, by today’s standards, relatively simplistic tools like linear regression to build, you know, models predicting who would respond to direct-drip marketing offers. I used to joke with people that I ran a team of bad people that decided to call you at dinner with an offer. You can just have the here. Um,
Curtis: And you made those people very effective at, at being bad, I assume.
Joe: Yes. Yes. At that point there were very few restrictions on what you could do. We were even using credit data for some of the algorithms ’cause we were with credit card companies. At the time there weren’t the regulatory restrictions on credit data that there are now, and it’s incredibly predictive. When you combine that with recency-frequency data on purchasing behavior, you could really kind of tune in on, you know, what someone would be interested in.
Curtis: Interesting.
Joe: So, you know, I built one of the first customer analytics systems at GE Financial, you know, took data across 10 disparate GE companies, integrated it together into one household view, and then built all of our models off of that household view with solicitation history across most of the country. Moving on from there, I had a number of roles in GE. Got into the Six Sigma program for a while. Really enjoyed that, you know, and it’s got its own data science aspect to it. Very stats focused, though not, you know, building models; it was optimizing process outcomes.
Curtis: And just curious about what kind of . . . ’cause nowadays it’s SQL. Everything is like Python and R. What were you using back then?
Joe: Yeah. Back in the day it was all SAS. SAS was really one of the only games in town. I mean, there was SPSS, but that was considered a subpar tool by most of the industry at that point. SAS really kind of control . . . was the 800-pound gorilla in the space, at least on the corporate side. And open source wasn’t really that accepted in corporate circles, especially in corporate IT circles, the way it was in academia. I think it was just starting to be a thing there in the mid-2000s or so, but no, it was all SAS. And then I joined Allstate, and as soon as I got in there, I came in through the IT group, I said, “Gosh, there’s gotta be a strong data science group here. It’s an insurance company.”
Curtis: One of the biggest in the country, right?
Joe: Right, right, right. And if you think about it, what’s the core of an insurance company? It’s making good predictions on data. Even going back to the original insurance companies, Lloyd’s of London, it was slide rules and tables and past data to make predictions on how much to price risk for. So I joined Allstate’s quantitative research group, which sounded super geeky to me, and that’s exactly where I belong. As I got in there, I was able to leverage the diverse background I had going into the group. I wasn’t a classic actuary or anything.
They had some significant data science infrastructure problems to overcome. At that point they were a SAS shop and, uh, they were looking to start to leverage machine learning algorithms. What they were finding very quickly at that time (this is back in 2010, 2011, right around there) was that SAS didn’t really support a lot of machine learning techniques. They were getting one-off tools for stuff like glmnets and GBMs and really finding that to be an ineffective way to build models. It was expensive, it was slow, and it didn’t scale. And as I’m sure you know, a lot of machine learning algorithms are data hungry, and what I mean by that is they become more predictive the more data you’re able to feed them, up to a point. There’s diminishing returns. And so my boss at the time gave me an interesting problem. He said, “Hey, I want you to figure out how to get the right environment and ecosystem in here to allow us to scale effectively and spend almost no money.”
Curtis: That’s a good, good challenge.
Joe: Yeah. I said, “Okay. I can, I can look at that.” And so I leaned into some of my infrastructure IT background and started doing some research. Went out to a conference, KDD, uh, Knowledge Discovery and Data Mining, and saw a great presentation. I’m gonna mess up the guy’s name. I want to say it’s John Langford. He was working at Yahoo Labs at the time, and I got to talk to him a bit. And they were really doing some cutting-edge stuff at Yahoo at the time. As we chatted, he convinced me that really going into open source and moving into Hadoop was the right path to get something that can scale effectively and cheaply. At the time, web companies couldn’t put in a massive SAS farm. At the scales they were dealing with, it just wasn’t cost effective. They’d go broke. Even Google. Right. And so I went back and I started, basically, an information campaign to try to win hearts and minds, you know, some data science propaganda on why big data made sense, why Hadoop was our future, why machine learning was our future.
Curtis: Can we stop and talk about that a little bit? Because a lot of people come across this, right? It’s like trying to explain the benefits of data or changing to a better architecture or whatever it might be to people that are not data people and that’s, that can be a significant challenge. So I’m curious how you approach this.
Joe: The way I approached it was: if we were to try to scale up traditionally, this is what it would take, and it’s just not tenable to continue to scale at that rate. And this is why we want to use these algorithms, showing the difference in lift. And you can take all kinds of examples in your business. If you’re building models for pricing, for fraud detection, for, you know, search engine optimization, or marketing even, you can show a greater lift from a more advanced technique. Then it becomes an accounting exercise to say, “Okay, if I apply this technique and on average it’s x times more predictive than technique y, I should be able to generate this additional lift, and this additional lift is going to translate into a certain amount of dollars.” Then it becomes an ROI calculation. And that’s a very simple conversation to have with business folks. They understand that.
They understand that it’s essentially to bring it out. That’s something that I’ve really taken to heart throughout my career: one of the more critical skills you can bring to the table is the ability to break down complex ideas into ones that translate for nontechnical folks, and if you can take these complex concepts and put them into terms that, you know, your business partners can understand, you’re going to get their support. If you tell them, “Hey, I need to go run these GBMs at this scale because my Gini score is going to be so much better,” that’s going to make no sense to them. If you tell them, “Hey, I need to go build these algorithms with this new technique because it’s going to generate an extra $4 million for the company in your first year,” that’s, that’s an easy discussion to have.
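[Editor’s sketch] The lift-to-dollars reasoning Joe describes can be written down as a short calculation. This is only an illustration under stated assumptions: the baseline value, the relative lift, and the project cost below are hypothetical numbers chosen so the result lands on the $4 million figure from the conversation, and the function names are made up for this example.

```python
def incremental_value(baseline_value: float, relative_lift: float) -> float:
    """Extra dollars from improving a baseline outcome by a relative lift.

    relative_lift is a fraction, e.g. 0.08 means the new technique is
    assumed to improve the outcome by 8% over the current one.
    """
    return baseline_value * relative_lift


def roi(incremental: float, cost: float) -> float:
    """Return on investment as (gain - cost) / cost."""
    return (incremental - cost) / cost


# Hypothetical inputs, not figures from the episode.
baseline = 50_000_000.0    # annual dollars driven by the current model
lift = 0.08                # assumed relative lift of the new technique
project_cost = 1_000_000.0 # assumed cost to build and run the new approach

extra = incremental_value(baseline, lift)
print(f"Incremental value: ${extra:,.0f}")
print(f"ROI: {roi(extra, project_cost):.1f}x")
```

The point of the sketch is the framing: the model metric never appears in the output, only dollars and a return multiple.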
Curtis: That’s interesting. So you mentioned that you went in and you kind of knew that if we use these methods, we’ll get this particular amount of lift over what we’re currently doing. Is that because you’d run some tests before and you had those numbers or . . .
Joe: Yeah, I mean we’d run some stuff on a smaller scale and said, “Okay, can we extrapolate out with a reasonable comfort level?” And was it a statistically significant, stratified, randomized sample, dah, dah, dah? No, but it was good enough that we felt comfortable going to our business partners with it.
Curtis: Sure. Yeah. It’s good enough to show some sort of lift and to convince people like this is the right path, right?
Joe: Right, right. You know, another data scientist would probably look at that and go, “Ehhhhh, maybe not.”
Curtis: Right. But again, you’re not convincing data scientists, right?
Joe: Right. If I’m talking to a data scientist, they look at the Gini score difference, and they go, “Yeah, yeah. Wow, that’s amazing.” So, um, from there it was a matter of finally getting to the CIO. You know, at this point in our company, we, there was no open source running. I mean, not everything was still Unix OS, you know, ah some, uh, analytics. And so that was a really hard sell, the open source nature. And the way I got the green light was we said we’re going to do it in an isolated lab, so it cannot touch our production systems. I was going to use end-of-life servers and free open source software.
So really the only cost was the human capital to install and configure this stuff. And I had to get a third-party consultant in that actually had experience doing that because none of our IT folks knew how to do any of it. And so we took 10 end-of-life servers, and for our benchmark we said, “Let’s, uh, let’s take the hairiest ETL job that our Ab Initio cluster does today,” which was building a, uh, a transactional view of customer history. We had this job that ran on the main Ab Initio cluster that took a 10% sample across all 50 states and built this transactional history. And it took nine hours, and it was one of the biggest jobs, a dim-the-lights type thing; nothing else could run. And we recreated that in MapReduce, and we ran that same job in 30 minutes at a hundred percent sample size, the whole country, not 10%.
So once we took those results back to our technology and business partners, it was, “Okay, how much . . . now we can do this.” And that really kind of started us down that journey. From there, it became how do we expand this out to recreate an entire analytic ecosystem and really start to build up next-generation deployment pipelines so that we can take our data science insights and rapidly deploy them out to the business at scale. At the time, there weren’t a lot of data science deployment tools in the industry that weren’t tied to a vendor. At that point, we were an R and Python shop, and we said, “R and Python might be it for now, and for the foreseeable future (and that, that was five years ago, but they’re still our primary tool sets), but it’s not always going to be.” We wanted to avoid proprietary vendor lock-in. We didn’t want to end up a slave to a new SAS.
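[Editor’s sketch] To make the benchmark numbers concrete, the arithmetic behind “nine hours at a 10% sample” versus “30 minutes at a 100% sample” can be worked through as follows. The runtimes and sample fractions come from the conversation; the projection of the old job to a full sample assumes runtime grows roughly linearly with data volume, which is an assumption added here, not something stated in the episode.

```python
# Figures quoted in the conversation.
old_hours, old_fraction = 9.0, 0.10   # Ab Initio job: 9 hours on a 10% sample
new_hours, new_fraction = 0.5, 1.00   # MapReduce job: 30 minutes on 100%

# Wall-clock comparison, ignoring that the new job processed 10x the data.
wall_clock_speedup = old_hours / new_hours

# Data processed per hour, expressed as a fraction of the full data set.
old_rate = old_fraction / old_hours
new_rate = new_fraction / new_hours
throughput_speedup = new_rate / old_rate

# Assumption: if the old job scaled linearly, a full run would take this long.
projected_old_full_hours = old_hours / old_fraction

print(f"Wall-clock speedup: {wall_clock_speedup:.0f}x")
print(f"Throughput speedup: {throughput_speedup:.0f}x")
print(f"Projected old runtime at 100%: {projected_old_full_hours:.0f} hours")
```

Under that linear-scaling assumption, the job is about 18 times faster on the clock while handling ten times the data, which is roughly a 180-fold gain in throughput.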
And so we built this agnostic, uh, model deployment platform that was cloud based, where we could take a model object and serialize it and then put it up as a RESTful API endpoint for front-end systems. And then through that, extrapolating out the decision layer out of the system. So that was a big win for us, being able to do that. From there it was a matter of, “Okay, now let’s start to rapidly deploy data science models across the company and really start to drive real change.” What we found as a group is that the more victories we got under our belts, showing what data science can do, the greater the appetite of the company was, to the point where eventually it’s across the entire enterprise: every function has a data science group now focused on it.
Curtis: So how did you go about doing that? I mean in this company you’re working in, right, here are some problems we think we can solve and this is how we’re going to do it and this is what we think the result is going to be. How did you find those opportunities? Like, how’d you go about thinking about that?
Joe: I had a great mentor early on in my career that gave me some advice that has stuck with me to this day. He said that the hard stuff is the easy stuff and the soft stuff is the hard stuff. And what he meant by that is that math, algorithms, engineering, programming, those are the logical things. You can put enough brain power behind it, and you’ll figure it out. People, organization, politics, behavior, humans, those are hard. Those aren’t logical systems, you know? Those are emotions. They don’t follow a set pattern, and they’re certainly not going to behave in a predictable manner, not at the individual level anyway. And so what we did took a lot of partnership with the business, and so it was a matter of how do you integrate into those different business functions and really become partners with a seat at the table to say we’re part of you.
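[Editor’s sketch] The pattern Joe outlines (serialize a model object, then expose it behind a RESTful endpoint) can be illustrated with Python’s standard library alone. This is a toy under stated assumptions, not a description of the platform his team built: the `LinearModel` class, the `/score` path, the JSON payload shape, and the port are all hypothetical, and a real deployment would add authentication, validation, and a production-grade server.

```python
import json
import pickle
from http.server import BaseHTTPRequestHandler, HTTPServer


class LinearModel:
    """Hypothetical stand-in for a trained model object."""

    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def predict(self, features):
        return sum(w * x for w, x in zip(self.weights, features)) + self.bias


# Training side: serialize the model object to bytes.
model_bytes = pickle.dumps(LinearModel(weights=[0.5, -1.0], bias=2.0))

# Serving side: deserialize once at startup.
# Only unpickle bytes you trust; pickle can execute arbitrary code.
MODEL = pickle.loads(model_bytes)


def score(request_body: bytes) -> bytes:
    """Turn a JSON request like {"features": [...]} into a JSON response."""
    payload = json.loads(request_body)
    prediction = MODEL.predict(payload["features"])
    return json.dumps({"prediction": prediction}).encode("utf-8")


class ScoreHandler(BaseHTTPRequestHandler):
    """Minimal HTTP wrapper: POST /score returns the model's prediction."""

    def do_POST(self):
        if self.path != "/score":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = score(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Start the endpoint; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), ScoreHandler).serve_forever()
```

Calling `serve()` starts the endpoint locally; a front-end system would then POST a body such as `{"features": [2.0, 1.0]}` to `/score`. The caller depends only on the JSON contract, not on how the model was built, which is one way to read the “agnostic” goal mentioned above.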
And you know, there was a great talk Facebook gave at Hadoop World, I want to say it was Hadoop World 2013, that really kind of stuck with me. They talked about how they thought about data science at Facebook, and they said that they looked at centralized versus decentralized organizational models, and they said decentralized was good because it made you embedded into business functions. You know, you were part of that department. You knew what their problems were. You were very proactive in solving them. You could see things clearly because you lived it. Centralized models are good because you can bring scale to bear. You have career pathing for your data scientists. You can create large chunks of organizational power and really shift resources and have a lot of synergy across the groups. Both structures have their weaknesses as well.
With decentralized, you have no scale. You have no community across groups. You have very little career pathing. With centralized, you are very reactive to your business need. You’re trying to sit outside of the groups and say, “Okay, here are things I think you need, or here are ways I think you should do things.” And you’re telling them how to do it and not partnering with them to come up with the ways to solve our real needs. They actually took a hybrid approach, and we did something similar, to say, “Okay, let’s create centralized groups, but let’s really dedicate functions within that centralized group to different business groups.” So how do you have a matrix group where you still embed in that business function but retain a central order to provide that scale, that career pathing, that community, et cetera? Getting the right organizational structure in place was really one of the keys, and, you know, it changes over time. I don’t know that I would try that structure for a small company starting out. You have to figure out, you know, as you’re moving through your journey, where it makes sense to move from a centralized group to a hybrid structure. I don’t know that you’re going to be totally successful if you try to do the hybrid piece off the bat.
Curtis: In this hybrid approach, you’re sitting down and having conversations with the various stakeholders in different business departments. How do you approach those conversations? How do you come up with solutions together?
Joe: Yeah. Part of it is being in their world, eating their dog food every day, starting to see where their challenges are, and working with their leadership as they’re coming up with their strategic plans. What are your pain points? As I sit and work with you, here’s where I’m seeing opportunities where data science can really help you. Really kind of understanding where the biggest bang for the buck is going to be and how you can enable your business partners’ goals and make them successful. I mean, at the end of the day, you take your business partner and make them a rock star because they blew away all their targets and goals. That’s a win, and it’s a win for both you and your business partner. And I’ll guarantee you, if you do that for one group in your company, the other major leaders for other functions are gonna look at that and go, “I want some of that. That looks great.”
Curtis: So now you’ve been, I guess, VP for a couple of years now. Are you doing the same kind of work that you’ve been doing up to now, just, you know, running this center of excellence, so to speak? Or is it a little bit different now?
Joe: You know, I think it’s, it’s always a little different. There’s other parts of our organization that are doing more long-term research and partnerships and all that good stuff, and they’re always finding new things to share out to the group: “Hey, here’s some new techniques we should look at, or tools,” et cetera. And that makes it a constant battle to stay current. How can we deploy these new abilities, and really, what are the opportunities to help the business get better?
Curtis: And, uh, do you find, uh, that you do . . . I’m assuming that you probably do less coding now than you used to. It’s, it’s more as you grow, as you grow in leadership, it’s more about meetings and talking to people and less about the implementation. So I’m curious, do you enjoy that more or . . .
Joe: Oh yeah. It’s, it is not. Yeah, it’s, it’s interesting. That was an interesting pivot point in my career: the day I realized I’d stopped doing actual work.
Curtis: Or at least the work that you thought of as work, right?
Joe: Yeah, exactly. When I stopped doing the thing that I thought that I loved and that I thought of as actual productive work, and I started creating PowerPoints and communications and strategy documents and sitting in eight hours of meetings a day. But what I quickly came to realize was, as an individual contributor, there’s a very finite limit to the amount of value you can generate. You can be the most brilliant person on the planet, and unless you’re one of those one-in-a-trillion folks, you know, like, uh, Jobs or Gates or somebody, there are absolute limits to how impactful you are going to be. If you have a team behind you, if you’re able to lead change, if you’re able to drive strategy, there’s a multiplier effect to the value and the impact that you can have on a company that you can’t have as an individual. And I’ve got great technical folks that work for me, and they do amazing things, and I’m not undercutting their value in any way, shape, or form. But for me, I knew I was not that technically brilliant. I said, “Hey, I’m gonna reach a limit as an individual contributor, and really I want to drive real change for these large enterprises.” At the time I was working at multinationals, and, you know, really getting into leadership and driving strategy around data science was the way to go.
Curtis: In terms of advice, right? You’ve been through this process now. You’ve seen a lot of things. Um, if there are listeners who, you know, are looking to see how they might make that change from individual contributor to more of a strategic leadership role, what would you say? Like, what are your best words of advice for those people?
Joe: So I guess, one, I would make sure that it’s something that you enjoy. As humans, we spend a majority of our waking hours at work. If you’re going there and it’s not something that excites you, that makes your passions, find something else to do. Life’s too short. And I would say also make sure your ability to communicate, your ability to see the bigger picture, and your ability to think abstractly are there. It’s not formulaic; it’s not always logical. And so those soft skills, those behavioral skills, become very important. The ability to influence, which we talked about earlier on. If you feel comfortable with that part of your skillset, you’re able to build those muscles up. That’s what I’d say is critical. You’ve gotta be able to speak the language around the technology, around the science part of it. You have to be able to understand how to apply it and see where the opportunities are in the bridge between the business and the data science pieces. But the human aspect is, in my experience, really critical.
Ginette: A huge thank you to Joe Kleinhenz for telling us about his journey in data science. We hope you enjoyed the show. One last note, we are a small team, and you can’t imagine how helpful and energizing it is for us to get reviews on iTunes and other podcatchers. It really does mean a lot to us. So if you like the show, and you’ve been listening for a while, please give us a review, and we look forward to hearing from you.
Attributions
Music
“Loopster” Kevin MacLeod (incompetech.com)
Licensed under Creative Commons: By Attribution 3.0 License
http://creativecommons.org/licenses/by/3.0/