Jason Kolaczkowski has worked in both a large-company data shop and in a company trying to help large companies fix their problems. He shares his perspective as senior director of healthcare analytics at NextHealth and former Kaiser employee on the importance of streamlining data definitions—and many other helpful insights.

Jason Kolaczkowski: But rather than sort of be the good steward and try and anticipate the problems and make sure that the business never feels any pain from those problems, I found eventually it’s sort of the opposite—that you have to expose the business to the pain of those problems to create the level of urgency required to get them to participate. 

Ginette Methot: I’m Ginette,

Curtis Seare: and I’m Curtis,

Ginette: and you are listening to Data Crunch,

Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world.

Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company.

Jason: I actually have a humanities degree. My bachelor’s is in political science, where I focused on political theory, so I wasn’t even on the hard statistical side of that discipline. And then I got a master’s degree in public management, so basically an MBA for government, which is a lot more pragmatic than the theory degree. And it just so happened that a professor took me under his wing. He wasn’t my formal advisor. He was just a professor, and we sort of latched onto each other, and he came out of the Princeton school of thought around econometric analysis. And so I got into econometrics and the econometric study of policy analysis and eventually was a graduate-level teaching assistant, so teaching statistics to graduate students, and that sent me down the rabbit hole. I got excited about this sort of marriage that I had formed, again, somewhat organically. I’d love to say it was intentional, but it wasn’t. It was around, sort of, here’s how I think government is supposed to work, and that was theory, and here’s how it actually works, and that’s the practice.

And that got me into that sort of more scientific methodology of problem solving, where you’re trying to piece together either experiments or synthetic experiments with data in order to figure out what exactly is working. And I loved the ambiguity of that. Do stand-your-ground laws work? Well, I don’t know. What does “work” mean? And so now I’m kind of in that theory space. What is success for a stand-your-ground law? Now we’re talking about values and what are your near civic disciplines and that kind of thing. And then you’ve got to go figure out how to answer that question using data. And so that sort of holistic approach to problem solving is what started me down this path of doing some consulting in the Beltway, really strategic-level business process consulting. I spent a little bit of time running my own company and got very well versed in the need for data-driven decisions when your personal ability to put bread on the table is reliant on the quality of your decisions.

It was just a small company, and it wasn’t even related to healthcare; it was actually in publishing. But then the economy turned in 2007, and I wasn’t sure paper publishing and small business was what I wanted to do given where the economy was going. My policy degree looked pretty good to government agencies, and I landed at the Colorado Medicaid agency, the Department of Health Care Policy and Financing, in their budget office as an analyst. And so then I was applying those econometric principles directly to healthcare. And I’ve been in healthcare since: three years at state Medicaid, then eight years at Kaiser Permanente Colorado, where I ran their payer-side analytics at the end of my tenure, and then I ended up at NextHealth Technologies, where I’m the senior director of healthcare analytics and ultimately sort of responsible for listening to the market and listening to our clients and figuring out, you know, what is the future direction of our platform, the product that we bring to the healthcare analytics space. It probably doesn’t hurt that I, in some sense, would have been a potential client. I would have been the guy across the table at Kaiser, you know, vetting whether or not NextHealth Technologies was a good fit for us. So I think I can bring a bit of empathy to our company from that perspective. That’s interesting.

Curtis: That’s quite the journey. That’s awesome. And so tell me about the company and what you guys do.

Jason: NextHealth Technologies is in the business of optimizing interventions in order to improve the affordability of healthcare. We’re an analytics platform that really takes on the triple discipline of, you know, targeting, measurement, and optimization. And so what we’re doing, and I’m sure we’ll get into this in more detail, is working in an environment where you often don’t have the luxury of doing a randomized controlled trial. It’s not like we’re on the R&D side of pharmaceuticals or something like that. You know, you can’t withhold breast cancer screenings from half your population to see if your intervention is working, right? We’re in the business of boxing statistical best practice and making sure that people can create appropriate control groups and apply rigorous (not earth-shattering, I’ll be honest with you, but rigorous) statistical methodologies that allow you to answer with a higher level of certainty whether or not you think your intervention is actually causal.

You know, obviously causality is the gold standard and in some ways impossible to get to. But that is our world. You create an intervention in the platform, and if you don’t know where to create that intervention, we’ve got some targeting machine learning that allows you to do cluster analysis and find populations that you might want to target for, say, an ER reduction intervention trying to divert people to urgent care. You run that study, we receive the feedback, and we pull claims data to figure out for whom it is working and for whom it is not working. We have an AI engine inside the platform that allows you to optimize based on that feedback loop. And then you can retarget, and you get to finer and finer grains of people for whom your intervention, you know, the frequency and content and modality of your message, is working and for whom it’s not.

So the idea being that, you know, we’ve got all this waste in our healthcare system, you know, at 18% of GDP, and a lot of that is because we’ve got, you know, 20 to 40 to 60% waste just in these small things. Yeah, there are the big, heavy, you know, end-of-life care problems to solve. But also, you know, if you oversize every single intervention that you do in order to cast a wide net, you’re creating waste in the system. And so we’re interested in making sure that death by a thousand paper cuts is a thing of the past.

Curtis: That’s really interesting. And it sounds like, I mean, there’s a lot of stuff to dig into there in what you were talking about. So maybe we can start with this: you’re in kind of a leadership role there, and to get this done right requires a lot of data. It requires a lot of data strategy. And especially in healthcare, I’m assuming there are regulations you have to deal with, you know, and making sure that the data is in a usable format and everybody’s on board with, you know, what the data means and all of these kinds of things. How do you approach that and make all of that work?

Jason: Well, it’s interesting. I’ll answer this kind of twofold. I’ll talk about it from my Kaiser experience but also from my experience here at NextHealth, because they relate, right? So from a vendor perspective, we’re working with multiple clients, and those clients have their own, you know, data governance issues and opportunities as well as successes, right? And the trick from a vendor perspective is that, in fact, by definition, all of those clients have a slightly different amalgamation of successes and opportunities and misses, right? So it’s not like client A has been successful in this sphere and client B has been successful in the same sphere. They probably haven’t. And so you have this variety of places where the data is clean and ready to go and able to be normalized, and places where it’s not, because you’re really trying to get it across multiple clients.

That is, you know, now going back to the Kaiser side, a function of being one of those, in this case, payers and trying to figure out how to get data governance to be, you know, a core competency embedded into your DNA at your organization.

So I’ll circle back around to the vendor question in a second, but linger on Kaiser for a moment. When I was at Kaiser, you know, obviously the data scientists, the analytics teams, and the IT shops lived the pain of poor data governance on a day-to-day basis. And so getting the business to take seriously their role in, you know, say, something like data definitions, right? What is a member? That might sound like a trick question, but there are reasons to have multiple definitions of a member. Is a member a subscription? Is a member an individual human with a heartbeat? Is a member, you know, the term of the piece of paper, right?

And what I’m getting at is that you can have dual eligibles, right? So is it the actual policy? There are reasons to count members in different ways. And so you’ve got to bring the business into those conversations in order to articulate which definition is appropriate for the type of question that you’re trying to answer. And so in those situations, right, it might be like, okay, well, the classic consulting idea of, like, identify the problem, gather your stakeholders, solve it, I actually found was missing a step. And really what you needed to do is expose the problem, and that feels like a lengthy process. The business wants to move fast, but rather than sort of be the good steward and try and anticipate the problems and make sure that the business never feels any pain from those problems, I found eventually it’s sort of the opposite: you have to expose the business to the pain of those problems to create the level of urgency required to get them to participate.

Because let’s face it, data definition conversations, just to keep running with that example as a slice of governance, are boring. They’re really not super exciting. And so how are you going to get them to participate? Well, the way you get them to participate is sort of two-pronged. Here’s an answer to your question from before we defined what “member” was, or here’s three answers to your question. Either we got the one wrong, or we presented all three and now you’re forced to choose, and then you get into this sort of reconciliation effort. How come the MIS analysis is producing a different result than that analysis, and a different result than that other analysis?

“I just need to go tell my boss whether or not we’re, you know, improving our membership renewal rates.” Well, it’s because we haven’t defined this, and so now you need to come participate. So you bring the problem to bear on the business in a way that makes sense and cuts through the technological jargon of it. If you say, “Now we’re going to have a joint application development session. We’re going to run a JAD where, yeah, we’re not building an application, but, you know, we’re going to bring business and IT together. We’re going to use all the same principles,” they’re all, “What are you talking about?” Right? It’s like, well, no, we produced an answer that says we have this many subscriptions but also this many belly buttons. Which one do you want to present and why?
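To make the “subscriptions versus belly buttons” point concrete, here is a minimal Python sketch of how one enrollment extract can produce three different member counts. The rows, the field layout, and the `count_members` helper are all invented for illustration; nothing here reflects how Kaiser or NextHealth actually stores or counts membership.

```python
# Hypothetical enrollment rows: (person_id, subscription_id, policy_id).
# Every identifier below is invented for illustration.
enrollment = [
    ("p1", "s1", "commercial-001"),  # subscriber
    ("p2", "s1", "commercial-001"),  # dependent on the same subscription
    ("p4", "s1", "commercial-001"),  # second dependent
    ("p3", "s2", "medicare-002"),    # dual eligible: one person...
    ("p3", "s3", "medicaid-003"),    # ...enrolled under two policies
]


def count_members(rows):
    """Return the member count under three competing definitions."""
    return {
        # "Belly buttons": distinct humans
        "individuals": len({person for person, _, _ in rows}),
        # Distinct subscriptions, regardless of how many people each covers
        "subscriptions": len({sub for _, sub, _ in rows}),
        # One count per person per policy, so dual eligibles count twice
        "policy_enrollments": len({(person, policy) for person, _, policy in rows}),
    }


counts = count_members(enrollment)
print(counts)
# {'individuals': 4, 'subscriptions': 3, 'policy_enrollments': 5}
```

None of the three numbers is wrong. Putting all of them in front of the business is one way to force the choice of definition that the question at hand needs.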

And are you going to want to do this in the future? And so rather than talk about it ahead of time, give them something to react to that forces them to make a decision. That was sort of the broad brush, but that’s sort of the general path. But then circling that back to, you know, vendors and the vendor relationships: you’ve got these different flavors of maturity and areas of success across all of these different clients, who may or may not have solved the same problems. It really becomes about normalization. Can you create a standard of data input that is achievable across multiple clients? You’re building a framework rather than a structure. So is your data framework flexible enough that people can transmit the data that are required in order to get their data running through our platform?

And then does it have the flexibility to deal with changes, right? Provider groups buy other provider groups, and so TINs change, tax IDs change. You know, ICD-9 went to ICD-10 in 2015, right? So do you have the right reference tables that allow you to update the reference table rather than the core of your code? Those are the types of problems that you have to solve when you’re dealing with, sure, they might all be ice cream, and they might even all be vanilla, but there are 27 different flavors of vanilla. One’s got cherry and the other’s French and the other’s, you know, high cream, and they’re all slightly different. So you’ve got to build in a schema that can be a bit more ubiquitous.
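The reference-table idea can be sketched in a few lines of Python. This is an assumption-laden illustration, not NextHealth’s schema: the table names, the `normalize_claim` helper, and the tax IDs are hypothetical, and the two diagnosis pairs are simplified one-to-one examples, whereas real ICD-9 to ICD-10 crosswalks are often one-to-many.

```python
# Reference tables are plain data. When a code set or a tax ID changes,
# you add or edit a row here and leave the processing logic alone.
# The diagnosis pairs are simplified one-to-one examples; the TINs are invented.
DIAGNOSIS_CROSSWALK = {
    "250.00": "E11.9",  # type 2 diabetes without complications
    "401.9": "I10",     # essential hypertension
}
TIN_CROSSWALK = {
    "00-0000001": "00-0000009",  # hypothetical group acquired by another group
}


def normalize(value, crosswalk):
    """Return the current value for a code, or the code itself if unmapped."""
    return crosswalk.get(value, value)


def normalize_claim(claim):
    """Return a copy of the claim with diagnosis and TIN brought up to date."""
    return {
        **claim,
        "diagnosis": normalize(claim["diagnosis"], DIAGNOSIS_CROSSWALK),
        "tin": normalize(claim["tin"], TIN_CROSSWALK),
    }


old_claim = {"member_id": "m1", "diagnosis": "250.00", "tin": "00-0000001"}
new_claim = normalize_claim(old_claim)
print(new_claim)
# {'member_id': 'm1', 'diagnosis': 'E11.9', 'tin': '00-0000009'}
```

The design choice is the one described above: a provider acquisition or a code-set change becomes a new row in a table rather than an edit to the core code.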

Curtis: Yeah, that’s a lot. And you’ve been through that process several times, it sounds like. So once you do that, you know, and you’re working on that with clients, I imagine, on an ongoing basis, but once the data is in a good enough spot, you mentioned several things that you do. You’re using AI and machine learning, scientific methods, right, to find these cost savings and these kinds of things. Could you just maybe take us through an example of that to give us sort of a concrete understanding of, you know, what methods are you using? How accurate are they? What’s the difficulty level in trying to tackle these problems?

Jason: Sure, sure. Let’s run with the same example I talked about earlier. Let’s say we’re running an ER reduction program. We’re trying to divert people to urgent care, and we’re trying to get avoidable ER visits down, which is good from a quality perspective but also good from a cost perspective, right? It’s also probably good from a service perspective if you can direct people to appropriate venues of care and save them some money. So the first thing you’ve got to do is understand what independent variables matter. You know, I’ll get into some statistical jargon here. Low-level statistical jargon, but statistical jargon nonetheless. And to your point, you know, this is one of those places where big data actually becomes an asset rather than a liability, right?

Big data is a liability when you’re trying to figure out how to ingest it all, how to structure it all. But once you’ve solved that problem, as hard as that is, now you’ve got an incredibly diverse and rich set of data with which to do data exploration. So which attributes, which independent variables that describe these potential membership cohorts, really matter and are predictive of whether or not someone’s going to show up in the ER? That’s the first thing: you’re trying to figure out which attributes are important. You can do that through various, you know, clustering analyses and random forest models to create clusters and all those types of things. And I’ll intentionally be a little bit vague around which specific methodologies we use, because it’s a little bit of secret sauce. But once you’ve done that, once you’ve got the attributes nailed down, then the best practice, right, is propensity score matching.

You’ve got to figure out how to create, oftentimes, a synthetic control group, again, because in our highly regulated industry it is often the case that you do not have the luxury of truly doing a go-forward randomized controlled trial. And again, I’ll use that breast cancer screening thing, right? So yes, you’re not going to withhold breast cancer screenings from half the population, but it is potentially as dubious to withhold a, you know, nudging methodology that gets a high-risk portion of your population to go get breast cancer screening, to hold that back and only give it to half your population to see if it’s improving rates. There’s a little bit of that, you know, do no harm. And so if you think this is going to be a good thing, you know, your metrics and what have you are going to require that you really push forward.

And so oftentimes you don’t have the luxury of creating a control group through a randomized controlled trial, and that’s where propensity score matching comes in. We actually have a two-pronged process of doing propensity score matching across all of these attributes that we found earlier that allows us to get, you know, highly accurate, well inside the 10% confidence interval that is best practice around attribute matching. So you’ve now got a synthetic control group that resembles your population across all sorts of attributes related to that individual, including, you know, things like geography and all those types of things, and of course disease state and benefit structures, but beyond that. And now you’ve got a way to measure based on very well understood, you know, not new, cutting-edge statistical methods. You’ve now got a trial and you’ve got a control, and you’re just measuring the difference between those, the lift, using whatever the appropriate methodology is.

You know, if you’ve got no ability to go below zero, then you might be getting into Bayesian analytics. If you have reason to believe that there’s a normalized curve for whatever the dependent variable is that you’re measuring, then sure, you can just do plain old linear or logistic regression. You know, that part isn’t the secret part. It’s, can you get the attributes nailed down in terms of importance? Can you get the propensity score matching done really well so that you’ve got synthetic controls? And then you measure. The trick after that is how fast you can pull the feedback loop in on a granular, member-by-member basis. And we do that well through the way that we structure the data on ingestion, being able to track that member across their clinical or insurance journey so that we know that they did go to the ER or they didn’t, they did go to the ED over a time horizon or they didn’t do anything over that time horizon.

And you get that feedback loop built in so that you can optimize in something closer to real time, so that, you know, you reduce the cost, the time value of money, of not making a decision around whether to cut this program or keep it or to optimize it in a certain way by doing, you know, more digital urgent care transmissions, like, here’s a map to your phone of, you know, urgent cares around you, versus, you know, a phone call, say. Making those types of optimization decisions has to do with the speed with which you can refresh your platform and look at results and understand at the member ID level, at the individual level, who’s doing what. And so that becomes sort of the back end of that method.
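The matching-then-measuring steps described above can be illustrated with a small Python sketch. This is generic textbook propensity score matching, not NextHealth’s two-pronged process, which Jason deliberately leaves vague. It assumes the propensity scores have already been estimated elsewhere (for example, by a logistic regression on the chosen attributes); the member IDs, scores, outcomes, caliper value, and function names are all invented for illustration.

```python
def match_controls(treated, pool, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on propensity score, without replacement.

    treated and pool map member_id -> propensity score (assumed precomputed).
    A treated member stays unmatched if no control is within the caliper.
    """
    available = dict(pool)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score distance
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs


def er_rate_difference(pairs, er_visit):
    """Treated ER visit rate minus matched-control ER visit rate."""
    if not pairs:
        raise ValueError("no matched pairs to compare")
    treated_rate = sum(er_visit[t] for t, _ in pairs) / len(pairs)
    control_rate = sum(er_visit[c] for _, c in pairs) / len(pairs)
    return treated_rate - control_rate


# Invented data: members who received the urgent-care nudge, and a candidate pool.
treated = {"t1": 0.30, "t2": 0.55, "t3": 0.80}
pool = {"c1": 0.28, "c2": 0.33, "c3": 0.57, "c4": 0.10, "c5": 0.95}
# 1 = visited the ER during the follow-up window, 0 = did not.
er_visit = {"t1": 0, "t2": 1, "t3": 0, "c1": 1, "c2": 0, "c3": 1, "c4": 0, "c5": 1}

pairs = match_controls(treated, pool)
print(pairs)                                # [('t1', 'c1'), ('t2', 'c3')]
print(er_rate_difference(pairs, er_visit))  # -0.5
```

In this toy data, `t3` has no control within the caliper and is left out of the comparison, which is the usual trade-off in matching: tighter calipers give closer matches but fewer of them. A real analysis would also check covariate balance across the matched groups and attach uncertainty to the difference, neither of which this sketch attempts.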

Curtis: Got it. That’s awesome. That’s really interesting. So, I mean, we’ve talked about, you know, this data governance piece and then the actual modeling, statistical piece, and then there’s the problem of, and you alluded to this and talked a little bit about it, how do you actually get the insights and the actions that need to happen? How do you get that to people and help them take those actions?

Jason: Yeah, and I think this is a broader discipline problem. It even crosses industries when you think of the discipline of data science and analytics, which you could apply to the financial services sector or to healthcare or to logistics, you know, UPS and FedEx, or airlines or whatever it is. I actually think this is a fairly ubiquitous problem. It starts with being able to present data visually in a way that tells a story. And then it has to do with having the right folks accessing that data, whether that be, in our case, internal to the client or on our own client services team, who can interpret that information and tell a compelling story. That’s what it comes down to, right? My opinion is that the more that machine learning and artificial intelligence start to not only take over the computation, which happened a long time ago, but actually select the methodologies, which is what’s happening now, the more the role of the data scientist in an analytic, interpretive capacity means being able to speak to the business.

And so what’s left is the strategic-level conversation. To me, the future of the data scientist isn’t in, you know, being able to make these methodological selections. You need to know what they are. You need to be able to quality check what you’re working on. You know, if you’re working in H2O.ai, you’ve got to be able to understand what the inputs and outputs are to that particular package. But beyond that, what’s left becomes more and more strategic: the questions that you can’t answer without running them against the values of your business. So for example, this is a trite example from my political science days, but to illustrate, right? Let’s say that you are running for Congress because you’re some sort of masochist, and you go, okay, I want to put a poll in the field and, you know, get a sense of whether or not my potential constituents agree with my education policies.

And what I get back is, you know, 45% of the likely voters agree with my policy and 55% disagree with my policy. And this is a very simple example, because really what happens is 25% don’t care anyway. So you’ve got 45 for and 55 against. I can’t tell you what to do strategically, because what that comes down to is: do you believe that you’re supposed to be a representative, in which case you should change your policy? Or do you believe that you should be a leader, in which case you should go out and try and convince your constituents that your policy is correct? There’s a value-based judgment under what you’re trying to accomplish. That’s where the data stops. And so what is left for the data scientist is to take this output, which needs to be visual, which needs to be easy to interpret, and then convey that in terms of business value.

And sometimes that’s as simple as ROI, but sometimes it’s far more complicated, and it has to do with tradeoffs around things that the business, conceptually at least, equally values. We value representing our people and we value leading our people. So how do I choose between them in this particular circumstance? That’s the human component that I don’t see going away. In fact, I see it getting emphasized as the more technical components get taken over by machines. So that’s a bit esoteric, but to bring it back to, you know, sort of the brass tacks of it, what we do then is, while we sell software as a service, we also have a very robust client services team that we bring alongside every implementation. And that’s to make sure, yes, that you’re using the tool to the most effect, but also that we’re asking you those questions as a client. Okay, so you want to measure 60 different interventions in our platform. Why? What is the outcome? Like, I understand that you’re wanting to get a baseline of whether or not these programs are working, but back to my initial conversation about how I got into analytics: what does “working” mean? What is success to you? Without that context, I can’t give an interpretive message that gives you strategic insight that allows you to make a decision.

Curtis: That’s right. And, I mean, this is a huge problem across, as you’ve noted, many industries. This is, you know, a classic example of hard skill versus soft skill, right? Like, how do you train people or teach people to do this well? Have you guys had success in that? You have a team that, it sounds like, does this. What kind of training and things do they go through to be able to do this?

Jason: Well, we find that we pull a lot from PhDs in social science disciplines: psychology, the more statistical side of political science or public management (you know, not me, I terminated at a master’s degree; there are people who are better at this than I am), sociology, any of those social sciences, because they have to live in that space of ambiguity, unlike someone who’s measuring the results of a chemical reaction. Now actually, you know, we do have a data scientist who comes with a PhD in chemistry out of Stanford, but the types of research questions that he has been asking are more in that ambiguous space. And so it’s that comfort with ambiguity that, to me, and this is just my opinion, has been sort of the defining characteristic in finding those very savvy people with a wide range and tool belt of technical skills who understand that there is still art. And the trick is they know when to apply the art and when to apply the science, and the science is taking up more and more of it.

And I think that’s a good thing. I think empirically based decision making is a good thing. But as that pie for art shrinks, using that art when it’s appropriate becomes more important. And so to me it’s about finding the right people, which is the part of the question I just answered, but then also exposing them to the right conversations. Put them in the room where the C-suite is asking the difficult questions that are hard to answer without bringing in soft skills. Get them to be the fly on the wall. Get them to understand why it’s hard to execute on a data-based decision. It’s not because people are stupid, and it’s not because they don’t want to. It’s because the tradeoffs are really difficult. And so the more that you can expose people to those things and then have the debriefs about, you know, how could we have interjected in this space, you know, give people the opportunity, right? So it’s the education, the opportunity, and the experiences. If you can cover those three bases, you’ve got a fighting chance, that’s for sure.

Curtis: That’s interesting. I know we’re just a little bit past time, but I’m just curious about your perspective on this. In the future, is this going to be something where, you know, it’ll be better to have two different people sort of fulfill both roles, the technical and then the soft-skill portion? Or do you think it’s important for both of those skills to reside in the data scientist role?

Jason: I think there is specialization that’s appropriate. We certainly have it, right? So our data science team is a sub-team within our overall product organization, separate from engineering, right? So there still is that hard coding, making sure that the right statistical methods are getting programmed correctly into the platform so the platform does what it’s supposed to do with that flexibility that I was talking about. And that’s a very different skill. But for that sort of out-facing, client-facing, you know, asking-the-next-level-of-question data scientist, I actually think it’s about having both of those embodied in the same person and then organizing your operation so that you don’t need an army of them: the engineering team is really good at covering, you know, 80% of the football field, and then maybe your account managers on the other end of that continuum are covering their 80% in terms of relationship management, so that your data scientists have just a little bit of both.

Like, so they don’t have to cover the entire field in both directions in terms of all of the technical skills and all of the client services work. So they can focus on that place exactly where the handoff happens. I think about the analogy of a track meet. If you’ve ever watched a relay race, right, you know what happens when the two people on the same team come to each other: they start running together. The hard handoffs slow you down. And so what your data scientist is there to do is to actually lengthen the handoff. And in this world of, like, you know, lean thinking and all that type of stuff, it might be a little bit counterintuitive, but lengthening the handoff is a good thing. Lengthening that handoff, which is what that data scientist does between the technical side and the social side of it, the client management side of it, reduces the risk that something is going to be dropped.

And what ultimately loses you the race is whether or not you drop that baton. If you drop that baton, you’re out. Everyone else has run past you. And so by lengthening that handoff just at that point where technical meets service, I think that’s where the focus can be. They can’t do everything, but can you have a limited number of people who are those magic unicorns, and set up your operations so that you only need a limited number of people, because you aren’t asking them to do the end-to-end? You’re asking them to manage that space where the two disciplines touch.



“Loopster” Kevin MacLeod (incompetech.com)

Licensed under Creative Commons: By Attribution 3.0 License