Deep Learning, Microwaves, and Bugs

Sometimes AI and deep learning are not only overkill, but also a subpar solution. Learn when to use them and when not. Diego from Northwestern’s Deep Learning Institute discusses practical AI and deep learning in industry. He covers insights on how to train models well, the difference between textbook and real AI problems, and the problem of multiple explanations.

Diego Klabjan: One aspect a problem has to have in order to be amenable to AI is complexity, right? So if you have nice data with, I don’t know, 20, 30 features that you can, quote, put in a spreadsheet, then AI is going to be overkill. And actually, it’s not only going to be overkill, it’s going to be a subpar solution.

Ginette Methot: I’m Ginette,

Curtis Seare: and I’m Curtis,

Ginette: and you are listening to Data Crunch,

Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world.

Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company.

We’d like to hear what you want to learn on our future podcast episodes, and so we’re running a giveaway until our next podcast episode comes out. We’re giving away our book Simple Predictive Analytics. All you have to do is go to LinkedIn and tag The Data Crunch Corporation in a post with your suggestion, and we’ll randomly pick a winner from those who submit. If you win and you’re in the US, we’ll send you a physical copy, and if you’re in another country, we’ll send you an electronic copy. Can’t wait to hear from you.

Today, we chat with Professor Diego Klabjan, the director of the Master of Science in Analytics and director of the Deep Learning Lab at Northwestern University.

Diego: My name is Diego Klabjan. I’m a faculty member at Northwestern University in the Department of Industrial Engineering and Management Sciences. I’ve actually spent my entire career in academia. I graduated from Georgia Tech in ’99, and then I spent six years at the University of Illinois at Urbana-Champaign and got my tenure there. And then I was recruited here at Northwestern as a tenured faculty member a year later. So I’ve been at Northwestern for approximately 14 years. I’m the director of the Master of Science in Analytics, actually the founding director, so I established the master’s program back in 2010, and I’ve been directing it since then. And recently, I also became the director of the Center for Deep Learning, which is a relatively new initiative at Northwestern. We’ve been having discussions for the last year and a half, and about half a year ago, we officially kicked it off with a few founding members.

So my expertise is in machine learning and deep learning. I run a very big research program. I advise more than 15 PhD students from a variety of departments, and the vast majority of them do deep learning research. I started with deep learning around six, seven years ago. So I was definitely not one of the earliest faculty members studying deep learning, but I wasn’t that late to the game either. I still remember, approximately six, seven years ago, attending deep learning conferences with like 50 attendees, and now those conferences are like 5,000 people. Just astonishing.

Curtis: That’s crazy, how you’ve seen that grow.

Diego: Yup. And the last thing is, I’m also a founder of Opex Analytics, which is a consulting company. I no longer have much to do with the company, but I also have experience on the business side.

Curtis: Great. So this, uh, the Deep Learning Institute started about a year or two ago, is that right? Did I understand that right?

Diego: Yeah, that’s correct. I mean, so we, we started having discussions around . . . it was about a year and a half ago and we officially launched it about six months ago.

Curtis: That’s awesome. And the response, it sounds like, has been really great. You have 15 PhD students already working for you. How’s it been going?

Diego: It’s progressing very well. So we, we have a few corporations that are already founding members, and we have a long pipeline.

So the center is supposed to be quite practical. I should say the plan is to mostly tackle practical problems by working with several companies. And the mission, outlined in layman’s terms, is to advance deep learning in corporations, whether those corporations are at the early stages of the AI journey or already have some substantial experience with AI. So we want to advance the democratization of AI in these companies. We have a few relatively small companies as members, and we also have a few very large corporations. So we really cover both sides of the spectrum, and we are focusing across all industries. And as I already alluded, the size of the company doesn’t really matter to us.

Curtis: I’d love to dive in and just really get your expertise here. You had mentioned you wanted to talk about a couple of topics, one of them being training, things you’ve seen training models, how you do that well, and these kinds of things. So maybe let’s just dive into the topic and hear your experience on it.

Diego: So if you think about a typical pipeline, and this is not just a deep learning but also a machine learning pipeline, it consists of training and then so-called model serving, which is essentially deploying the model and then monitoring it in operations, or in production. Training essentially consists of creating a model that, quote, fits your data in the best way, right? So, historical data. There are several challenges when it comes to training. I like to distinguish between what I call textbook business problems and everything else. And unfortunately or fortunately, depending on which side you are on, not that many problems in the real world are actually textbook problems. Most of the problems that we deal with are actually not textbook problems.

So a typical textbook problem, for example, would be face recognition. There’s a lot of writing about face recognition: you have a bunch of images and you want to recognize faces, and, yeah, you can pick a, quote, “off-the-shelf software solution” and apply it. And all of the main cloud providers offer solutions for face recognition. So this is what I would call a textbook problem. But as any manager would tell you, “Hey, our business is unique and we are not like any other company.” And to a certain extent, that is true when it comes to AI. A lot of the practical problems have their own peculiar aspects and then need the development of tailored models.

So now, those tailored models can be built on top of existing models, and actually the vast majority of them are, but you need to make enhancements to those models. So let me just give you a few concrete examples where I’ve encountered this. Let’s quickly discuss the IoT space and, let’s say, medical devices, right? You’re trying to predict, say, some kind of faults of medical devices, and you have data that stream from those devices, right? So there are sensors and log files from medical devices. This sounds like a typical textbook-type problem, and yes, it is, but once you realize that faults on medical devices are very rare, then you pretty much end up with an anomaly detection type problem.
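Editor’s note: the rare-fault point can be illustrated with a small sketch. Everything below is an assumption made for illustration, not the setup from the project Diego describes: the readings are synthetic, the 1% fault rate and the 4-sigma threshold are arbitrary, and a z-score rule is just one simple form of anomaly detection. It shows why plain accuracy is misleading when faults are rare, and what "learn the healthy behavior and flag outliers" can look like.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic sensor readings: 990 healthy devices, 10 faulty ones (1% faults).
healthy = [rng.gauss(50.0, 2.0) for _ in range(990)]
faulty = [rng.gauss(70.0, 2.0) for _ in range(10)]
readings = healthy + faulty
labels = [0] * len(healthy) + [1] * len(faulty)  # 1 = fault


def accuracy(preds, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(preds, truth)) / len(truth)


def recall(preds, truth):
    """Fraction of the true faults that were flagged."""
    hits = sum(1 for p, t in zip(preds, truth) if p == 1 and t == 1)
    return hits / sum(truth)


# A "model" that never predicts a fault looks accurate but finds nothing.
never_fault = [0] * len(readings)
baseline_accuracy = accuracy(never_fault, labels)
baseline_recall = recall(never_fault, labels)

# Anomaly-detection view: learn what "healthy" looks like, flag outliers.
mu = statistics.mean(healthy)
sigma = statistics.stdev(healthy)
flagged = [1 if abs(x - mu) > 4 * sigma else 0 for x in readings]
detector_recall = recall(flagged, labels)

print(f"never-fault accuracy: {baseline_accuracy:.2f}, recall: {baseline_recall:.2f}")
print(f"z-score detector recall: {detector_recall:.2f}")
```

Real device data would rarely separate this cleanly; the sketch is only meant to show why the problem stops being a balanced classification task.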

So you have only a few observations of the actual failures of devices. And then you can no longer really apply a textbook-type model, because those models assume that you have a lot of, quote, faults, so that you ideally have a 50/50 split: 50% of the observations that are, quote, not faults and 50% that are faults. But in the real world, that’s not the case. And this implies that you have to make your own tailored changes. Another example that I want to point out: there have been a lot of models and studies around image problems where you have several images attached to just one single, quote, object. So, an example.

And here I’m going to give you a contrived example because I cannot reveal the actual business application that we worked on. So let’s say that you are selling, say, microwaves, right? You have several images of a microwave, and you want a model that automatically assesses the potential price of that microwave, right? A textbook model will tell you, all right, here you have 10 images, and now let me predict the price. But in reality, you typically know that you have several images from the front of the microwave, several images from inside the microwave, and several images, let’s say, from the sides of the microwave. So you can group your images, right? And that’s where, by using these facts, the fact that you can group images based on the side of the microwave and whether it’s inside or outside of the microwave . . .

So by using this, you can actually come up with more efficient and accurate models. And this is, again, something where you have to actually build your own model, right? So the main message that I want to convey here is that there are several problems where just standard models work, but there are also many, many additional problems where you have to essentially build on top of your existing models and use deeper expertise in order to actually tackle a problem.
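Editor’s note: one way to picture the grouping idea is to pool image features per view instead of across all images at once. The sketch below is a hypothetical illustration, not the model Diego’s team built: the view names, the two-dimensional "embeddings" (standing in for the output of some image encoder), and the mean-pooling choice are all assumptions.

```python
from collections import defaultdict

VIEW_ORDER = ("front", "inside", "side")  # hypothetical view labels


def mean_vector(vectors):
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def pool_flat(embeddings):
    """Baseline: treat all images of an item as one unordered bag."""
    return mean_vector(embeddings)


def pool_by_view(embeddings, views, dim):
    """Average embeddings within each view, then concatenate per view.

    Views with no images contribute a zero vector so the output length is fixed.
    """
    groups = defaultdict(list)
    for vector, view in zip(embeddings, views):
        groups[view].append(vector)
    pooled = []
    for view in VIEW_ORDER:
        pooled.extend(mean_vector(groups[view]) if groups[view] else [0.0] * dim)
    return pooled


# Toy 2-dimensional "embeddings" standing in for the output of an image encoder.
embeddings = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
views = ["front", "front", "inside", "inside"]

flat_features = pool_flat(embeddings)
grouped_features = pool_by_view(embeddings, views, dim=2)
print(flat_features)
print(grouped_features)
```

The flat version blends front and inside images into one vector, while the grouped version keeps a separate slot for each view, so a downstream price model could weigh, say, interior condition differently from exterior condition.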

Curtis: In your experience, you’ve done this a couple times now. What’s the level of effort usually required to do something like that, to take something and modify it to fit a specific use case?

Diego: It does require some amount of effort. Usually we’re talking about at least a few months of effort. I alluded that typically you take open source code and then you build on top of it. Now, one problem that we have observed frequently is that open source code has bugs. So first you have to make sure that the open source code actually works on your data before you actually start enhancing it, right? That’s one relatively big challenge. And then even if there are no bugs, we discovered that the documentation quite often does not adhere to the actual code, right? People write code and documentation, put it in a repository, and then they make changes to the code but not the documentation. So then you end up with these challenges where you read the documentation and you experiment with the open source solution based on the documentation, and that creates all sorts of problems. So there are definitely challenges in using open source code and modifying it.
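Editor’s note: one cheap guard against documentation drift is to compare what the docs claim with what the code actually accepts before building on top of it. The sketch below is a hypothetical illustration: `train` stands in for a function from some open source package, and `documented_defaults` stands in for values copied by hand from its README. Only `inspect.signature` is a real standard-library call.

```python
import inspect


def train(data, learning_rate=0.01, epochs=20, shuffle=True):
    """Hypothetical stand-in for a function from an open source package."""
    return {"learning_rate": learning_rate, "epochs": epochs, "shuffle": shuffle}


# Defaults as stated in the (hypothetical) README, transcribed by hand.
documented_defaults = {"learning_rate": 0.001, "epochs": 20, "batch_size": 32}


def find_doc_mismatches(func, documented):
    """Return human-readable differences between documented and actual defaults."""
    params = inspect.signature(func).parameters
    problems = []
    for name, doc_value in documented.items():
        if name not in params:
            problems.append(f"{name}: documented but not accepted by the code")
        elif params[name].default != doc_value:
            problems.append(
                f"{name}: docs say {doc_value!r}, code uses {params[name].default!r}"
            )
    return problems


mismatches = find_doc_mismatches(train, documented_defaults)
for line in mismatches:
    print(line)
```

A check like this only catches mismatched parameter names and defaults; it says nothing about behavior, so it complements rather than replaces running the code on your own data first.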

Curtis: You also mentioned some other things: once you’ve built it, you have the tuning step and making sure it’s stable. And the explainability of deep learning is a major issue where I think there have been some inroads, but it’s probably not where it needs to be yet. What are your comments about those?

Diego: That’s correct. There’s a lot of talk about explainability, and here we actually observed something very, very interesting. When you train your deep learning model, there’s a lot of uncertainty during training, and randomization as well, right? So when you train your model, you get one solution, and then when you train it, say, I don’t know, a few days later, you’re going to get a different solution. That’s because of all the randomness behind the actual training process. Now, what is interesting is that most of the solutions are going to have similar KPIs, or a similar, let’s say, standard KPI. So the solutions that you end up with are not going to be the same, but they’re going to be very similar in terms of the performance metrics.

But we faced one problem where the customer wants explainability behind the model. And then what we found out is that while the solutions have similar KPIs, they actually don’t offer similar explainability, right? Which is very intriguing, and I don’t quite have an answer as to why that is the case. I don’t even have a solution yet, but it’s definitely intriguing that two solutions that give roughly the same, say, accuracy, when it comes to explainability, explain the decisions in a completely different way. So this is definitely very, very interesting. And as I said, we don’t yet have a solution for how to cope with this. But hopefully we’ll come up with something.
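Editor’s note: Diego says he does not have an answer for why this happens, and the sketch below does not claim to provide one. It only shows one simple, well-known way the effect can arise in a toy setting: when two input features carry the same information, training runs that start from different random weights can reach the same predictions while splitting the credit between the features very differently. The data, the tiny logistic regression, and the "weight share" explanation measure are all assumptions made for illustration.

```python
import math
import random

# Toy data with two perfectly duplicated features (feature_2 == feature_1).
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
rows = [(x, x) for x in xs]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(seed, steps=2000, lr=0.1):
    """Logistic regression by gradient descent from a seed-dependent start."""
    rng = random.Random(seed)
    w = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
    b = 0.0
    n = len(rows)
    for _ in range(steps):
        errors = [sigmoid(w[0] * r[0] + w[1] * r[1] + b) - y for r, y in zip(rows, ys)]
        grad_w = [sum(e * r[j] for e, r in zip(errors, rows)) / n for j in range(2)]
        grad_b = sum(errors) / n
        w = [w[j] - lr * grad_w[j] for j in range(2)]
        b -= lr * grad_b
    return w, b


def accuracy(w, b):
    preds = [1 if sigmoid(w[0] * r[0] + w[1] * r[1] + b) > 0.5 else 0 for r in rows]
    return sum(p == y for p, y in zip(preds, ys)) / len(ys)


def share_of_first_feature(w):
    """Naive 'explanation': fraction of the total weight on feature 1."""
    return w[0] / (w[0] + w[1])


(w_a, b_a), (w_b, b_b) = train(seed=1), train(seed=2)
acc_a, acc_b = accuracy(w_a, b_a), accuracy(w_b, b_b)
share_a, share_b = share_of_first_feature(w_a), share_of_first_feature(w_b)
print(f"run A: accuracy={acc_a:.2f}, weight share of feature 1={share_a:.2f}")
print(f"run B: accuracy={acc_b:.2f}, weight share of feature 1={share_b:.2f}")
```

Because both features receive identical gradient updates, the gap between their weights never changes from its random starting value, so the two runs agree on the KPI but disagree on which feature "explains" the prediction. Deep networks are far more complicated than this, so the sketch should be read as an intuition pump rather than a diagnosis.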

. . .

Ginette: A huge thank you to Diego Klabjan for being on the show, and if you’d like to read the transcript or see any of our attributions, go to datacrunchcorp.com/podcast. And remember to send us your ideas for future episodes by going to LinkedIn and tagging The Data Crunch Corporation in a post. See you next time.

Attributions

Music

“Loopster” Kevin MacLeod (incompetech.com)

Licensed under Creative Commons: By Attribution 3.0 License

http://creativecommons.org/licenses/by/3.0/