Last-Mile Logistics Analytics—for Everyone Who Isn’t Amazon

Today we speak with Professor Ram Bala, an expert in supply chain management analytics, particularly last-mile delivery. He has very interesting insights into how today’s supply chain is evolving. He talks about the methods and algorithms he uses, the specific challenges inherent in last-mile logistics and delivery, how pricing factors in, and how everyone is trying to catch up to Amazon.

Ram Bala: Then there is this great opportunity to actually use the data effectively. But there is a long way to go in terms of coming up with the right algorithms, both on prediction as well as optimization, to actually get this done in a meaningful way. And if you look at the landscape today in terms of industry, I would say very few companies are actually there yet. Right? I mean, Amazon obviously is a clear example of the leader in the space, but everyone’s trying to get there as well.

Ginette Methot: I’m Ginette

Curtis Seare: And I’m Curtis

Ginette: And you are listening to Data Crunch

Curtis: A podcast about how applied data science, machine learning and artificial intelligence are changing the world.

Intro: Today we speak with Professor Ram Bala, an expert in supply chain management analytics, particularly last-mile delivery. He has very interesting insights into how today’s supply chain is evolving.

Ram Bala: My name is Ram Bala. I’m a professor at Santa Clara University as well as a data science leader at CH Robinson, which is the largest logistics marketplace in North America. I’ve been working on topics in supply chain-related data science, even before it was called data science, for the past 15 years. I got my Ph.D. in operations research and supply chain from UCLA, and I’ve been working on these problems both for companies and within the academic context, where I’ve been working on research problems. More recently, I think there’s been a lot of excitement in this space, and that’s where my involvement with both startups and larger companies has gone up. I came into the CH Robinson fold as a consequence of an acquisition. I was part of a startup that was working on last-mile logistics and how to improve that.

Curtis Seare: Got it. That’s awesome. And the space that you’re in is really interesting. Could you contextualize for the audience the problem set that you’re focused on?

Ram: So I think one of the major things that has changed in logistics is the growth of e-commerce and also personal mobility. I mean, if you think about logistics as a larger concept that covers moving both people and products, what’s really happened is that the availability of real-time data has had significant consequences for how we are able to predict as well as optimize how we move things, and that has also raised the bar in terms of customer expectations. We expect to get a ride to go somewhere within five minutes; we expect to get a product within a day. Those expectations have been set by specific companies: Uber in the case of personal mobility, and Amazon in the case of products. Having set the stage, everyone’s now trying to be competitive with them, which means that in the product space, certainly all e-commerce companies as well as companies that were in brick and mortar are trying to achieve that same end goal, which is: how do I get products to consumers quickly and, at the same time, not spend too much money? Right? That’s the core problem. Now, doing that is hard. It’s become easier simply because we have access to real-time data in terms of location as well as, you know, where products are at any given point. But it is a hard problem to solve.

Curtis: There’s some intricacy in, you know, routing and pricing and the interplay there. Can we dive into those details a little bit?

Ram: Absolutely. So I think routing problems have been around ever since transportation’s been around, and everyone is aware of the core trade-offs. Complex routes are more expensive; that’s the cost side. Customers are willing to pay for faster service; that’s the pricing side. And how exactly you set prices is an interplay of those two. We also understand that if you’re moving a vehicle from point A to point B, you need to utilize it to the maximum extent possible. You need to have as many people or products in it as you can to ensure that you are minimizing waste. Now, while we know those as concepts, historically it’s been very hard to actually optimize price and cost and minimize waste effectively, simply because the data you had was static. You had data available at frequencies of maybe a quarter, maybe a year, and often you had to rely on making those decisions based on static data.

Ram: It also tended to be quite imprecise. Right? All that has changed, as I mentioned earlier, with the availability of real-time data on where a truck might be, where a product might be, and where the consumer might be. The consumer does not have to be located at home; it could be the office, or it could be a business consumer with a registered address. And eventually we’re going to move to a phase where you don’t even have to be located at a specific address. Using location services, I could be in a parking lot somewhere and decide that I want something delivered to me, and shippers will figure out a way of actually getting that to the consumer in a reasonable amount of time.

Curtis: That’s interesting.

Ram: Yeah. And so what this means is that, now, this is obviously complex, because to do this requires you to understand the landscape in terms of: What routes are possible? Where is the inventory located? You know, how do I move things around? And do that in a cost-effective way. So the good news is there’s a lot of data available, and there is this great opportunity to actually use the data effectively. But there is a long way to go in terms of coming up with the right algorithms, both on prediction as well as optimization, to actually get this done in a meaningful way. And if you look at the landscape today in terms of industry, I would say very few companies are actually there yet. Right? I mean, Amazon obviously is a clear example of the leader in the space, but everyone’s trying to get there as well.

Curtis: Got it. Talking about those methods, you had mentioned that some of these are more traditional methods, like Bayesian methods, versus maybe a deep learning approach. Can you talk a little bit about what’s working in the space that you guys are playing around in?

Ram: Right. Actually, I’d mentioned Bayesian methods as being more modern in that sense. I’ll tell you why, and deep learning of course is used as well. I think one of the things with traditional methods of prediction is that they are concerned with averages. What is the average time it takes to get somewhere? I think the big change now is that in any kind of delivery, you’re also concerned about what the variance around that average is, right? It takes on average about three days to go somewhere, but it turns out that that’s only an average number. 30% of the time it takes four days. Is that acceptable? Maybe not. Right? Which then means that I need to understand exactly what’s happening with the distribution around that average. I also need to understand that these distributions can be long-tailed. Not everything is normally distributed, which means we need to use a mix of methods to actually get there: some old-fashioned methods, but also more modern ones.

Ram: When I say Bayesian methods, often you’re trying to understand how this variance might look based on some prior reference points that you might have, and to use that information intelligently. Obviously, deep learning is something that will come into play as well. But I think it’s really a mix of methods that are focused on understanding the true shape of the demand and the time distribution, and being able to factor that information into service levels. Because we want to get something to a consumer 98% of the time or 95% of the time at least, not on average, because that mean number doesn’t mean anything, from a consumer’s viewpoint certainly.
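The contrast between an average and a service level can be sketched in a few lines of Python. This is only an illustration, not the method used at CH Robinson: the lognormal transit times, the prior counts, and the function names `service_level_time` and `beta_posterior_mean` are all assumptions made up for the example. The first function reads a 95% service level off a long-tailed sample; the second shows one simple Bayesian idea, a Beta prior on the on-time rate updated with new observations.

```python
import math
import random
import statistics


def service_level_time(transit_days, level):
    """Smallest observed transit time that covers `level` of deliveries
    (nearest-rank quantile)."""
    ordered = sorted(transit_days)
    rank = math.ceil(level * len(ordered))
    return ordered[max(rank, 1) - 1]


def beta_posterior_mean(prior_on_time, prior_late, observed_on_time, observed_late):
    """Posterior mean of the on-time rate under a Beta prior and binomial data."""
    a = prior_on_time + observed_on_time
    b = prior_late + observed_late
    return a / (a + b)


# Hypothetical long-tailed transit times in days (lognormal, not normal).
rng = random.Random(0)
transit_days = [rng.lognormvariate(1.0, 0.5) for _ in range(10_000)]

mean_days = statistics.mean(transit_days)
p95_days = service_level_time(transit_days, 0.95)
print(f"mean transit time: {mean_days:.2f} days")
print(f"95% service level: {p95_days:.2f} days")

# Assumed prior from similar lanes: about 90% on time, weighted like 20 shipments.
# Assumed new data on this lane: 12 on time, 8 late.
print(f"posterior on-time rate: {beta_posterior_mean(18, 2, 12, 8):.2f}")
```

With a long tail, the time you can promise 95% of the time sits well above the mean, which is why a promise based on the average alone would be missed often.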

Curtis: That’s interesting. So how do you guys do this? Are you writing your own algorithms and things like this, or is this something where you can take work that’s been done in open source and tweak the models? What does that look like?

Ram: Ah, so it’s a combination. I do believe that open source models have their value, but often I have to put them together in ways that are different. Sometimes I might decide that, you know, it’s too difficult to truly capture, let’s say, the time distribution using the open source methods, and doing that is too labor intensive. So I might decide that the way to go is to just write this from scratch. So I think it’s a mix-and-match strategy, depending on what works best in a particular context. But I think relying on open source methods entirely would not be a good idea, because oftentimes it doesn’t necessarily get you what you’re looking for. It gets you somewhere, but it doesn’t address the problem the way you would want it addressed, which is why I believe that, in terms of skill set, it’s important to understand how to actually map the problem domain to the math of how all this works.

Ram: Because I think sometimes there is the urge to just use open source directly, you know, just as a recipe book. I mean, that doesn’t always work.

Curtis: Sure. Yeah, that’s interesting. And what kind of languages are you using, and packages and stuff? Just the most important ones that you work with.

Ram: Sure. So I think Python is used a lot in terms of running different algorithms, or also writing algorithms on the fly. These are personal preferences, of course. Some individuals prefer to use R. Eventually, when we deploy, it could go either way. If you’re deploying, let’s say, a system for real-time use, often we tend to go in the Python direction and, you know, deploy that on a big data platform. But if we are creating tools where a manager might just use a dashboard, then there is a preference towards, let’s say, using something like Shiny, for example, in R, and building that around the core algorithm. So I would say that it really depends on context. For very real-time systems where the algorithms are deployed in real time, I think Python tends to be used a lot more. I think R for a lot more of the managerial dashboards.

Curtis: You had also mentioned that the last mile of this problem, last-mile routing, is one of the hardest things to do. Can you give us some context around why that is so difficult?

Ram: Right, it goes back to the classic trade-offs in transportation. Now, if you think about the last mile, if you’re transporting products to consumers in their homes, for instance, they’re very fragmented, which means that the number of touch points that you have to actually meet is quite high. But transportation thrives on economies of scale, which means that you really want to fill up a van completely, ideally speaking, and be able to take that van from point A to point B and unload it, ideally at just point B, or at least minimize the number of stops. Every stop you add is going to add to the cost. The second thing is you may not necessarily have that many stops in a particular geography at a particular given time. So we’re not just worried about geographies. We’re also concerned about time windows.

Ram: So if you’re looking at a time window between 2:00 and 3:00 PM and looking at a particular zip code, your ideal scenario would be to have enough demand that you can fill up a van, move that van to that zip code between 2:00 and 3:00 PM, make multiple stops, and unload everything. That doesn’t quite happen, right? Because obviously demand tends to be highly variable across different geographies. The density of demand is not necessarily the same everywhere. You also do not necessarily have it at the same time, right? So the key then is to really identify those patterns, right? Where those demand spikes actually occur, and tailor your entire delivery network to those spikes effectively. Right? When those spikes don’t happen, you have a different challenge, which is that you still have to serve the consumer, which means that maybe I need to have my inventory placed differently.
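A rough Python sketch of the density argument above: count orders per zip code and delivery hour, then see how the fixed cost of sending a van is spread over the stops in each bucket. The cost figures, van capacity, zip codes, and function names are hypothetical, chosen only to make the economies of scale visible.

```python
from collections import Counter

# All three constants are assumptions for illustration.
VAN_CAPACITY = 40          # parcels one van can carry
FIXED_ROUTE_COST = 120.0   # cost of dispatching one van
COST_PER_STOP = 2.5        # marginal cost of each additional stop


def demand_by_window(orders):
    """Count orders per (zip_code, delivery_hour) bucket."""
    return Counter((o["zip"], o["hour"]) for o in orders)


def cost_per_delivery(n_orders):
    """Average cost per parcel when n_orders stops share the vans in a bucket."""
    if n_orders == 0:
        return None
    vans = -(-n_orders // VAN_CAPACITY)  # ceiling division
    return (vans * FIXED_ROUTE_COST + n_orders * COST_PER_STOP) / n_orders


# Hypothetical orders for the 2:00 to 3:00 PM window in two zip codes.
orders = (
    [{"zip": "95050", "hour": 14}] * 36
    + [{"zip": "95051", "hour": 14}] * 4
)

for bucket, n in sorted(demand_by_window(orders).items()):
    print(bucket, n, round(cost_per_delivery(n), 2))
```

Under these made-up numbers the dense bucket costs under 6 per parcel while the sparse one costs more than 30, which is the gap that demand-spike forecasting and inventory placement are trying to close.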

Ram: So, which means you also are thinking, you know, I’m not just going to move it from, let’s say, the warehouse to the zip code between two and three. I really want to have inventory for certain products placed much, much closer to the consumer and be able to move those things faster as and when required. And so then that leads to this really complex problem of how do I identify these demand patterns and match my delivery network to them, and in fact also match my entire inventory placement to that, right? Which means it also now becomes an inventory problem. So which units of which product should I stock, let’s say, within 10 miles versus, let’s say, within 30 miles? And so what you’re really seeing, if you think about e-commerce and last mile, is this constant push toward placing inventory closer and closer to the end consumer. But you cannot place all products there. So you need to pick and choose depending on demand patterns. So you’re going to start seeing not just big warehouses outside of major areas, but you’re also going to start seeing smaller stocking points instead of retail stores. You’re going to have stocking points emerge, let’s say, in different areas, right, which are going to be smaller, but places where you can actually pull inventory out very quickly.
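The pick-and-choose step can be illustrated with a deliberately simple greedy heuristic in Python: rank products by expected local demand per unit of shelf space and fill a small nearby stocking point until it is full. This is an assumption-laden sketch, not an optimal inventory model and not the approach described in the interview; the SKUs, demand figures, and the function name `pick_nearby_stock` are invented for the example.

```python
def pick_nearby_stock(products, capacity_units):
    """Greedy heuristic: stock the products with the highest expected local
    demand per unit of space until the stocking point is full."""
    ranked = sorted(
        products,
        key=lambda p: p["daily_demand"] / p["space"],
        reverse=True,
    )
    chosen, used = [], 0
    for p in ranked:
        if used + p["space"] <= capacity_units:
            chosen.append(p["sku"])
            used += p["space"]
    return chosen


# Hypothetical catalog: demand is per day within the local area,
# space is in arbitrary shelf units.
products = [
    {"sku": "phone-charger", "daily_demand": 30, "space": 1},
    {"sku": "diapers", "daily_demand": 25, "space": 4},
    {"sku": "office-chair", "daily_demand": 2, "space": 20},
    {"sku": "batteries", "daily_demand": 18, "space": 1},
]

print(pick_nearby_stock(products, capacity_units=8))
# ['phone-charger', 'batteries', 'diapers']
```

Small, fast-moving items end up close to the consumer, while the bulky, slow-moving chair stays in the larger warehouse farther out.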

Curtis: You’ve mentioned Amazon a little bit and the fact that now, because they’ve set the bar, all these other companies have to measure up to it, but the economics are difficult there, right? They don’t have the resources Amazon has, or these warehouses that Amazon is building everywhere. How do they compete?

Ram: That’s a great question, and I always say that I consider Amazon to be the equivalent of the iPhone for e-commerce.

Curtis: Got it.

Ram: What I mean by that is that Amazon wants to control every aspect of its supply chain. There was a time, about eight years ago, not too long ago, when Amazon did not have any investment in transportation of any kind. They would completely depend on FedEx, UPS, USPS, and so on to move product from their warehouses to the end consumers. Today, come 2019, they have about 40 planes in Amazon Air. They are trying to buy like 20,000 Mercedes-Benz vans. They are setting up delivery partners who will actually drive these vans. They of course own their own warehouses, so they can decide where to put a warehouse.

So it’s very centrally orchestrated. You know, they used to depend on external delivery partners; today they’re completely out of FedEx. Right? So they don’t use FedEx at all to move anything. And eventually they’re going to rely almost entirely on their own transportation network to move stuff. The reason they’re doing that is because this gives them an incredible amount of control over that entire network, and they can really design and optimize the network effectively. Now the question that you asked is, how does a small e-commerce player compete with that? Because you’re talking about the muscle of Amazon and the infrastructure. I would say the answer to the iPhone is the Android, right? So the equivalent of that would be the Android of e-commerce. What do I mean by that? What I mean is that it has to be a collaborative play across different players in the ecosystem.

Now there are some, like the Walmarts and Targets of the world, which are obviously going to be able to put together infrastructure on their own, but if you look at anyone outside of the top three or four retailers, I think they’re not going to be able to do the same thing. What they’re going to have to do is leverage third-party logistics and third-party warehouses, right? And you’re already seeing some movement in the space. There are companies like Flexe, which is based out of Seattle and is basically a warehouse coordinator. They call themselves Uber for warehousing. The idea being: what’s the best way to get two-day delivery? Have a warehouse close to the consumer. If you’re a small e-commerce player, you’re not going to be able to build a network of warehouses. What do you rely on? You rely on third-party warehouses. But how do I get there?

I get there, you know, through apps and tools that give me visibility into which third-party warehouse can actually place my product. Right? The same thing goes for carriers as well. How do I leverage an entire portfolio of carriers? Let’s talk about not just FedEx and UPS, but regional carriers like OnTrac and maybe even last-mile players like DoorDash, who are all getting into the local delivery space. How do you leverage all of those? You develop partnerships with them, but you don’t deal with them directly. You have middleman logistics players who are increasingly getting more tech savvy. CH Robinson, the company I currently work for, is moving in that direction as well, you know, creating apps that can actually coordinate between different parties. And so essentially you don’t build a network on your own; you just leverage the different third-party networks. The key here, again, is technology and data and being able to orchestrate all of that.
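One way to picture orchestrating a portfolio of carriers is a small selection rule in Python: among the carriers whose 95th-percentile transit time meets the delivery promise, pick the cheapest. The carrier names, costs, and transit times below are placeholders rather than real quotes, and `cheapest_feasible_carrier` is a hypothetical helper, not part of any real logistics API.

```python
def cheapest_feasible_carrier(quotes, promised_days):
    """Return the lowest-cost carrier whose 95th-percentile transit time
    meets the delivery promise, or None if no carrier qualifies."""
    feasible = [q for q in quotes if q["p95_days"] <= promised_days]
    if not feasible:
        return None
    return min(feasible, key=lambda q: q["cost"])["carrier"]


# Hypothetical quotes for one shipment; names and numbers are made up.
quotes = [
    {"carrier": "national_parcel", "cost": 11.50, "p95_days": 2.0},
    {"carrier": "regional_carrier", "cost": 8.75, "p95_days": 2.0},
    {"carrier": "gig_courier", "cost": 14.00, "p95_days": 0.5},
    {"carrier": "economy_ground", "cost": 6.20, "p95_days": 5.0},
]

print(cheapest_feasible_carrier(quotes, promised_days=2))     # regional_carrier
print(cheapest_feasible_carrier(quotes, promised_days=1))     # gig_courier
print(cheapest_feasible_carrier(quotes, promised_days=0.25))  # None
```

Filtering on a high-percentile transit time rather than the average ties the carrier choice back to the service-level point made earlier in the conversation.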

Curtis: Putting that all together with some sort of glue, right, so it all works.

Ram: Yeah, absolutely.

Curtis: Yeah, that’s a huge, huge undertaking. But that’s interesting. Now, we’ve talked a little bit about the economics here and the space and things like this, and a little bit about the models that you’re using. How accurate are these methods? Is there still a big push to be more accurate, or are we, you know, good enough? When we order something from Amazon, usually it comes on time; sometimes it doesn’t. Is that telling us that the algorithms are getting to the place where they’re good enough to meet the business needs of these problems, or is there still a long way to go?

Ram: I would say the algorithms are getting fairly good at this point. The cost you pay for not being good enough is basically having to maybe have inventory located even closer to the consumer, or have transportation modes that are faster. So these are all, you know, costs. And so obviously the price of inaccuracy is a higher cost, I guess, but I think the algorithms are getting there. The algorithms that we have are pretty good at predicting with data of this kind. The issue really has not been the state of algorithms as much as the availability of data, to be honest. In many business problems, I find that it’s not the accuracy of the algorithms. I mean, when you talk about, let’s say, image processing, you know, identifying whether an image is that of a cat or a dog, I think that’s where the accuracy gains have really had an impact.

Right? I mean, I think using more deep learning has had a big impact there. If you look at the traditional industries, and when I say traditional, I mean moving product is a very traditional industry. We’re doing it in very innovative ways, but I think if you look at it, the real reason why innovation has stalled is really the lack of availability of data. I don’t see algorithms as a big bottleneck, to be honest. And even today, I would say the greatest challenges I face are not about the ability to run the algorithm. The algorithms exist. It’s really about not having the data in the right place, not having the data at the right cadence, not having data of the right kind, simply because it wasn’t collected that way. It’s not because it’s not there; it just wasn’t collected in that way. The system was set up, let’s say, 10 years ago, and, you know, people had a different type of process then, and so on. Right? And I would say that would be a bigger challenge when it comes to supply chains in general.

Curtis: Got it. So if you’re a business trying to nail this, is the bigger problem then a data pipeline issue, and maybe management of how you collect and what you choose to collect, and these kinds of questions?

Ram: Absolutely. I would say that the number one challenge for anyone wanting to do this right would be to look at the end goal and work backwards from there. You may not need a lot of data collection to get the operation up and running, but if you were to have some foresight and say, okay, we’re going to be able to use this data in so many different ways to provide so many different kinds of services, when you start thinking about it from that direction, then the kind of data you would need to collect to get there would be a larger set, or maybe a different kind. And I think that focus on the future is important right from the beginning, even when you’re putting an operation in place. When you put an operation in place with a system that doesn’t look at that future, often you’re then stuck with a system that is not collecting the right data. Right? And then upgrading the system is going to take a lot of money. So I think that’s where the real challenge is, to be honest, and I think that’s the first step. There’s still some creativity required in being able to stitch those algorithms together to execute. But I think an emphasis on collecting the right data is very important.

Curtis: Yeah, that’s very interesting. It’s, it’s a tough thing to do, right? To sit down and take time to plan it out before you jump in. But like you’re saying, it’s, it’s super important. You’ve worked on this a lot and you know how to effectively approach optimization problems, which is what you’re working on. This gets back to the algorithms, but just from your perspective, how do you approach that problem and what do you think about, how do you set it up?

Ram: The way I have always approached these problems is that it’s very, very important to understand what the true business problem is, and this is usually very hard, you know, as compared to, let’s say, I’ll give you the example of image recognition. I think there it’s easy to define the problem; it’s much harder to solve it. Here, often we don’t even know whether we’re solving the right problem, and to figure that out, talking to our business users and consumers is, I think, critical: understanding what the real problem is. I spend a lot of time just trying to understand, if we implemented a data science or AI-focused approach to solving this problem, what is going to be the ROI here? I like to have that business conversation right up front, because if you don’t have that, then I think we can all run some algorithms and come up with some outcomes.

Often I find that people create tools that use a fancy algorithm, but the business user does not really understand how to use them. And then, you know, the tool remains unused, and you’re not collecting any data or feedback. So I think it’s very important, before even embarking on something, to understand what the business problem is and what ROI we want to deliver, right at the beginning. And then of course, in terms of framing the optimization problem, I mean, I could be very biased here because, you know, I have Ph.D.-level training, but I really think fidelity of the math to the business problem is very important, which is why, before I even touch any kind of open source packages and start running algorithms on the fly, I really try to come up with an accurate mathematical representation of the problem we’re trying to solve.

So there’s a business problem, then there’s a mathematical representation, and then we get to really, you know, wanting to optimize stuff, right? I mean, let’s get that right first: understand the business workflow, get the math right, and then really start doing the coding. So the coding really is step three. Of course, you want to do some coding to do some exploration with the data; that’s obviously going to be there. But at the same time, you know, I don’t like building algorithms out in production without actually having understood what we are trying to do.

Extro: A huge thank you to Ram Bala for chatting with us. For attributions, you can head over to datacrunchcorp.com to our show notes.

Attributions

Music

“Loopster” Kevin MacLeod (incompetech.com)

Licensed under Creative Commons: By Attribution 3.0 License

http://creativecommons.org/licenses/by/3.0/