Robin Purohit: There’s lots of signals in this data that indicates whether a person or a piece of content is highly valued by the user community. And that’s really invaluable when you are gonna combine that with that kind of domain focus on the IT industry.
Ginette Methot: I’m Ginette,
Curtis Seare: and I’m Curtis,
Ginette: and you are listening to Data Crunch,
Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world.
Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics, training, and consulting company.
Today we chat with Robin Purohit, CEO and co-founder of Peritus.ai.
Robin Purohit: I’m Robin Purohit. Born and raised in Canada. Been in the tech industry for three decades, believe it or not. And in the Valley, Silicon Valley, since the late nineties, and in my career, I’ve kind of been an inventor.
I’ve been a product marketeer and growth executive, and I’ve been a globe-trotting corporate executive running very large multi-billion dollar software portfolios. But this is actually my fourth startup.
Curtis: You’ve been through three so far, and, I mean, startups are hard. Right? Is it just that you like the punishment, or maybe you’ve figured something out that most people haven’t?
Robin: Yeah. I know. I think I’m always really attracted to times of change. Right? So in my professional career, I kind of have an instinct for when the market and customer needs are changing really quickly and technology is starting to mature. Things get really fun, and in those days, you want to be part of a startup because there’s no rules, and you can kind of take advantage of these massive changes in the market.
If you’re lucky you can be very successful. Yeah. It takes a little bit of luck and timing, but I’m also the creative guy. So that’s where I thrive.
Curtis: You’ve got a long history in the tech sector. I’m assuming all of your startups have been in the tech sector. Can you tell us a little bit about how you watch those trends, what you’ve seen and, how you’re, “oh, yeah. Like I’m going to jump on this, on this wave here and create something.” What, what have you seen, what’s been your experience?
Robin: Yeah, there’s basically three kind of phases of my career. In the first one, I focused on connecting things at really high speed. And then in the second phase of my career, I focused on making all these new distributed systems scale.
And then in the third phase of my career, I focused on making all that stuff work as people needed to professionalize the world of IT. Right? And so in each of those phases, there’s inflection points that I saw that made it fun to do a startup. So for my first one, I kind of left a comfortable career at a big networking company to use fiber optics to connect things together much, much faster than ever before.
And, you know, that gave birth to the backbone of both the Internet as well as data center communications at very, very high speed. More recently, I’ve been kind of fascinated by data, you know, and how do you help businesses take advantage of data to allow humans to do amazing things. And that’s what I really think the enterprise AI market’s going to be about for the next 10 years.
Curtis: As someone who is watching these trends and things like that, what kinds of things are you seeing in data specifically? ‘Cause I love the startup scene and the data scene; obviously it is sort of the nature of the podcast. Like, what are you seeing now that is impelling you to this next startup?
Robin: Well, it’s kind of an interesting parallel to what we saw about 10 years ago with the big data movement. Right? So with the big data movement, 10 years ago, there was an amazing amount of fast-maturing open source technologies to enable whole new ways of storing and accessing data for new kinds of applications. And it was happening at a time where I think there was a lot of envy in businesses watching this latest generation of consumer tech companies and the amazing stuff they were doing and trying to create their own data systems that could scale and do similar things.
And so, you know, and we’ll talk a little bit more, there’s a whole kind of wave of adoption that happens when such inflection points happen, a lot of wasted money, and then things really take hold and start to scale very quickly. The era we’re in now is that that data infrastructure is largely there. Now people are very comfortable putting data in multiple different kinds of repositories in clouds.
And so now the challenge is how do you connect all that data together and allow, you know, a human to do something new and something really innovative? I think what’s going to happen is we’ll move away from these siloed workflow applications to some new kind of user experience that allows, you know, a salesperson, an engineer, or a customer support person to be like 10X more productive than they are today.
Curtis: How do you imagine that UI working or, or tool working? I know people have dabbled in that, right? Like how do we, how do we get AI or analytics into the workflows of specific people doing specific things? So what’s your, what’s your vision of that?
Robin: Yeah. Well, I think the main thing is to think small. You know, you want to think small before you get overly ambitious. With all these technologies, there’s always a lot of dreams and fantasies of how magical they can be.
And they usually lead to a lot of over-investment. With enterprise AI, what I’ve learned this last couple of years is that you need two things. You’ve got to get two things right. One, you need a lot of data. And that data needs to be, ah, fairly high quality in order to train models and make predictions. And you shouldn’t underestimate the challenge of doing that.
The second thing you need to do is have a lot of human interaction because the magical algorithms are still evolving. Although the technology is pretty amazing, it’s not really where people’s expectations are, where you can do an AI project and all of a sudden everything is automated. It needs a lot of human intervention and human guidance to tell the algorithms whether they’re right or wrong, whether they’re getting warmer or cooler.
And only by a lot of that human exercise do things get to the critical point where it becomes useful. So those things you’ve got to get right. You’ve got to find a use case where a lot of people want to use something. And you have to spend a lot of time getting the data to be high quality enough to make predictions.
And in the middle of that, there’s a whole bunch of experimentation, which is where the smart tech companies and engineers are spending their time.
Curtis: Well, let’s talk about that, ‘cause I know that’s a struggle lots of people go through, particularly the first part you said, the quality data, which rightfully shouldn’t be underestimated.
That’s kind of the boring part, but it’s foundational. Right? So I don’t know if you have some experiences or stories or, or maybe just advice or things you’ve seen, but how, how do you approach that part and get it right so that everything else works?
Robin: Yeah, I think, you know, the thing to acknowledge is it’s not like the old days of master data management and BI, where things were highly structured.
The data that we need to tap into for AI is kind of everywhere. It’s in very structured systems, probably many generations of structured systems, by the way. It’s in highly unstructured databases that have exploded the last 10 years. And there’s just a lot of content and conversations, right? So what you need to do is to create a data pipeline, to take all those things and have some unifying theory that makes sense of it and connects it together.
And the enabling technology that we use, and a lot of the industry’s using, is called a knowledge graph, a technology that’s been popularized by Facebook and Google and internet applications. And it’s very, very powerful to connect all those different sources of conversation, unstructured content, and structured content in a way that now you can create some new value. You don’t have to go back and clean up every data repository for years and years. You just have to have this at-scale data processing machine with this new kind of technology that enables you to kind of create insights and recommendations.
Curtis: And then once you have that, of course, then, then you’re talking about kind of this human-in-the-loop concept or people being highly involved, right, in tuning these things and making sure they’re working for these really small niche use cases. Can you talk a little bit about that and maybe how you’d approach that or how, how that’s done correctly?
Robin: Yeah, I think, you know, there’s a lot of buzz about chatbots nowadays, about how to use chatbots to take extremely high-volume, known conversations and automate them.
I think that’s kind of interesting, but it’s a fairly tactical use case, mainly because I don’t think it addresses the human potential. I don’t know how you feel, but I hate talking to chatbots when I go on a customer service site. It immediately tells me, “Oh, this company doesn’t want to talk to me.”
Curtis: Sure. Yeah. I mean the chatbot, you know, they rarely get it right.
Robin: Yeah. Yeah, the most amazing chatbots, you know, help me find something and then connect me to something that I really, somebody I really want to talk to. And after that, you know, I’m off to the races. So I’m more interested in, in more advanced things. So when people, when engineers are trying to solve problems every day to run IT systems or build better software, salespeople are trying to navigate a lot of different pieces of information to navigate their sales process and take, figure out what the right set of next steps are.
Those are high cognitive load tasks that require experts doing things, you know, based on their personal experience. So what I believe the next opportunity is, is really bringing them recommendations or advice saying, “Hey, here’s what we’re seeing. Here’s what we think would be useful for you while you’re doing this particular task that can help you be better.” So don’t overpromise that, “Hey, we’re trying to automate your job.” No, “we’re going to help you do better at your job by giving you advice and insights while you’re doing your work, wherever you’re doing it.” Right? So you want to embed these insights in whatever tool or interface that human happens to be using at the time.
Curtis: And are you talking in this case, uh, mainly about engineers, like software engineers, those kinds of people? Or sort of across the board, anyone trying to get something done you’d want to be able to . . .
Robin: Yeah, I think the market opportunity is for any business user, right? So it could be a salesperson. It could be a financial analyst. It could be a support or service person. We in particular are focused on the tech market.
No surprise given my career. That’s what I’m passionate about. I kind of spent my career traveling around the world, getting yelled at by a lot of customers who were unhappy with using my products because people cannot understand them, or can’t take advantage of the latest and greatest release or features.
So I’ve seen the gap myself in the things we create and how difficult it is for people to use that value quickly. And so we’re kind of soaking in that problem. How do we help, you know, developers, cloud operations teams, uh, tech support people be dramatically more effective by giving them advice while they’re trying to build and release and operate advanced technology?
Curtis: It’s an interesting problem space. Everybody who is a developer knows, you know, they’re Googling half of the stuff they’re trying to do ’cause like there’s so much knowledge out there. No one can have it all in their head. Right? But sometimes you don’t even know what to Google for or like, how do I solve this problem?
Like, what are the terms I even use? Or what do I look for? Is that the space you’re trying to make more effective, or is it broader than that?
Robin: Yeah, well, we kind of have a longer-term vision of, uh, but you know, again, where we’re starting right now is on helping people get advice on cloud native technologies in particular, which is kind of the hot thing on the planet right now.
And so there’s so much need, and there’s so much change that’s happening very fast, that it’s a great place for us to add value. So, going back to your description, you’re absolutely right. Most technical people, when they have a problem and are trying to learn something, they go to Google, you know, and they try to master keyword search and they get a lot of stuff back. Right?
And you know, maybe they’ve kind of cracked the code on exactly the right keywords to get their answer, but most of the time they do not. And so the natural place for them to go to is a place where experts are congregating, or these big community forums. The kind of big daddy of all of them is Stack Overflow, which is, I think, now the fifth most operative website in the world.
And so if you’re a business person listening to this, you might not know of it, but all your technical people do. There’s 125 million monthly active users that come to Stack Overflow, looking to get answers to technical questions that range from programming languages to, you know, using the latest and greatest cloud native technologies like Kubernetes or whatever variant of, you know, database, open-source database, we’re using, right?
So that’s where people are going. And so what we see is the potential to help people answer questions faster on these massive public communities. These spaces are great gathering places that attract all the best minds. But what we’ve seen is that there’s not a lot of things out there to help people answer those questions faster. And so that’s where we think we can initially help in our first offering.
Curtis: Tell me maybe a little bit about how that works. How are you making those recommendations, like how do you even approach that? It’s, it’s a multi-faceted problem. How do you, how do you do it? How do you make it work?
Robin: Well, again, going back to our initial conversation, the first thing to realize is that the data is everywhere.
So let’s take a topic like Kubernetes, which is, you know, very, very hot right now. What we do is we look for every place that there’s, you know, active conversation on Kubernetes, whether it’s Stack Overflow or, you know, the 15 other people that are creating distributions and product variants of Kubernetes.
So we kind of take all those conversations. We take all the published content around those products, in whatever form it might be, might be a document, a blog post, a knowledge article, et cetera. And we actually, you know, just go in and start training our models, looking at all the Q&A history or conversation history on these forums to have some initial value on making recommendations through request.
So that pre-training step is super important because, going back to what we talked about before, we take all these distributed sources of information, and then we apply knowledge graph technology to interpret those pieces of information, to make recommendations on an answer to a new kind of question. And, you know, we’re not right a hundred percent of the time, but we can certainly be a lot better than a Google search, especially for a more nuanced question where people think of these questions in terms of narrative, not keywords, usually. And so we want people to type their question in more natural language, and then we’ll give them a recommendation saying, “Hey, here’s a ranked set of things that look like they’re most appropriate to the problem you’re trying to solve.”
Curtis: So essentially, I mean, what you’re talking about is something where someone can, much like Google, they type in a question, but it’s, it’s so targeted to the development space and it takes the whole phrase into account when it’s, it’s searching for like, “okay, here’s the answer. Here’s something you need to check.” Is that right?
Robin: Yeah. We’re kind of soaking in kind of the IT domain. So that means we’re building up a set of language and techniques that interprets the intent of the question and can extract the best value of all this kind of structured and unstructured content that we’ve ingested to make, uh, to make a recommendation.
Curtis: That’s great. I mean, there’s lots of questions there we could dive into. One off the top of my head is how do you determine the quality of the source? Right? So like a lot of times these are really technical things, lots of technical code. How do you determine like, “Hey, this guy knows what he’s talking about and this guy doesn’t,” you know, like that’s a hard problem.
Robin: That’s a great point. And I can . . . the contrast for search versus what we’re trying to do with AI. You know, search is a, is a big horizontal platform. Google search, right? It doesn’t score what is trusted or untrusted content because it’s going, trying to develop things at scale or deliver answers at scale. We look at all of the ratings of the people that are posting answers.
We look at what answers have been voted or upvoted or have resolved a question, and we use various techniques to bubble those answers up when we find a similar match. So there’s lots of signals in this data that indicates whether a person or a piece of content is highly valued by the user community.
And that’s really invaluable when you’re gonna combine that with that kind of domain focus on the IT industry.
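Illustration (not from the episode): the idea of bubbling up trusted answers can be sketched roughly as follows. This is not Peritus.ai’s actual method; the Answer fields, the weights, and the rank_answers function are hypothetical names chosen for this example, and the similarity score is assumed to be computed elsewhere.

```python
import math
from dataclasses import dataclass


@dataclass
class Answer:
    text: str
    similarity: float        # assumed precomputed 0..1 match to the new question
    upvotes: int             # community votes on the answer
    accepted: bool           # whether the answer resolved its question
    author_reputation: int   # rating of the person who posted it


def trust_score(answer: Answer) -> float:
    """Blend community signals into one number (weights are illustrative assumptions)."""
    vote_signal = math.log1p(max(answer.upvotes, 0))
    reputation_signal = math.log10(max(answer.author_reputation, 1))
    accepted_bonus = 1.0 if answer.accepted else 0.0
    return 0.5 * vote_signal + 0.3 * reputation_signal + accepted_bonus


def rank_answers(answers: list[Answer]) -> list[Answer]:
    """Order answers by similarity scaled by (1 + trust), highest first."""
    return sorted(
        answers,
        key=lambda a: a.similarity * (1.0 + trust_score(a)),
        reverse=True,
    )
```

Multiplying by similarity keeps a highly upvoted but off-topic answer from outranking a relevant one; the logarithms damp very large vote and reputation counts.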
Curtis: So if I’m searching on the tool to find something, you know, as a developer, I can kind of trust you’re picking up signals to say, “Yeah, this is a good answer,” you know, and filter out the rest.
Robin: Correct. In fact, one of the cool features that we’ve kind of added in, and I think is an indication of where we need to go with our offering, is you can actually, when you’re typing in a question or even looking at a question on Stack Overflow, you can really identify who the best experts are on that particular topic.
So you can filter results by individual experts if you happen to know them or really trust what they have to say, and creating that kind of personal connection to thought leaders or, you know, technical influencers is extraordinarily valuable, especially if you don’t know that person personally.
Curtis: Right, yeah. Yeah. Which is, you know, you can’t know everybody. Right. And, um, and then how do you guys, you know, so we’ve talked about quality, and I’m curious about how you went about kind of training this algorithm. Right? I mean, like you said, a lot of times when you’re training something like this, there’s lots of human involvement trying to figure out like, “Yeah, like that was a good one. That was a bad one.” Not necessarily even in terms of quality, but just in terms of relevance. Did you find that . . . I mean, did you have a lot of technical people sort of vetting your results and giving feedback to tune this thing, or how did you approach that?
Robin: Yeah, we do. We have basically our own algorithmic process that does evaluations using all the typical machine learning techniques to vet whether our initial recommendations are good enough to put in a human’s hands or to show to a human. And then we drive programs to kind of surface these recommendations in a very easy way, so as you’re looking at a question, the recommendations pop up, and as people answer or click on something that’s relevant, each time they do that, it’s a signal back to us that we’re warmer or cooler, and that gets incorporated into our prediction algorithms. So . . .
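Illustration (not from the episode): one simple way to turn “warmer or cooler” click signals into a re-ranking is a smoothed click-through estimate per recommendation. This is only a sketch under assumptions; the FeedbackTracker class, its method names, and the prior values are hypothetical, and the episode does not say which technique Peritus.ai actually uses.

```python
from collections import defaultdict


class FeedbackTracker:
    """Tracks how often each recommendation was shown and how often it was clicked."""

    def __init__(self, prior_clicks: float = 1.0, prior_shows: float = 2.0):
        # Assumed smoothing priors so an unseen item starts at 0.5 usefulness.
        self.prior_clicks = prior_clicks
        self.prior_shows = prior_shows
        self.shows = defaultdict(int)
        self.clicks = defaultdict(int)

    def record(self, item_id: str, clicked: bool) -> None:
        """Log one impression; a click is the 'warmer' signal, no click is 'cooler'."""
        self.shows[item_id] += 1
        if clicked:
            self.clicks[item_id] += 1

    def usefulness(self, item_id: str) -> float:
        """Smoothed fraction of impressions that led to a click."""
        return (self.clicks[item_id] + self.prior_clicks) / (
            self.shows[item_id] + self.prior_shows
        )

    def rerank(self, scored: dict[str, float]) -> list[str]:
        """Re-order model scores by multiplying each with its observed usefulness."""
        return sorted(scored, key=lambda i: scored[i] * self.usefulness(i), reverse=True)
```

With this kind of loop, an item the model scored highly but that users keep ignoring gradually drops below one that people actually click.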
Curtis: That’s great.
Robin: We talked about Stack Overflow here, but it’s kind of what we think can be generalized. So imagine you’re working in your own IT support organization. And you’re doing a bunch of cloud native deployments and, you know, you’re struggling with Kubernetes like every one of my teams has because Kubernetes is not easy to master and there’s lots of variants of it that run differently on different environments. So as you’re discussing those questions, you know, what we want to do next is then pop up and say, “Hey, based on the discussion your team is having, here’s some advice that can help you move forward or solve that problem.”
So think of it as kind of your ongoing assistant that’s getting smarter and smarter based on all these conversations happening, you know, on public forums, that helps your team embrace these new technologies and be more successful.
Curtis: Got it. So is it the kind of thing that, much like, I don’t know, like on Netflix when it knows what I like to watch and it serves that up or whatever, will this then . . . obviously it tunes your algorithm as a whole, but is it also targeted to the specific user to say, “Hey, you usually search for this kind of stuff. Here’s some new content,” or whatever?
Robin: Yeah, it’s kind of a hot term called a recommendation engine in machine learning nowadays, and something that we’re using to describe our technology and Nvidia uses a lot, for example, to describe the next generation of applications.
Typically recommendation engines have been deployed in B2C markets, you know, like Netflix, like the Amazon shopping assistant or Honey, that is watching you while you work or watching what you watch and deciding what ad to serve, what movie or music you should listen to next, or, you know, a better place to buy what they think you’re looking for. Right?
And, you know, the benefit that B2C AI has is that there’s an enormous number of user signals that they can tap into. Some of which makes people like me a little uncomfortable, but, but, but, you know, it’s actually fairly straightforward to get a lot of user signals now with the digitization that’s happened in the B2C world, right?
So it’s kind of an embarrassment of riches. What’s different about the enterprise market and these kind of other problems we’re talking about is that, first of all, the decisions that you’re making on recommendations are far more critical, right? You’re not just buying your next widget or your next album, which is a great thing personally. You know, your professional reputation might be impacted by the decision that you made, or your success in your job. So the stakes are higher, first of all, and secondly, you just don’t have the same scale of user interaction. So what’s really different as recommendations cross over to the enterprise is that finding some way to extract the initial value out of that data as an organizing principle, and finding some specific way that people will use those recommendations at scale to get that virtuous circle of human in the loop moving, is super, super important.
Curtis: Got it. That’s great. I want to give you maybe the last word. If there’s stuff we’ve missed that you think is important that you wanted to discuss.
Robin: Yeah. Two things to close. One is, you know, I think the era we’re going into now of enterprise AI has a lot of parallels to what happened with the ERP and CRM market. Right? So think back to when all the kind of form and workflow software was starting to emerge with the promise of automating the back office. There was tremendous enthusiasm about it. A lot of money got spent and a lot of money got wasted, quite frankly, during that kind of peak, ah, hype cycle. And then what happened, as we kind of went through the trough of disillusionment, is a company like Salesforce popped up and said, “Hey, we have a very easy way to consume, you know, a very specific part of that equation over the cloud that makes people more effective in selling stuff.”
Curtis: Right.
Robin: And they were just like a laser in the early days on establishing that model. And I think we’re kind of in a similar stage with enterprise AI. There’s a lot of enthusiasm and therefore a lot of money being invested and, in some cases, maybe some wasted money being invested.
So this year, I think, is a year where we’re calibrating back to those things which are easy to consume, have some near-term ROI, and benefit humans. So we can build on that enthusiasm in a more incremental stairstep way, and then things will accelerate exponentially from there. So I think that’s really what this year is going to be about. For us, you know, what we talked about today is helping people drive their reputation in communities on places like Stack Overflow.
You can actually download our free version of the product on the, you know, Peritus.ai website today. And that’s kind of like the first hit single of what we hope will be a very successful platinum album someday. And there’s many things that we’ll be announcing over the next year along that same theme.
Curtis: Got it. Okay. So it sounds like you guys are moving fast. Just again in closing here, and I’m curious, is there anything that you could share in terms of where you’re taking the company and what the plans are?
Robin: Yeah. Well, the long-term vision is to create a coaching network for all these tech people. So you think of people in the community, people building products, the legions of DevOps engineers, and people running cloud. We want to connect all those people together using our AI capabilities, so everybody can kind of move faster and be more successful.
Curtis: Yeah. That’s all. I mean, making the, making the dev sector more efficient is, I mean, all boats rise in that case. Right. I mean, it touches so many things in our society, so that’s, that’s great.
Robin: I fundamentally believe that’s the best career track for, uh, for people that are looking for employment, especially as we look to, kind of, up-level and recover in our global economy, you know. Hopefully our part in that industry is to get more people to participate in the tech sector by kind of closing the skills gap that they need. That would be our human aspiration for the company.
Ginette: A huge thank you to Robin Purohit, and as always, you’ll find your transcripts and attributions at datacrunchcorp.com/podcast.
Attributions
Music
“Loopster” Kevin MacLeod (incompetech.com)
Licensed under Creative Commons: By Attribution 3.0 License