Building a Machine Learning Company that Decodes Web Analytics, with Per Damgaard

Per: The most important thing is to have an AI-enabled infrastructure. It sounds very boring, but that was the learning that I got from the bank as well. It’s actually very easy for us to build the model, but what took a long time was to have the AI infrastructure that enables us to do so.

Ginette: I’m Ginette.

Curtis: And I’m Curtis.

Ginette: And you are listening to Data Crunch.

Curtis: A podcast about how data and prediction shape our world.

Ginette: A Vault Analytics production.

Ginette: Before we get into this episode, let’s bring you behind the scenes at Data Crunch. We’re going to show you what we’ve learned about your tastes so far.

According to the podcast analytics, which are still rudimentary and can only tell us so much, you really liked our last episode with DataOps. You also enjoyed the “No PhD Necessary” episode, the “How Artificial Intelligence Might Change Your World” episode. Almost all of you have loved the history of data science series. In fact, the third one in the series is our most popular episode in terms of how much of the show you listen to. But in terms of sheer listening numbers, the Hilary Mason episode, titled “The Complex World of Data Scientists and Black-Box Algorithms,” tops our charts, with the Ran Levi episode, titled “Deep Learning—A Powerful Tool with a Name that Means Nothing,” coming in second place. What this seems to tell us is you like interesting data history, you like interesting projections into the future, and you like learning practical ways you can be successful with data projects.

But since the podcast analytics are still rudimentary, we want to hear if our conclusions are correct. So if you want to steer our future seasons, let us know what you want to hear more about by filling out a short survey. Just go to datacrunchpodcast.com/survey, and we would love to hear from you!

Today we talk to the cofounder and CEO of a Danish company that employs machine learning to gather insights on what content on your website leads people to take action. If you’re looking into building a company using artificial intelligence or machine learning, this episode will be of particular interest to you because he talks about the impetus for his idea, some tools he used to build his product, some challenges, how he hired his team, when he uses or discards algorithms, and how he packages his product. And you can even try a free version of his product, which he mentions at the end of the show.

Per Damgaard Husted: My name is Per Damgaard Husted. I’m the founder and CEO of Canecto. Canecto is a new way of doing web analytics based on machine learning, and the reason we do machine learning is because we want to understand the intention of the users so that we can predict how they are interacting on the website. We focus a lot on how content influences people to make decisions on a website, so it sort of complements the user journey that you have and the UX and the SEO, but we focus on the content.

Curtis: So how did Per come up with this idea of extracting insights from users’ interaction with content?

Per: The background was that actually I needed this tool. I was a manager in one of the big Danish banks, and I was in charge of the online banking elements, and we got a lot of traffic statistics about what’s going on, but I didn’t really know anything about the users’ intent. I wanted to make our website better. I wanted to understand what motivates them. I wanted to understand what content we produced. We produced a lot of content in the bank, and we had no tools that could explain how the users’ interaction with the content drove them to take specific actions, and we really needed this, and then I couldn’t find any of those tools. I don’t have a technical background, but I know that machine learning is pretty good at finding patterns based upon users’ behavior, so I thought that there might be a way of doing this, so that’s how it started.

Curtis: Personally experiencing a pain point like this as a manager and having an idea of the strengths of machine learning helped Per create his own solution to the problem. He got a team together to build machine-learning algorithms, and he finally was able to determine what content on a website drove users to take action.

Ginette: So how did he finance his startup journey? It helped that the Danish government liked what he was doing enough to partly fund it.

Per: I was an IT manager in the bank, right? So I worked with tech people all the time, and I have an understanding of how you can use and apply technology, so we sort of built a proof of concept of how this could work, right? So we showed that machine learning can actually be used to explain people’s interaction with content. So we got some findings there, and then in Denmark, we have something we call pre-seed funding, which is if you have like an interesting project, then we have some companies that [are] sponsored by the government that go in and finance such companies so they can get sort of a kickstarting process. So we are financed partly by the Danish government that has invested in us, and then in three to five years’ time they would withdraw the money and invest in other similar companies like ours, and in that time of course, we should be up and running for ourselves.

We are very much focused on understanding content, so we use natural language processing to understand the content of all the websites that we analyze. We have some algorithms that understand each page of the website, and they understand the importance ranking of those pages relative to each other, so they know which is the most important page about a topic. If you have written about WordPress, then this is the core WordPress page and so on, so the machine learning knows this, and the same goes for the pictures and all the other content elements you have on your website. So we use machine learning for that, and then we use some machine learning to learn about the users’ behavior, so we know their preferences, if they prefer long text or short text and so on. Right now, we have applied about 20 different AI models, as we call them, that do different tasks within the application.
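The page-importance idea Per describes can be illustrated with a small sketch. This is not Canecto’s method, which the episode does not detail. It assumes a simple TF-IDF weighting as a stand-in, and the function names, URLs, and page texts are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def rank_pages_for_topic(pages, topic):
    """Rank page URLs by the TF-IDF weight of `topic` on each page.

    `pages` maps a URL to that page's text (hypothetical input format).
    Pages that never mention the topic are left out of the result.
    """
    term = topic.lower()
    counts = {url: Counter(tokenize(text)) for url, text in pages.items()}
    # Document frequency: how many pages mention the term at all.
    doc_freq = sum(1 for c in counts.values() if c[term] > 0)
    if doc_freq == 0:
        return []
    # Smoothed inverse document frequency, always positive.
    idf = math.log((1 + len(pages)) / (1 + doc_freq)) + 1
    scores = {}
    for url, c in counts.items():
        if c[term] > 0:
            # Term frequency: share of the page's words that are the term.
            scores[url] = (c[term] / sum(c.values())) * idf
    return sorted(scores, key=scores.get, reverse=True)


# Made-up example site.
pages = {
    "/wordpress-guide": "WordPress setup guide. Install WordPress, "
                        "choose a WordPress theme, add plugins.",
    "/blog/news": "Company news and a short note about our WordPress plugin.",
    "/shipping": "Shipping times, returns and terms.",
}
ranking = rank_pages_for_topic(pages, "WordPress")
print(ranking)
```

In this toy site the guide page ranks first because the topic makes up a larger share of its text. A production system would likely need much richer signals than word counts.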

Ginette: Per partly built his team by working with people at the Technical University of Copenhagen.

Per: I’m right now sitting in the Technical University of Copenhagen because they have some pretty good AI or data scientists working here, so we started working with students actually. And then as some of the students graduated, we hired them as full-time employees, so that works really well for us because our CTO, he has 20 years of background from the IT business as an IT manager, so he knows the processes and the pitfalls, right? But he doesn’t know the new technologies, so we have a guy who knows about how to build applications, and we have a team that knows how to work with the latest technologies, and that’s a good fit.

Curtis: What tools did he use to build his company?

Per: Everything is run on Amazon Web Services, and then we use Python for everything basically, some Java. And we have designed our own models that do everything, because that was the easiest and fastest, and now we can sort of easily reproduce new ones, so that works well. Of course we use some standard components, but the models are designed by ourselves. We built them, I wouldn’t say from scratch because they use some libraries, but they’re our own models that are not based upon other components, not that much.

Ginette: So how does he decide on algorithms to apply system wide, and what is the most important thing he learned while building his machine learning business?

Per: The most important thing is to have an AI-enabled infrastructure. It sounds very boring, but that was the learning that I got from the bank as well. It’s actually very easy for us to build the model, but what took a long time was to have the AI infrastructure that enables us to do so. All the things I mentioned before with the natural language processing and so on, that requires that all the findings are put in an infrastructure that allows you to run the AI/machine learning algorithms on top of that. So it took a lot of time for us to have this ready, but it also means now that we can apply new models very fast. It takes on average a week for us to build a new model, and then we scrap maybe two-thirds of them because they don’t give enough business findings. But one-third of them actually turn out to work pretty well, and then we would apply that, if it makes sense, into our system.

For us we don’t even have a legacy system. If you’re going to apply AI into a legacy system, then you’re really in trouble because none of them are designed to do so, so you have a lot of trouble, and I think you probably know that. That’s always what people complain about, but we built an infrastructure from scratch that is 100% designed to do only machine learning, and that’s what we’re doing, right? So that gives us a great advantage right now, I can see that, but it was tough building, I would say.

Curtis: Per is highlighting something that I’ve also seen over and over again in the machine learning and data science world—there are almost always more challenges to overcome in the infrastructure side than on the modeling side, and more so if you have old legacy data systems. The problem is in the plumbing. Thinking through the technologies that are going to store, manipulate, and move your data can get really complicated really quickly. And then you need to think about source control tools, data quality assurance systems, workflow and dataflow technologies, and more. But if you do a good job setting up this infrastructure, as Per and his team did, you can do amazing things.

Per and his team built a product that uses machine learning algorithms specifically with the singular aim of identifying what content converts a visitor on a website.

Per: We are very focused on what makes people convert, and when we say convert, it’s all the actions that you want them to take on a business website. That could be like sign up through a form, go to a specific page, download a PDF, whatever action creates business value. So what we want to use content for is to explain how the interactions people have with the content can help us understand why people take the actions they do on the website, thereby sort of giving insight into the intent behind their visit. Because if you think about it, the reason people interact with the website, or visit it in the first place, is not because of the UX or because of the SEO or anything, it’s because of the content of the website, so if you can understand what motivates them to interact with the content, then you can just produce better quality on a website.

Ginette: Per’s company offers a unique perspective because it opens up a different way to view website content.

Per: It’s a process because what they know when they see our findings is that . . . they would sort of see the content from the users’ perspective, and the way that the users experience the content is not the same as you would imagine, right, so you get a different perspective on where they have been, what their interests are, and so on, and we segment, we would show them different segments. We would show them the users who would convert, and the users that didn’t convert, what their interests are, and sort of what has been the difference between those two segments, and those are really interesting findings for what motivates people to take those specific actions.
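The converter versus non-converter comparison Per mentions can be sketched in a few lines. This is only an illustration under assumptions: the session format, the page names, and the "share of page views" measure are invented here and are not taken from Canecto’s product.

```python
from collections import Counter


def page_share(sessions):
    """Return each page's share of all page views in the given sessions."""
    views = Counter(page for s in sessions for page in s["pages"])
    total = sum(views.values())
    return {page: n / total for page, n in views.items()} if total else {}


def interest_gap(sessions):
    """Compare page-view share between converting and other sessions.

    Returns (page, gap) pairs, sorted from the pages that converters
    view relatively more to the pages they view relatively less.
    """
    converted = page_share([s for s in sessions if s["converted"]])
    others = page_share([s for s in sessions if not s["converted"]])
    pages = set(converted) | set(others)
    gaps = {p: converted.get(p, 0.0) - others.get(p, 0.0) for p in pages}
    return sorted(gaps.items(), key=lambda kv: kv[1], reverse=True)


# Made-up sessions: which pages were viewed, and whether the visit converted.
sessions = [
    {"pages": ["/product", "/shipping", "/terms"], "converted": True},
    {"pages": ["/product", "/shipping"], "converted": True},
    {"pages": ["/product", "/blog"], "converted": False},
    {"pages": ["/product"], "converted": False},
]
for page, gap in interest_gap(sessions):
    print(f"{page}: {gap:+.2f}")
```

A positive gap marks content that converting visitors spend relatively more of their page views on, which is the kind of difference between segments the interview points to.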

Curtis: So what kinds of insights are people gleaning from this software?

Per: First of all from ourselves, right? We learned a lot using our own software in terms of how we should design and what we should write and what people are interested in. Our sign-up process has changed a lot, and then also all of the web shops that use our software, they sort of get the same finding. It is that they are very good at writing product descriptions about what they’re selling, and that’s where their heart is, but they all learn that actually what matters to the users is also the boring stuff, right? It’s the terms and conditions and the shipping pages and so on. Those elements always pop up as something that is relevant to the users in order for them to make a purchasing decision, but the shops never pay enough attention to that, so that happens with every web shop we analyze.

And that’s because you see it from the user’s perspective, right? So it forces you to see how your users actually interact with the content not with your pages, but how they perceive and interact with what you’re selling them on the website.

What we have also done, we also want to automate the web analytical process because a lot of people don’t want to do web analytics and so on, so that is the other dimension that we are working on. We give users specific recommendations on how to improve the website without them having to do the web analytical process, and that market is really interesting because a lot of the people who need web analytics from their business perspective are not very analytical in their mindset because they’re sales people, they’re communication people, and they work differently, and they think differently. But they need the analytical findings in order for them to be better at their job.

Of course, they have some business insights that a system could never have, right? Because they know their customers, and they know their findings, and they know their products and so on, so they know a lot of stuff that is relevant and applicable, if you have to work with web analytics, but there are also a lot of tasks they do that can be automated and done faster, so you can say that using our tool makes them more business-oriented because they can spend more of their time facing the customers and doing the stuff that probably creates more business value than looking into spreadsheets and numbers.

So having a tool that has automated the web analytical process and gives them content recommendations brings great value. We talk to some really big companies, but even the small mom-and-pop web shops that are quite successful, they also need a tool like this in order for them to do just better than what they are doing.

I don’t think that web analytics is something that is going to have that much attention in the future. People need the web analytics outputs, right? But the task of a human performing web analytics will be obsolete because algorithms can do this faster and better, and publish the findings and react to them in a much better way than any human will do, and that’s a different mindset, right? So that’s . . . in a way, we don’t compete with Google Analytics. We compete with the people who work with Google Analytics.

Ginette: So what’s been hard for Per as he’s built his company?

Per: Of course it can be frustrating, like, as an example, right? We haven’t really cracked pictures yet the way that we would like to. Obviously, pictures have an effect on how you interact with content on a website, right? And we have looked into different elements like, is it the picture size? Is it the number of pictures on the page? And so on, so we have different theories on what could give business value and what could be interesting to know, and then we design models around this to see if, like, picture size really does matter in terms of how users act on the website, and maybe it does, and maybe it doesn’t, and if it doesn’t, then we don’t apply it even though the model works, so we’re sort of testing it to see different elements of that, and we have the same issue with video. We have some findings with video, we can show something. We just don’t think that what we have right now is good enough, so we’re building some new ones that can give us better business results.

We will crack it eventually. We just need to figure out, you know, how to do it in a way that gives credible business value, because if it’s not substantial business value . . . if you can’t change your website based upon the recommendations that you get, then it’s not really value. Then it’s just noise for your users because then you would tell them something that they can’t change or apply in their world. Our benchmark is always that we have to be able to give recommendations and insights, and if we can’t do this, then we’re just providing data, and we don’t want to provide data. We want to provide something that has higher value than that.
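The "does picture size really matter" check Per describes could be sketched as a permutation test with an extra business-value gate. This is an assumption about how such a check might look, not Canecto’s procedure. The group data, the 0.05 significance level, and the `min_lift` threshold are all hypothetical.

```python
import random


def conversion_rate(outcomes):
    """Share of sessions that converted (outcomes are 0 or 1)."""
    return sum(outcomes) / len(outcomes)


def permutation_p_value(group_a, group_b, rounds=2000, seed=0):
    """Two-sided permutation test on the difference in conversion rate."""
    rng = random.Random(seed)
    observed = abs(conversion_rate(group_a) - conversion_rate(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(rounds):
        # Shuffle the labels and see how often chance alone
        # produces a difference at least as large as the observed one.
        rng.shuffle(pooled)
        diff = abs(conversion_rate(pooled[:n_a]) - conversion_rate(pooled[n_a:]))
        if diff >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (rounds + 1)


def worth_applying(group_a, group_b, alpha=0.05, min_lift=0.02):
    """Keep a finding only if it is both unlikely to be chance and
    large enough to act on (both thresholds are assumptions)."""
    lift = abs(conversion_rate(group_a) - conversion_rate(group_b))
    return permutation_p_value(group_a, group_b) < alpha and lift >= min_lift


# Made-up data: 1 = converted, 0 = did not convert.
large_pictures = [1] * 40 + [0] * 60
small_pictures = [1] * 10 + [0] * 90
print(worth_applying(large_pictures, small_pictures))
```

The second condition mirrors the point in the interview: a model can work statistically and still be dropped if the effect is too small to turn into a recommendation someone can act on.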

Curtis: Finally, we have a request from Per for our audience.

Per: I would like people to try it out to see how it works, and more than that, we would like feedback. It’s always nice to hear suggestions, how we can make things better. What works and what doesn’t work. You know, all these elements could be really great to get some knowledge about because we build this with feedback from lots of people, so the more they can tell us what they prefer and like and give hints and so on, the better.

Just go on to our website and there’s a chatbot and a contact form, so install the script, see how it works, and then take it from there. There’s a free version that you can use, so you don’t have to pay for it if you just want to try it out.

Ginette: To take advantage of this free version, go to Per’s company website at canecto.com, that’s c-a-n-e-c-t-o.

A huge thanks to Per for speaking with us, and for the show transcript and credits, go to datacrunchpodcast.com. And if you’ve liked our work so far, we’d love it if you left us a review on iTunes!

Sources

Photo: Photo by Christina Morillo from Pexels

Music:

http://freesound.org/people/frankum/sounds/157330/#
https://creativecommons.org/licenses/by/3.0/