Interview with Dan McCarthy: Customer Centricity + Predict Disruption

Estimated Reading Time: 19 minutes
March 31, 2016

Dan McCarthy is the co-founder and chief statistician at Zodiac, a predictive analytics solution. We talked to Dan about statistics, customer-based corporate valuation, lifetime value measurement and more.

Key Learnings:

1. In customer-based corporate valuation, the process is greatly simplified if you’re able to access internal transaction logs of the company you’re working with.

2. In order to effectively do customer-based valuation, you want to know the number of customers that have been acquired over time, and the total number of customers who have not churned yet over time. This is difficult for certain businesses where it’s impossible to know the total size of the “active” customer base at any given time (like Amazon.com, for example).

3. The original idea for Zodiac was conceived out of an unrelated project on childhood asthma. Dan and his partner, Pete, usually use probability models to predict what customers will buy in the future, but they found that you can actually predict the incidence of asthma in the future, as well.

4. For Zodiac, it’s been easiest from a sales standpoint to tackle retail-type businesses, because it’s easy to prove ROI there.

5. Dan wants to make better predictions for businesses that are hit by seasonality or important one-time events. He also wants to expand his focus on mobile gaming, where there is access to the customer’s visit habits as well as their purchasing behavior.

6. In mobile gaming, about 99% of sales are driven by less than half of 1% of customers. Predictive behavior tells you just how many of these customers are going to come back to visit, and then you can model conversions as opposed to modeling purchases directly.

7. CPG businesses should push back on outside distributors to provide them with statistics in order to assess how well their product is being accepted in the market.

8. If a company wants to do a preliminary analysis themselves, the R package would be an excellent option since it can handle reasonably sized customer cohorts. But to be able to operationalize CLV, your data infrastructure needs to be in a good place, your capture rate should be as high as possible and your transaction systems need to talk to one another.

9. A software service like Zodiac is fairly easy to get off the ground because the primary source of information that they use is just a transaction log.

10. The challenges of implementing customer valuation work at scale: you have to have extremely robust code so the algorithms maintain integrity when numbers get big; you have to have a single system that works across many companies simultaneously; you have to figure out how to get data quickly into the cloud to get fast results.

11. Zodiac operates with a customer-centric mindset; they maximize the value that they get from their customers while maximizing their value to their customers at the same time. This is the opposite of product centricity, which is what’s more commonly adopted in companies.

Interview:

ML: Welcome, Dan; thanks for talking with us today. So, I know that you’re currently pursuing your Ph.D in Statistics. Could you tell me a little bit more about how you got to where you are right now?

DM: I graduated from University of Pennsylvania’s Management and Technology Program in 2006. It’s a really great program that allows you to get degrees in both engineering and The Wharton School. And back in 2006, the thing to do when you finished was to go into finance. I love financial theory, so I did what all the cool kids did and went off to a hedge fund. I was there until 2011 and did a lot of work with company valuations. But really, my biggest passion is statistics and prediction analysis. I figured this was probably the last chance that I had to do it, so I decided to take a fairly drastic pay cut and go back for my Ph.D so I could spend the rest of my life focusing on cool prediction problems. It was probably one of the best decisions I’ve ever made.

I came back to the University of Pennsylvania, where I got my undergraduate degree, and I started working with Pete, my Zodiac co-founder, in my second year of the program. All of my work thus far has been primarily prediction-based, where you model systems and predict what’s going to happen in the future. Coming from my background of company valuations, the idea of customer valuations was both very natural and also extremely interesting. It’s the same type of work, but on a customer level instead of a company level.

ML: You mentioned that you initially went into the financial sector. How did you like it? What made you decide to leave?

DM: The work that we did was primarily based in fundamental valuation. We did some quantitative screening to identify companies that could be good prospects for deeper dives. Once we took the deep dive, we’d build out the valuation models, speaking with the management team and doing all the industry analysis that you would do as a fundamental value shop. The game of valuations was one that I really enjoyed during my experience there, so it’s been a recurring theme throughout my work since.

I also published one paper with a professor here, Shane Jensen, which involved stock price predictions using a time series forecasting methodology we invented. So, the finance never really left me; even my latest paper talks about doing corporate valuations from the bottom up, which means valuing the customers and using that customer data to inform our valuation for how much the company should be worth. So, I guess you could say I actually do still work in finance, just in a different capacity.

ML: I am a huge fan of the Wharton Customer Analytics Initiative and what it does. As the resident data scientist for the initiative, what does your role entail?

DM: I first got involved with the Wharton Customer Analytics Initiative through the R package. There were a handful of people who built out the main version of the package, and I joined as a co-author. I just recently got involved with that organization, but I’ve run a few computing workshops for them, and I’ve also been one of the co-leaders of student career and class advice sessions. So it hasn’t been a full-time engagement, but it’s been great to help get folks up to speed on programming to help them build their customer analytics toolkits, and help them take the right courses to make sure that they’re able to get the most out of their college experience.

ML: You’ve mentioned your interest in valuations a couple of times, and I know that your Ph.D thesis is on customer-based corporate valuation. What does the work process look like for this type of valuation?

DM: First, it really depends on the nature of the data that’s available. If you’re able to access internal transaction logs of the company you’re working with, that greatly simplifies certain aspects of the valuation process. So, we’re going to drive certain key line items–most importantly revenue–based on this customer data. It’s hard to build a good model that will accurately project future customer acquisitions and how long the customer should be around before they leave.

From a private equity standpoint, we use transaction log data to train models for the acquisition and retention of customers over time. Hopefully, the company has transaction-level profitability data available. If they do, then we can drive what future gross profits are going to be. Then, we carry on with the discounted cash flow analysis as we normally would, which helps project out all the operational costs that are spread across the whole business, while also dealing with the capital structure and estimating the weighted cost of capital. So at a high level, that’s how it would proceed.
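
As a rough illustration of the process Dan describes above, here is a minimal Python sketch that projects active customers from acquisitions and retention, turns them into gross profit, and discounts the resulting cash flows. It is not Zodiac's model: the constant retention rate, the flat profit per customer, the fixed cost figure and all function names are simplifying assumptions made for this example, and it omits a terminal value and capital structure.

```python
def project_active_customers(acquisitions, retention_rate):
    """Active customers per period: each cohort decays by a constant
    retention rate (an assumption; real retention varies by cohort and age)."""
    active = []
    for t in range(len(acquisitions)):
        total = 0.0
        for cohort_start, cohort_size in enumerate(acquisitions[: t + 1]):
            age = t - cohort_start
            total += cohort_size * retention_rate ** age
        active.append(total)
    return active


def customer_based_value(acquisitions, retention_rate, profit_per_customer,
                         fixed_costs, discount_rate):
    """Discount (gross profit from active customers - fixed costs) to today.
    Period t's cash flow is assumed to arrive at the end of period t."""
    active = project_active_customers(acquisitions, retention_rate)
    value = 0.0
    for t, n_active in enumerate(active, start=1):
        cash_flow = n_active * profit_per_customer - fixed_costs
        value += cash_flow / (1.0 + discount_rate) ** t
    return value


# Hypothetical inputs: three periods of acquisitions, 80% retention per period.
print(customer_based_value([1000, 1200, 1500], 0.8, 25.0, 10000.0, 0.10))
```

In practice the acquisition and retention inputs would come from models trained on the transaction log rather than from fixed numbers like these.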

ML: You’ve mentioned that this type of valuation depends on the organization having a lot of these numbers, which makes sense. I’m sure you’ve encountered some organizations that do not have all of this data immediately available to them, which makes these types of calculations a lot more difficult. In your opinion, what are the key metrics an organization must absolutely have available to them in order to do any type of customer-based valuation?

DM: You ideally want to know the number of customers that have acquired your services over time, and the number of customers who have not yet churned over time. For certain businesses, it’s impossible to know the total size of the “active” customer base at any given time. Knowing this is much easier for contractual businesses, or subscription-based businesses, where customers let you know when they leave. This is more difficult if, let’s say, you’re looking at Amazon and you don’t necessarily know when customers leave, though you know when they enter. We have certain models that can handle that type of business as well, but it becomes a little bit trickier. I would assume this is going to be much easier for contractual businesses to implement.

ML: Let’s shift gears a little bit and talk about Zodiac, the company that you co-founded. What prompted you to start this organization?

DM: Pete has been building these models and publishing papers about customer valuation techniques for a long time. And when people expressed interest in implementing the techniques in their business, all he could do was give them the R package and some papers to read, so they were never able to fully engage the concept themselves.

It was actually a separate project on childhood asthma that got us thinking about Zodiac. We have always used these models to predict what customers will buy in the future, but we found that you can actually predict the incidence of asthma in the future, as well. So, we were applying the model to that scenario and thinking about how cool it would be to design a web dashboard for the doctor to be able to, very quickly, punch in a few numbers about when a patient should come back next for a prescription. As we were thinking about that, we said, “Why limit it to this? Our bread and butter is customer valuation, and we could do this on a much larger scale.”

It was around that time that we came into contact with Justin Bleich and Artem Mariychin. They’ve been instrumental in turning these models into a fully-scaled business that’s running for millions of customers at a time and generating revenue. It has gone from being a successful idea to a successful business.

ML: Are there any specific industries that you specialize in?

DM: We’ve been expanding the functionality of our model’s algorithm to be able to handle different types of businesses. I’d say it’s been easiest from a sales standpoint to tackle retail-type businesses, because it’s easy to prove ROI there. This also includes businesses like hospitality, and more generally, businesses that have many small customers that often make purchases sporadically. Those are the types of businesses that we do very well with, model-wise. The models just tend to work very well straight out of the box in retail applications.

There are a lot of things that we want to do better. We want to make better predictions for businesses that are very influenced by seasonality or other events. We also want to expand our focus on mobile gaming, where you’ve got access to the customer’s purchasing habits as well as their visit behavior. Those are examples of industries that we also have the ability to handle, but retail is the industry that we’ve been spending the most time on thus far.

ML: In terms of seasonal patterns and other disruptive events, how do you account for context?

DM: With the core models, we actually don’t account for that kind of context. It is important, but we’ve found that if you account for how much customers are going to spend and their potential longevity, you’re already accounting for most of the variation throughout your customer base and specifically for the game of computing lifetime value. Often, you’ll see weather patterns and other events affect short-term purchase behavior, but not the long run. Over the long run, the main thing that matters is that you get the baseline correct. The little blips average themselves out.

ML: You also mentioned mobile gaming. I’m not a huge gamer, but I certainly recognize that a lot of games and apps have more and more in-app purchases. Are you able to predict how different segments or user cohorts will spend within the app based on their previous experience?

DM: Exactly, yeah. That’s the mobile gaming model. It’s particularly challenging because, if you’ve ever looked at mobile gaming data, you know that it’s an extremely lopsided business. You have about 99% of these sales being driven by less than half of 1% of customers. So, it’s very hard. That’s why being able to incorporate predictive behavior is so important; it tells you just how many of these customers are going to come back to visit, and then you can model conversions as opposed to modeling purchases directly.
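
To make the two ideas in this answer concrete, the Python sketch below first measures how concentrated revenue is among top spenders, then estimates revenue by going through visits and conversions rather than purchases directly. It is only an illustration: the function names, the spend figures and the conversion inputs are all hypothetical, and the actual mobile gaming model is not described in the interview.

```python
def revenue_share_of_top(spend_by_customer, top_fraction):
    """Share of total revenue contributed by the top `top_fraction` of customers."""
    ranked = sorted(spend_by_customer, reverse=True)
    n_top = max(1, round(len(ranked) * top_fraction))
    total = sum(ranked)
    return sum(ranked[:n_top]) / total if total else 0.0


def expected_revenue(expected_visits, conversion_rate, avg_purchase_value):
    """Revenue estimated via visits: visits x P(purchase | visit) x value."""
    return expected_visits * conversion_rate * avg_purchase_value


# Hypothetical cohort of 200 players where a single player drives most sales.
spend = [990.0] + [0.0] * 198 + [10.0]
print(revenue_share_of_top(spend, 0.005))
print(expected_revenue(expected_visits=1000, conversion_rate=0.02,
                       avg_purchase_value=5.0))
```

Because almost every player has zero purchases, the visit data carries far more signal than the purchase data alone, which is the motivation for modeling conversions.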

ML: Since we are talking about different vertical segments, what about the CPG sector? Oftentimes in CPG, nothing is actually sold directly to the consumers; rather, the products are being sold through bigger distributors like Walmart. In this case, how would an organization think about lifetime value of customers?

DM: That’s a very good question. It would be very hard to apply our current models to CPG exactly for that reason. You don’t actually get to see end user demand. I was speaking with a colleague of mine about this, and she said that CPG businesses should actually be pushing back on the retailers to provide them with some statistics so they can assess how well their product is being accepted in the market. It would be very hard to connect the dots across the stores, but you’d think that bringing in both sides would help to drive sales so that everyone wins. So, my colleague thinks that’s where we’ll hopefully see a rising trend.

ML: If an organization is just getting started with measuring lifetime value in e-commerce or retail, what should it start doing? Are there certain templates or questions that it should begin asking in order to do a preliminary analysis?

DM: If a company wants to do a preliminary analysis themselves, the R package would be an excellent option since it can handle reasonably sized customer cohorts. Doing it this way, an organization is able to see whether there’s signal and to perhaps get buy-in without spending anything except the investment of time to learn how to use the package. But to actually be able to operationalize CLV, you need to have your data in a good place; making sure your capture rate is as high as possible and that your transaction systems are all talking to one another. Those are very important things to have in place in order to get predictions from these models that will actually be useful and representative of what the true net worth of your customer is.

ML: So, you run a software business that has built-in service, and you do onboarding or consulting with a very select group of people. How does an organization get started in a software service business? Does the service just pull data from tools like Google Analytics for you to analyze?

DM: The first thing that we usually do is engage the organization in what we call a “pilot.” That’s where the team gives us transactional data for a group of customers that they’d be interested in having us perform some work on. Then, we run the algorithm and output diagnostics for them to be able to see the predictive validity. Then, we start thinking about how to leverage these results to actually use them to drive ROI. That’s phase one.

The nice thing about this is that it’s usually fairly easy to get off the ground because the source of information that we’re using is just a transaction log. We’re not asking for personally-identifying information. We don’t need demographics or age or gender or anything like that. It’s very helpful if it’s available, but you’ll get most of the juice from the transactional log. Usually, it’s very clean; it almost has to be from an accounting regulatory standpoint.

ML: Earlier, you mentioned that it’s challenging to implement this type of work at scale. Can you take us through some of those challenges as they apply to different types of clients?

DM: There are a handful of challenges that make this type of work much more difficult to do for a full company than just for one small group of customers. For one, all of the algorithms are numerical, which can cause issues when you scale beyond a certain cohort size. The algorithm may work perfectly with 50,000 customers, but if you were to up that to 2,000,000 customers, then some numbers become so big that they get evaluated to infinity. I know that’s a little bit technical, but you essentially need to have extremely robust code, which we have.
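
The overflow problem Dan mentions is a general property of floating-point arithmetic, and one common remedy is to work with logarithms. The Python sketch below shows the effect on a product of many terms; it is a generic illustration with made-up numbers, not the code Zodiac uses.

```python
import math


def naive_product(terms):
    """Multiply terms directly; the running product overflows to infinity
    once it exceeds the largest representable float (about 1.8e308)."""
    result = 1.0
    for term in terms:
        result *= term
    return result


def log_space_product(terms):
    """Return the logarithm of the product by summing logs, which stays
    finite even when the product itself is astronomically large."""
    return math.fsum(math.log(term) for term in terms)


# Hypothetical per-customer terms; 200 of them already overflow a float.
terms = [1e3] * 200
print(naive_product(terms))      # inf
print(log_space_product(terms))  # finite log of the same product
```

The same trick handles the opposite failure, where a product of many small probabilities underflows to zero, so sums of log-likelihoods are usually preferred over products of likelihoods as cohorts grow.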

Another piece is making sure this works in all generality so we can have one single system work for 500 companies simultaneously on an ongoing basis, without any intervention. The great thing is, the models we have are so general right out of the starting gate that it’s made it much easier. But at the same time, there are a lot of situations where we thought that the dynamics would be one way and they go in a different direction.

The other element is getting everything into the cloud and being able to run some fairly involved algorithms very quickly, so that if you’re with a large company and you push the button that says “get me all the data now,” you’re not waiting for 60 minutes to be able to get those results. You want to have everything out as quickly as possible. The engineering side of things adds so much additional complexity, so we’ve been spending a lot of time to make sure we get that just right.

ML: One more question. I saw the term “customer centricity” a lot while I was researching your work. What does this term mean to you?

DM: To me, customer centricity is short for having a customer-focused mindset. It doesn’t mean that we need to do everything that the customer asks for; it’s more that we’re going to run our business in such a way that we maximize the value that we get from our customers while maximizing our value to them at the same time. Customer centricity is the opposite of product centricity, which is what’s more commonly adopted in companies. If we’re going to release any product into the market, we want it to grow the value of our customer base; we don’t just want to grow the value of that product. If a new product just cannibalizes sales of our existing products, then that’s not something that we want to pursue. So, by focusing on the value of the customer base, we can build a portfolio of products that is most consistent with our overall value.

ML: Perfect. So, I’ve heard that there is quite a bit of new technology coming out soon to help analysts (like machine learning and predictive modeling, to name a couple). What technologies are you the most excited about?

DM: One of the biggest reasons I came back for my Ph.D is because I see an amazingly bright future for this algorithmic technology. Obviously, hardware is getting better. Computing power is getting much stronger, and that’s been a big driver for our business as well, but if it weren’t for these predictive algorithms, we wouldn’t have anything to work with.

ML: One last question. You’ve been in this field now for a number of years. Looking back, what do you wish you knew when you were first getting started?

DM: That’s a very hard question. Not to sound too simplistic, but I would say that there is a part of me that wishes I had started working with Pete earlier. I’ve completely enjoyed everything that I’ve done, including all the work I’ve done with the amazing professors within the statistics department. But here I am in my fourth year, and I’m about to leave. It feels like the time just flew by. Obviously, the silver lining is that I’m not stopping when the program is done. I’m really excited to move on to the next phase of my life, which for me is going to be a combination of being a professor and continuing to grow this business.

Resources:

Connect with Dan on LinkedIn and Twitter

To learn more about Zodiac Metrics

If you would like to read more about his Ph.D work

Author

  • Michael Loban is the CMO of InfoTrust, a Cincinnati-based digital analytics consulting and technology company that helps businesses analyze and improve their marketing efforts. He’s also an adjunct professor at both Xavier University and University of Cincinnati on the subjects of digital marketing and analytics. When he's not educating others on the power of data, he's likely running a marathon or traveling. He's been to more countries than you have -- trust us.

Last Updated: September 6, 2023
