Byron Ellis is the Chief Technology Officer at Spongecell, a Creative Management Platform that helps advertisers enhance the power of digital brand creative by leveraging customer data and brand content to personalize ads for maximum relevance. We spoke with Byron about ad optimization, the evolution of big data, the ins and outs of algorithms and more.
About Byron Ellis
ML: First off, would you mind telling us about your background?
BE: I actually got my start in the life sciences. I did a lot of lab work with a technique called flow cytometry, which is used in the study of autoimmune diseases like lupus and rheumatoid arthritis. When I started out in the 1990s, the medical field was where you went if you wanted to do big data, since it wasn’t really a formed industry yet. If you wanted to work on large data sets, you did work in physics, or you went to the SLAC National Accelerator Laboratory at Stanford University. Or, like I did, you ended up working on the Human Genome Project in a lab.
I worked on the first assembly of the genome and then worked on microarrays after that, which were some of the first high-throughput biology experimental devices. Eventually, after ten years or so, I wanted to do something else. I saw that the company AdBrite had advertised in Amstat News, which is the American Statistical Association’s monthly magazine. The ad was about statisticians working on anti-fraud, and that piqued my interest, so that’s how I ended up in advertising. I built out anti-fraud models for the Empire Exchange, which was one of the larger exchanges at the time.
ML: You mentioned that when you were getting started, digital marketing and big data weren’t connected at all. What do you think has changed to make big data such a hot topic right now in the digital industry?
BE: I think two things changed. First, the cost of collecting data became smaller than the cost of not collecting data. Storage is so cheap that the opportunity cost of missing out on something actually exceeds the cost of just keeping everything. So now, you can all of a sudden afford to collect all the data, whereas before, you had to decide whether to throw it out of your data warehouse. I think that was the main turning point. Then, we started seeing how data could be applied to make money and provide value. Google was the trailblazer here by collecting search data and then actually using it to make money. Once people realized they could do the same extremely cheaply and easily, with commodity hardware instead of extremely expensive Oracle instances, it exploded.
ML: You’ve been in the digital analytics space for about six years. What keeps you interested in the field? Will you continue on this path, or will you ever go back to life sciences?
BE: I don’t know if I’m going to stay in advertising forever or not–I would guess not. The techniques are sort of spreading out, though. The digital marketing space is good in the sense that it gives you access to a lot of data relatively consequence-free; nobody dies if I show you a McDonald’s ad instead of a Pepsi ad, which is unlike life sciences. You have the ability to work with very rich, high-volume data relatively safely. And there are so many techniques to analyze that sort of data, and those techniques are becoming more and more applicable to other spaces.
ML: Let’s talk a little bit about Spongecell. How would you describe what your company does, and what business value do you bring to the table?
BE: The business value that we bring is largely around personalization. Our industry niche is what we call “programmatic creative,” which is a relatively new name that’s come up within the last year or so. The main focus is ad personalization. There have been point solutions to do this for a long time, like ads that follow you around the Internet. For example, this happens when you look at a shoe one time in an online ecommerce store, and then you see an ad for that shoe on every website you go to for weeks after. Amazon has a more sophisticated application, a purpose-filled one that shows you products. It’s a little bit more subtle, and it also tests how certain groups may respond differently to variations in imagery.
But to personalize traditional advertising, there would literally be a room of designers who have to make every single variation by hand. Say you have a Honda ad with one variation for every model they make; somebody has to actually sit there and make those variations by hand. So, where Spongecell got started was in automating that process. We work on the back end to integrate all the different assets you need. We help lay out all the elements, and you can tell the program to make specific changes in each part of the ad. Then, you can easily test to see which ad is the best by trying different images and then optimizing based on your results.
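As a rough illustration of what automating the variations means, the sketch below fills every slot of an ad template with every allowed value. The slot names, car models, headlines and image file names are hypothetical examples, not Spongecell's actual data model.

```python
from itertools import product

# Hypothetical creative slots and the assets allowed in each one.
slots = {
    "model": ["Civic", "Accord", "CR-V"],
    "headline": ["Test drive today", "See local offers"],
    "background": ["city.jpg", "mountain.jpg"],
}

def build_variations(slots):
    """Return one dict per ad variation, covering every combination of slot values."""
    names = list(slots)
    combos = product(*(slots[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]

variations = build_variations(slots)
print(len(variations))  # 3 models x 2 headlines x 2 backgrounds = 12
print(variations[0])
```

Twelve variations from three short lists is the point: the count grows multiplicatively, which is why producing them by hand does not scale.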
ML: Can you describe how your solution decides which creative assets to use?
BE: We usually have one KPI that we track, and that’s either a direct measure of what we want or a proxy. So in the car industry, it’s going to be a proxy measure of some kind. We know that people who click on the ads, for example, are more likely to come in for a test drive. So our proxy signal is going to be the click.
If you’re optimizing, you always have to have something to optimize against. And from there, we use our contextual multi-armed bandit models to actually choose the ad in real time. The idea behind the bandit model is pretty simple: say you have this casino that’s full of slot machines, and the assumption is that some are going to pay out better than others. You have a bucket of money, and your goal is to maximize the return on that bucket of money. The best way to do that is to play the best machine. But you don’t know where the best machine is, and the only way to find out is to play the game. So then, the game becomes, “How do I allocate my bucket of money across this casino to maximize my return?”
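The casino analogy can be sketched in a few lines. This is a minimal simulation, assuming Thompson sampling (which the interview mentions later) with a Beta posterior per machine as the allocation rule. The payout probabilities and the function name are made up for illustration.

```python
import random

def thompson_bandit(payout_probs, pulls, seed=0):
    """Play `pulls` rounds across slot machines with hidden payout probabilities."""
    rng = random.Random(seed)
    n = len(payout_probs)
    wins = [0] * n
    losses = [0] * n
    for _ in range(pulls):
        # Draw a plausible payout rate for each machine from its Beta posterior.
        samples = [rng.betavariate(wins[i] + 1, losses[i] + 1) for i in range(n)]
        # Play the machine whose sampled rate is highest.
        arm = max(range(n), key=samples.__getitem__)
        if rng.random() < payout_probs[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return wins, losses

# Hypothetical casino: the third machine is the best, but the player does not know that.
wins, losses = thompson_bandit([0.05, 0.10, 0.30], pulls=5000)
plays = [w + l for w, l in zip(wins, losses)]
print(plays)
```

Early on the plays are spread across all machines. As evidence accumulates, most of the budget shifts to the machine that pays best, which is the allocation problem described above.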
Then, you start taking other information into account that you already know, which is where the contextual multi-armed bandit comes in. For example, in a casino, we know that the slot machines near the door are going to be looser, but will pay out less than the slot machines further into the casino, because the goal is to get you into the casino and buying drinks. So, you can take that information into account to predict the payout probability of pulling a particular arm, or to predict what your KPI would be, given the context of your impression on an ad. When you gather more and more data points (like which cars you’ve already looked at, whether you’re male or female, what time of day it is, etc.), then you can choose the direction of your ad.
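A very simple way to picture the contextual version is to keep separate statistics for each combination of context and ad. The sketch below assumes a single context feature (time of day), two hypothetical ads, and made-up click rates. A production system would more likely use a regression model over many features than a lookup table.

```python
import random
from collections import defaultdict

class ContextualThompson:
    """Thompson sampling with one Beta posterior per (context, ad) pair."""

    def __init__(self, ads, seed=0):
        self.ads = list(ads)
        self.rng = random.Random(seed)
        # (context, ad) -> [alpha, beta], starting from a uniform Beta(1, 1) prior.
        self.stats = defaultdict(lambda: [1, 1])

    def choose(self, context):
        def draw(ad):
            alpha, beta = self.stats[(context, ad)]
            return self.rng.betavariate(alpha, beta)
        return max(self.ads, key=draw)

    def update(self, context, ad, clicked):
        self.stats[(context, ad)][0 if clicked else 1] += 1

# Hypothetical click rates: which ad works better depends on the time of day.
true_ctr = {
    ("morning", "sedan_ad"): 0.20, ("morning", "suv_ad"): 0.05,
    ("evening", "sedan_ad"): 0.05, ("evening", "suv_ad"): 0.20,
}

model = ContextualThompson(["sedan_ad", "suv_ad"])
env = random.Random(1)
shown = defaultdict(int)
for _ in range(4000):
    context = env.choice(["morning", "evening"])
    ad = model.choose(context)
    clicked = env.random() < true_ctr[(context, ad)]
    model.update(context, ad, clicked)
    shown[(context, ad)] += 1

print(dict(shown))
```

Without the context, the two ads look equally good on average. With it, the model learns to show a different ad in the morning than in the evening.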
ML: How does a company get started using this technology? Are there any prerequisites that the client needs to have in order for the engagement to be successful?
BE: To get started, you pretty much just have to have variations. It helps to have more information, but we don’t necessarily need it.
ML: So, it sounds like you collect information about the performance of different creative assets based on who they were shown to. How do you use that information to optimize your algorithm?
BE: We optimize the internal processes that we use for the Thompson sampling, which is one piece, but we also use that data to optimize the future performance for that particular customer.
So, there are two things going on. Because the algorithm itself is very simple, you can make some marginal improvements by setting certain parameters, but there’s not a whole lot there that will actually drive performance. You can play around with different ways of finding your nonlinear regression, etc., but that involves marginal performance. The algorithms are driven more by the amount of data now.
The more interesting optimization problem is, “How do we use data in the longitudinal pattern? How do we keep the results from one campaign for a customer and actually use that information again later?” We want to enrich that data and make it more valuable. But this is not actually the normal situation, because the advertising world is very transactional. Most companies show up, run a campaign, end the campaign and then get their results–and that’s it. They may or may not show up for another campaign, and you begin again from scratch, basically.
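One way to picture carrying results from one campaign into the next is to turn the old results into a prior for the new campaign instead of starting from scratch. The sketch below assumes a Beta-Bernoulli click model and a hypothetical `discount` factor that down-weights older data. It is an illustration of the idea, not a description of Spongecell's implementation.

```python
def carry_over_prior(clicks, impressions, discount=0.5):
    """Turn a finished campaign's results into a Beta prior for the next campaign."""
    alpha = 1.0 + discount * clicks
    beta = 1.0 + discount * (impressions - clicks)
    return alpha, beta

def posterior_mean(alpha, beta, clicks=0, impressions=0):
    """Expected click rate after observing new data on top of a Beta(alpha, beta) prior."""
    return (alpha + clicks) / (alpha + beta + impressions)

# Hypothetical numbers: the previous campaign got 50 clicks from 1,000 impressions.
prior = carry_over_prior(clicks=50, impressions=1000)

# The new campaign has only 100 impressions so far, with a lucky 10 clicks.
cold_start = posterior_mean(1.0, 1.0, clicks=10, impressions=100)
warm_start = posterior_mean(*prior, clicks=10, impressions=100)
print(prior, round(cold_start, 4), round(warm_start, 4))
```

The cold-start estimate swings toward the early lucky streak, while the warm-start estimate stays closer to the rate seen in the earlier campaign until the new data outweighs it.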
ML: One more question: What are the next plans for Spongecell? Do you have any releases coming up in 2016?
BE: We’re releasing constantly, so there’s always new stuff. The next version of our optimization algorithm is coming out at the end of this quarter. This will actually switch up the way we track and assign audience groups. The process is called “audience-based optimization,” and the primary purpose is to track and assign audience groups. (I’m on the hook for writing a lot of code for that.) Then, the thing that we always get asked about is “insight-x.” Basically, we have these predictive models under the hood, and we want to surface how those models are working and what parameters they’re using, because they’re actually using feature selection to make decisions. So, we want to surface that information for the users to access for two reasons: one, because they might think it’s interesting; and two, because we know they’re not just working with us. Even if they’re doing all their digital with us, there are other channels like TV, print, etc. where they could potentially use this information.
1. The big turning point for big data was when the cost of collecting data became smaller than the cost of not collecting data. Storage is so cheap that the opportunity cost of missing out on something actually exceeds the cost of just keeping everything.
2. The digital marketing space is good in the sense that it gives you access to a lot of data relatively consequence-free; nobody dies if I show you a McDonald’s ad instead of a Pepsi ad.
3. The business value that Spongecell brings is largely around ad personalization. The industry niche is called “programmatic creative,” a newer term in recent years.
4. When optimizing, there always has to be something to optimize against. With that KPI in place, Spongecell uses its contextual multi-armed bandit models to choose the best ad in real time.
5. To get started in optimization with a solution like Spongecell, you need to have variations. It helps to have more information, but it’s not necessarily needed.
6. It is optimal to enrich existing data and make it more valuable. But the advertising world is very transactional. Most companies show up, run a campaign, end the campaign and then get their results–and that’s it. They may or may not show up for another campaign, and you begin again from scratch.
7. Spongecell aims to capture data and use it long-term as part of the optimization process. Since their models are Bayesian, they have a way of integrating that data into the next campaign.
8. Spongecell has predictive models under the hood that use feature selection to make decisions. Spongecell wants to surface that information for users to access (this is the work referred to as “insight-x”).