April 24, 2014 Last Updated 10:15 am

Digitally Marketing: Machine learning for humans advertising online

Optimizing online advertising means getting your ads onto the mobile ad space that generates the highest number of installations (or subscriptions)

A computer can learn to optimize an ad campaign promoting your app the same way you learned to read. Understanding that gets you 95% of the way to understanding how your agencies and ad networks should be optimizing your ad campaigns.

How did you learn to read? You may have read out loud to an adult who helped you when you pronounced a word incorrectly. Maybe you struggled to pronounce a word the first two or three times you saw it. After seeing the word five or more times, you read it correctly. As you read more, you learned that some words are spelled the same but are pronounced differently in different contexts, and that other words are spelled differently but sound the same. The point is that you generally use the memory of what you’ve read in the past to help you read the word you have in front of you*. In addition, you learn that the context of a word in a sentence gives you clues about what the word means and how to pronounce it. Machines learn to optimize online advertising in the same way.


Troy McConnell

Optimizing your online advertising means you want your ads to appear on the mobile ad space that generates the highest number of installations (or subscriptions) for the money you spend. If you could spend $10,000 in 3 parallel universes on the same ad inventory using 3 different optimization techniques, and their conversion rates were 1%, 5% and 10% respectively, you would say the optimization that generated the 10% conversion rate was the best one. (By the way, spending $10,000 on 3 different kinds of ad inventory in the SAME universe to measure this is NOT the same thing – see my column here.)
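To put rough numbers on that comparison, here is a minimal Python sketch. The $10,000 budget and the three conversion rates come from the example above; the $0.50 cost per click and the function name `cost_per_install` are assumptions made purely for illustration, and the sketch assumes conversion rate means installs per click.

```python
BUDGET = 10_000.00      # dollars spent in each "universe" (from the example above)
COST_PER_CLICK = 0.50   # hypothetical price, identical in all three universes


def cost_per_install(budget, cost_per_click, conversion_rate):
    """Return (installs, dollars per install) for one optimization technique."""
    clicks = budget / cost_per_click
    installs = clicks * conversion_rate
    return installs, budget / installs


for rate in (0.01, 0.05, 0.10):
    installs, cpi = cost_per_install(BUDGET, COST_PER_CLICK, rate)
    print(f"{rate:.0%} conversion: {installs:,.0f} installs at ${cpi:.2f} each")
```

Under these assumptions the same budget buys roughly 200, 1,000 and 2,000 installs, so the price of each install falls from about $50 to $10 to $5.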

Everyone wants better conversion rates, but how do you get them? If you’re given a chance to buy an ad on a particular page or app, shown to a particular person, in a particular location, at a particular time of day – should you buy it? If so, what should you pay for it? And which one of your creatives should you run? Optimization should be helping you with these answers.

I honestly don’t know how your agency or ad network answers these questions for your campaigns but I do know you can make a simple machine to help give you more intelligent answers. How? Let’s start with a simple picture of the problem.


In my example, you are told the publisher page or app where the ad space is located, which user will see the ad, and other information like their location and the time of day. I want to make a machine that helps you decide whether to buy the ad space, how much to pay, and which of your 3 ads to run.

Machine learning, data science and artificial intelligence sound like rocket science. And, I guess the math and computer science behind it can be cumbersome. However, one technique that turns out to be both powerful and useful for online targeting is to use a very simple decision strategy that starts by guessing and gets smarter by remembering what happened in every past combination of factors – just like how you learned to read.

Let’s start simply. We are asked to decide whether to buy an ad space, and we are told the page/app, the person seeing it, the time, the location of the device and your 3 different ads. Should you buy it? The important point here is that you don’t need to generalize how your ads will do for ALL possible combinations of pages/apps, people, times and locations – you just need to know how they will perform with the SPECIFIC combination of page/app, person, time and location you’re given with this ad space. You do that just like you learned to read. If you’ve never seen the word before, you guess. But if you’ve seen the word and the context before – or a situation that is very similar to the current context – then you can make a pretty good attempt at reading it. So my machine is just going to remember how your ads did the last time it saw this SPECIFIC combination and, based on this knowledge, tell you whether to buy the space, which ad to run and how much to pay.
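The "remember what happened with this exact combination" idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual algorithm behind any product: the class name `AdMemory`, the `value_per_install` parameter, and the rule "bid up to the expected value of the impression" are all hypothetical choices. A combination that has never been seen is simply declined here, which is a simplification.

```python
from collections import defaultdict


class AdMemory:
    """Remembers, for each exact combination, how each ad performed."""

    def __init__(self):
        # (page, person, hour, location) -> ad name -> [times shown, installs]
        self.stats = defaultdict(lambda: defaultdict(lambda: [0, 0]))

    def record(self, combo, ad, installed):
        """Store the outcome of showing `ad` to this combination once."""
        entry = self.stats[combo][ad]
        entry[0] += 1
        entry[1] += int(installed)

    def decide(self, combo, value_per_install):
        """Return (buy, ad, max_bid) for an offered ad space."""
        if combo not in self.stats:
            # Never seen: this sketch declines (a simplification).
            return (False, None, 0.0)
        best_ad, best_rate = None, 0.0
        for ad, (shows, installs) in self.stats[combo].items():
            rate = installs / shows
            if rate > best_rate:
                best_ad, best_rate = ad, rate
        # Assumed bidding rule: pay at most the expected value of one impression.
        max_bid = best_rate * value_per_install
        return (best_ad is not None, best_ad, max_bid)


memory = AdMemory()
combo = ("news_app", "user_42", 14, "Los Angeles")
memory.record(combo, "ad_A", installed=False)
memory.record(combo, "ad_B", installed=True)
print(memory.decide(combo, value_per_install=2.00))
```

A plain lookup table like this only works for combinations it has already seen, which is the gap the fallback described next is meant to close.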

Here’s how my machine works. Each time I get a chance to buy an ad space, I’ll look at how your ads performed the last time I saw the same page/app, person, time and location combination. What if I’ve never seen this person before? Then I’ll look at just the page/app, time and location combination and determine how your ads performed. What if I’ve never seen this page/app before? Again, I’ll use what I know about how your ads performed at this time and location. Sure, I’ll be guessing about this particular person or this particular app/page, but just like learning to read, my guess will be better than if I didn’t use what I already know about this opportunity to advertise. So, just like reading, my machine will start out making mistakes but will soon master the art of telling you which ads will work (or not work) for every ad space you are offered.
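The fallback order described above (full combination first, then without the person, then without the page/app) can be sketched as a "back-off" lookup. The following Python is a minimal illustration; the names `LEVELS`, `BackoffMemory`, `record` and `lookup` are hypothetical, and a production system would presumably also weigh how much history each level has rather than taking the first match.

```python
from collections import defaultdict

# Levels from most to least specific, following the order in the text.
LEVELS = [
    ("page", "person", "time", "location"),
    ("page", "time", "location"),   # used when the person is new
    ("time", "location"),           # used when the page/app is new too
]


class BackoffMemory:
    def __init__(self):
        # (level, values) -> ad name -> [times shown, installs]
        self.stats = defaultdict(lambda: defaultdict(lambda: [0, 0]))

    def _key(self, level, opportunity):
        return (level, tuple(opportunity[field] for field in level))

    def record(self, opportunity, ad, installed):
        # Update every level so the coarser keys build up history as well.
        for level in LEVELS:
            entry = self.stats[self._key(level, opportunity)][ad]
            entry[0] += 1
            entry[1] += int(installed)

    def lookup(self, opportunity):
        """Return (level used, {ad: install rate}) from the most specific match."""
        for level in LEVELS:
            key = self._key(level, opportunity)
            if key in self.stats:
                rates = {ad: installs / shows
                         for ad, (shows, installs) in self.stats[key].items()}
                return level, rates
        return None, {}   # nothing known at any level: the machine must guess


memory = BackoffMemory()
seen = {"page": "news_app", "person": "user_42", "time": 14, "location": "LA"}
memory.record(seen, "ad_A", installed=True)

new_person = dict(seen, person="user_99")
print(memory.lookup(new_person))   # falls back to page + time + location
```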


Why does this work? Imagine how good your reading skills would be if you could read every book in the library. My machine doesn’t read books, but it can “see” and “learn” from billions and billions of ad space opportunities. In a very short time, it will have encountered more than 90% of the combinations of page/app, person, location and time it is offered, so it will be very good at predicting how your ads will do on them.

So what happens when you get a new set of creatives? The machine won’t know EXACTLY how they will perform on a particular combination of page/app, person, time and location you are offered but it will know how those combinations worked for you in the past and make a much more intelligent guess than starting over – kind of like when you see a word in another language and use your English reading skills to pronounce it.

How is this different? Many optimization implementations I’ve seen pick only one component – advertise to ONLY this audience (behavioral), or ONLY on these content pages/apps (contextual) – or they combine just two components. My machine AUTOMATICALLY takes into account millions of possible combinations and how they relate to one another. For example, it knows whether an ad space at 2pm in Los Angeles shown to men under 35 is worth buying for your ads. You might think it would take a lot of disk space to store all these combinations. It can. But my machine encodes these combinations using clever mathematical models called logistic regression and neural networks. As a result of this encoding, the amount of disk space needed is a lot less than you might imagine.
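To give a feel for how a model can stand in for a giant lookup table, here is a small logistic regression written in plain Python. It is a sketch under assumptions, not the encoding described above: the feature hashing step, the constant `NUM_WEIGHTS`, and the functions `features`, `predict` and `train` are all choices made for this illustration. The point is that the model's size is fixed by `NUM_WEIGHTS`, however many combinations of factors it sees.

```python
import math
import zlib

NUM_WEIGHTS = 2 ** 12   # hypothetical size; fixed regardless of how many combinations exist


def features(opportunity, ad):
    """Hash each factor, and each pair of factors, into a fixed range of weight slots."""
    parts = [f"{k}={v}" for k, v in sorted(opportunity.items())] + [f"ad={ad}"]
    pairs = [a + "&" + b for i, a in enumerate(parts) for b in parts[i + 1:]]
    return sorted({zlib.crc32(t.encode()) % NUM_WEIGHTS for t in parts + pairs})


def predict(weights, idxs):
    """Estimated probability of an install for the hashed features `idxs`."""
    z = sum(weights[i] for i in idxs)
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=200, lr=0.1):
    """Fit the weights by stochastic gradient descent on the log loss."""
    weights = [0.0] * NUM_WEIGHTS
    for _ in range(epochs):
        for opportunity, ad, installed in examples:
            idxs = features(opportunity, ad)
            error = predict(weights, idxs) - installed
            for i in idxs:
                weights[i] -= lr * error   # each active feature has value 1
    return weights


history = [
    ({"page": "news_app", "hour": 14, "location": "LA"}, "ad_A", 1),
    ({"page": "news_app", "hour": 14, "location": "LA"}, "ad_B", 0),
    ({"page": "game_app", "hour": 2, "location": "NYC"}, "ad_A", 0),
    ({"page": "game_app", "hour": 2, "location": "NYC"}, "ad_B", 1),
]
weights = train(history)
offer = {"page": "news_app", "hour": 14, "location": "LA"}
for ad in ("ad_A", "ad_B"):
    print(ad, round(predict(weights, features(offer, ad)), 3))
```

The pairwise features are what let the model learn that an ad works in one context and not in another, and because related combinations share weights, it can also produce an estimate for a combination it has never seen exactly.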

So, there you go. That’s how sophisticated software can help optimize your ad campaigns promoting your mobile apps. If you’ve gotten this far then you’re fully qualified to say “understanding big-data machine learning algorithms is as easy as reading a children’s book!”

*Notice how “read” and “read” are pronounced differently in that sentence because of the context?

Postscript. I’ve had my head down producing a new optimization algorithm for my company. If you’ve written me an email in the last month and haven’t gotten a reply, I apologize. You may need to write me again because I have a habit of just deleting all my old unanswered emails rather than digging through them.

Troy McConnell is CEO at AudienceFUEL.
