The Reality-Based Marketing Framework - Math as the Brain. GenAI as the Voice.

Great.

Introduction: The Math and Science Behind Targeting and Budgeting

I'm going to talk about the math and science at the front end of marketing technology. How do you decide which of your customers get which quality of video, which quality of content, how much content, and what your budget allocations are? I'm talking about the math and science behind selecting and defining all of that: all of the targeting and budgeting logic.

So the goal is to lead with the math first. It's the math and science, then budget allocation with optimization, and then execution with GenAI. I won't talk much about that last part since our first speaker covered it.

Speaker Background

Just a little bit about myself. My name is Gary Saarenvirta. I've been in machine learning for 35 years.

I was very lucky when I was at the University of Toronto studying aerospace engineering that Geoffrey Hinton was a professor at U of T, so I was exposed to machine learning. I would attend all his lectures and seminars on neural nets and machine learning, and I was fascinated by it.

After I finished my master's degree, there wasn't really an aerospace industry in Canada, and I didn't want to build weapons of death and destruction or move to the United States.

So I stayed here and found that there's a whole career to be built bringing math and science to business. I've spent my whole life doing that, using AI/ML and technology to help companies automate their processes and drive increased revenues and profits.

I ran my own business for 20 years, called Daisy Intelligence, which was sold a couple of years ago, and recently I've been doing advisory work.

I'm kind of an executive advisor and interim CTO, helping sell companies. I'm hands-on tech; I still build stuff hands-on every day, so I'm fully versed in this space. I'll start by talking about segmentation.

Overall Approach: From Segmentation to Optimization to Execution

The first step in deciding how much you should spend on which customers is really about understanding who your valuable customers are and which behaviors drive value, because you want to market to incent those behaviors. If you send a video about one of your products to a customer who doesn't buy that product today, to get them to buy it, that's how you cross-sell.

If you want to market premium products, you would create videos and content and emails that would incent that behavior. So the first step is understanding what are the value segments in your database? What are the different tiers of customer value?

This works in a B2B setting or a B2C setting. I'll talk about retail today because it's easy to understand, but you can apply this to retail banking, telecommunications, insurance, any industry where there are customers, whether they're consumers or businesses.

Start With the Right Data: Value vs. Behavior vs. Demographics

So the first thing is understanding data. This gets into clustering and segmentation: you need to separate all the different types of data.

There's shareholder value data. This is the data that identifies value. Typically it's historical spend, profitability, likelihood to continue spending, future potential spend. So it's all the value metrics around customers.

You want to separate all the data. You have some behavioral data, and you want to keep that separate from the value data, because otherwise you won't be able to correlate the two. You won't be able to identify which behaviors create value. To answer that question, you can't throw it all into a bucket and say, build me a segmentation with all this data.

Then you might have some demographic data. The demographic data informs the GenAI on how to write content. It's the attitudes and demographics that you can use to create meaningful content that appeals to the customer, to incent the behavioral change that will drive value. So all of these are considered differently and not mixed together.

Customer Value Distribution and Strategic Implications (80/20)

When you're looking at value, there's the 80/20 rule. This exists in every database I've ever seen: the top 15 percent of customers are 50 percent of value, the middle third is a third of value, and the bottom 50 percent is the bottom 15 percent. Of course, the distribution varies.

In the most extreme industries, like banking, I've seen 10% of customers be more than 100% of profits. Some customers lose money in banking. So if you look at the net contribution of customers, there's a very small number that makes up the majority.

So obviously, that's where you allocate budgets. You should spend more money on the customers that create more value. You should spend less money on the customers that don't create value.

Or you want to use your dollars to find customers who could be potentially valuable, right? So you want to align corporate expenditure and customer value. And typically, your strategies become obvious.
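To check how concentrated value is in your own customer base, a short profiling script is enough. The sketch below is an illustration, not the speaker's code: the function name `value_share_by_tier`, the 15/35/50 tier split, and the customer values are all assumptions made up for the example.

```python
def value_share_by_tier(customer_values, tier_percents):
    """Share of total value held by each tier, best customers first.

    tier_percents are percentages of the customer base and should sum to 100.
    """
    ranked = sorted(customer_values, reverse=True)
    total = sum(ranked)
    shares = []
    start = 0
    cumulative_percent = 0
    for percent in tier_percents:
        cumulative_percent += percent
        end = round(cumulative_percent * len(ranked) / 100)
        shares.append(sum(ranked[start:end]) / total)
        start = end
    return shares


# Hypothetical annual value for 20 customers (not data from the talk).
values = [500, 400, 300, 150, 130, 120, 110, 100, 100, 90] + [40] * 10
top, middle, bottom = value_share_by_tier(values, (15, 35, 50))
print(f"top 15%: {top:.0%}, middle 35%: {middle:.0%}, bottom 50%: {bottom:.0%}")
```

If the top tier's share is far above its share of customers, that is the signal to align spend with value as described above.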

If 15% of your customers are 50% of your value, then your goal is retention. You want to build retention-type marketing campaigns to incent them to stay. As the customer has earned value from you, you can now share value to lock them in and keep them.

The worst thing is credit card teaser rates. Giving a teaser rate to a new customer is a bad idea, because then customers just switch. But giving a great rate to a customer who's been around for a while encourages them to stay, because they know they've earned it and it's hard to replicate somewhere else.

Building Value Segments with Condorcet Clustering

So the first step in identifying value is to do some clustering. So one of my favorite clustering algorithms we use is called Condorcet clustering.

This is an IBM patent. It's an algorithm that's linear in the number of records times the number of clusters times the number of variables. So it's a very fast algorithm.

You can run this on tens and hundreds of millions of records. And typically we run it on the value features.

Normally the number of features you feed into a value segmentation is very small: five to nine features that define value. In retail it would be historical spend, historical profit, transaction frequency, consistency of spend, spend trend, and future potential. If you're a grocer, a household of four with young children would be a customer that has a lot of future potential value.

So, just the math behind Condorcet clustering: it's a similarity clustering method. This is what the algorithm is optimizing. It's basically saying that if two records, i and j, are in the same cluster, then for each of the k variables, if the kth variable has the same value in both records, that contributes to similarity. So it identifies how many features are similar between records and how many features are different. If you're in different clusters, you should have lots of differences; if you're in the same cluster, you should have lots of similarities. So it's an algorithm that optimizes record-by-record, variable-by-variable similarity within clusters and difference between clusters.

The output of the algorithm looks like this. Typically, for the value segments we build, we start with nine segments. Eventually, you might get to hundreds.
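The slide formula is not captured in the transcript. One common way to write a pairwise Condorcet-style objective that matches the verbal description is sketched below. The symbols are assumptions introduced here: x_{ik} is the value of variable k for record i, K is the number of variables, and y_{ij} is 1 when records i and j are placed in the same cluster and 0 otherwise. The patented implementation may differ in detail.

```latex
s_{ij} = \sum_{k=1}^{K} \mathbf{1}\left[x_{ik} = x_{jk}\right],
\qquad
d_{ij} = K - s_{ij}

C(P) = \sum_{i<j} \Big[\, y_{ij}\,(s_{ij} - d_{ij}) + (1 - y_{ij})\,(d_{ij} - s_{ij}) \,\Big]
```

Here s_{ij} counts the variables on which two records agree and d_{ij} the variables on which they differ. A partition P scores well when same-cluster pairs agree on most variables and different-cluster pairs disagree on most variables.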

Reading Cluster Outputs and Forming Tiered Segments

The percentage on the side shows the size of the cluster, and then the features that are in the cluster. So you can see in the very top one, department count.

I like this visualization because the gray bars in the background are the distribution of that variable for all the data, whereas the red bars are the distribution of that variable for that particular cluster. And the features are ordered by chi-square importance to the cluster. So it shows you the most important features in each cluster, the defining attributes.

So you can very quickly see and interpret this. Obviously, if the red bars are to the right, then that variable is larger in this cluster than others. So it's a good visualization to quickly see that. Here's the numbers behind that.

This is a real retail example. So we see that, in this case, Cluster 7 has 9.8% of customers responsible for $87 million. That represents 32% of sales. If you look at the index, Cluster 7 has 3.3 times more sales than the average, 2.7 times more transactions, and 3.1 times more gross margin.
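The index quoted here is just a ratio of shares, which is easy to verify. The snippet below reproduces the sales index from the figures in the talk; the function name `value_index` is an illustrative choice, and the implied total is derived arithmetic rather than a number the speaker gave.

```python
def value_index(share_of_metric, share_of_customers):
    """How many times the average customer this cluster is worth on a metric."""
    return share_of_metric / share_of_customers


# Figures quoted in the talk for Cluster 7.
customer_share = 0.098   # 9.8% of customers
sales_share = 0.32       # 32% of sales
cluster_sales_musd = 87  # $87 million

sales_index = value_index(sales_share, customer_share)
implied_total_sales_musd = cluster_sales_musd / sales_share

print(round(sales_index, 1))            # 3.3
print(round(implied_total_sales_musd))  # 272
```

The same ratio applied to transaction share or gross margin share gives the other two indexes.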

And when you look at the nine clusters, they typically clump together into three or four groups. In this case, we can see there's a big difference between platinum and gold, then another jump between gold and silver, then another jump between silver and bronze, and then another jump down to clay. So we kind of create a hierarchical segmentation. You can profile that on all the features that you have available.

And then you want to start understanding the value segments. My diamond customers are my best customers, and I don't want to lose them, so we want to work on segment migration. Those dotted lines from diamond and ruby going down are migrations we don't want to happen. We want to find new customers who have the potential to become diamond and ruby, and we want to identify them early. We want to get ruby, opal, and pearl customers moving up to increase value. So those are the migrations that we care about, and those are the marketing programs that we'll execute. We'll have retention programs against diamond and ruby, upsell programs against pearl, opal, and ruby, and promotions to identify valuable customers in the new customer base. So the value segment kind of defines the strategy of execution.

From Value to Action: Identifying Behavioral and Product Gaps

The next thing to do is behavior: trying to find which behaviors contribute to value. In retail this is typically transaction frequency. It's the kind of marketing opportunity matrix that Fred Reichheld talks about, if you've ever read his loyalty book. It's identifying the behaviors that drive value.

Behavior Segmentation Example: Transaction Frequency

Normally in retail and grocery we look at transaction frequency. Customers that shop every single week in a particular category are your primary customers. If you're shopping every other week, that would be a secondary customer. If you're shopping maybe every third or fourth week, that's an occasional customer. And a non-shopper doesn't shop at all in a quarter.

So we can define these just by looking at the data. We want the primary customers to be that 15 percent, the secondary to be the middle third, and the occasional customers to be the bottom 50 percent of customers. This is somewhat arbitrary; it's not a machine learning exercise, it's just profiling the data, and this is how we define segments.

So for example, grocery customers who shop more than 15 times in a quarter are your primary grocery shoppers. The secondary ones shop 5 to 14 times, and the occasional shoppers shop 1 to 4 times. So now you have behaviour; you can plot that on a graph and see the primary, secondary, and occasional definitions.
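Because these segments come from profiling rather than machine learning, they reduce to a few threshold rules. The sketch below uses the grocery cut-offs quoted above. The talk says "more than 15 times" for primary and "5 to 14" for secondary, so treating exactly 15 visits as primary is an assumption, as is the function name.

```python
def frequency_segment(visits_per_quarter):
    """Assign a behaviour segment from quarterly visits in one category."""
    if visits_per_quarter >= 15:   # assumption: 15 itself counts as primary
        return "primary"
    if visits_per_quarter >= 5:
        return "secondary"
    if visits_per_quarter >= 1:
        return "occasional"
    return "non-shopper"


for visits in (18, 9, 2, 0):
    print(visits, frequency_segment(visits))
```

In practice the cut-offs would be tuned per category so that the segments land near the target 15 percent, middle third, and bottom 50 percent split.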

You can see the average number of transactions for each of those primary, secondary, and occasional segments, and the difference in sales. Very interestingly, secondary customers, even though they only shop half or a third as much as the primary customers, spend almost the same amount. So the difference between secondary and primary customers in this case is not about upselling more products; it's about getting them to come more frequently. Doing a bit of profiling, you see that the gap is not always sales or items; in some cases it's transaction frequency.

We want to identify what products they're buying, what combinations of products they buy, and which departments they're buying in. Then we can start to target, say, customers who are secondary in grocery. How do we get them to come more frequently? What content can we create for them? What type of content will they respond to? What are their other demographic characteristics? Then we can target them and execute.

Affinities and Department-Level Gap Analysis

We can build product clusters and look at association rules and affinities. This is just an affinity output; I won't go through that. Then we lay value against products. On the vertical side you can see the value segments, top to bottom, so we can see that the diamond customers are primary customers in most of the departments.

Not surprising, your best customers buy everything, right? And we can see all the little product gaps between every layer.

So if you want to move an opal, fourth-tier customer to be a ruby, third-tier customer, you need more of those greens, where occasional customers become secondary and secondary customers become primary.
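The value-by-department grid can be built by crossing each customer's value segment with their frequency tier in every department. The sketch below is a minimal illustration: the record layout, the department names, the customers, and the reuse of the grocery frequency cut-offs for every department are all assumptions.

```python
from collections import defaultdict


def frequency_tier(visits_per_quarter):
    """Frequency tier using the grocery cut-offs from the talk (assumed for all departments)."""
    if visits_per_quarter >= 15:
        return "primary"
    if visits_per_quarter >= 5:
        return "secondary"
    if visits_per_quarter >= 1:
        return "occasional"
    return "non-shopper"


def gap_cells(customers, departments):
    """Map (value_segment, department, frequency_tier) -> list of customer ids."""
    cells = defaultdict(list)
    for customer in customers:
        for department in departments:
            visits = customer["visits"].get(department, 0)
            key = (customer["value_segment"], department, frequency_tier(visits))
            cells[key].append(customer["id"])
    return cells


# Hypothetical customers, not data from the talk.
customers = [
    {"id": "c1", "value_segment": "diamond", "visits": {"grocery": 18, "pharmacy": 16}},
    {"id": "c2", "value_segment": "ruby", "visits": {"grocery": 15, "pharmacy": 6}},
    {"id": "c3", "value_segment": "opal", "visits": {"grocery": 7, "pharmacy": 2}},
    {"id": "c4", "value_segment": "opal", "visits": {"grocery": 9}},
]
cells = gap_cells(customers, ["grocery", "pharmacy"])
print(cells[("opal", "grocery", "secondary")])  # opal customers to move up in grocery
```

Each cell is a list of known customers with a known gap, which is what makes the cell directly targetable.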

So this identifies all the little micro gaps. Behind every cell is a number of customers. You know who they are, and you know what gap you now want to close, whether it's frequency or product. Now we ask, who are those customers, so we can go create that content. We've now defined our strategies: retention, frequency, win-back, incentives, with upselling and cross-selling as secondary strategies. The number of customers is in the last column.

Budget Allocation and Channel Optimization

Now we need to figure out how many campaigns per month we're executing and what actual budget we're going to allocate to each customer.

Yield Curves: Estimating Returns vs. Spend

This is where we look at historical data and build yield curves. We built this segmentation on historical data, so we can now look at the historical marketing we've spent on each customer and what the return has been. In a yield curve, the vertical axis is the return you get and the horizontal axis is the marketing or media spend. We can see a yield curve for every customer and every segment, and we can start to understand the characteristics. What have we spent on social media? Have we done traditional print? Have we done display ads?

Have we done search ads? We can look at what we've historically spent on each segment. Here's a real example.

This is for a client, a sporting goods retailer doing marketing on social media. You can see the yield curves on all the different social media channels they would spend on. They'd spend probably 20 million dollars a year on social media advertising. We looked at the yield, the cumulative cost versus the cumulative weekly sales, and we can now start to optimize that spend. We can see the total net impact: some of the channels had a negative impact on total weekly sales, and some had a positive impact. So now we can start to run an optimization to say, how should I reallocate my spend? Obviously, I should spend less on the negative ones.
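A yield curve built from cumulative cost and cumulative sales can be read through its slope: the return on each additional dollar. The sketch below computes that marginal yield and the spend level beyond which it falls under a chosen hurdle. The channel numbers, the function names, and the default hurdle of 1.0 are hypothetical choices for illustration.

```python
def marginal_yields(cumulative_spend, cumulative_return):
    """Return per extra dollar between consecutive points on a yield curve."""
    yields = []
    for k in range(1, len(cumulative_spend)):
        extra_spend = cumulative_spend[k] - cumulative_spend[k - 1]
        extra_return = cumulative_return[k] - cumulative_return[k - 1]
        yields.append(extra_return / extra_spend)
    return yields


def spend_worth_keeping(cumulative_spend, cumulative_return, hurdle=1.0):
    """Largest spend level reached before the marginal yield falls below the hurdle."""
    keep = cumulative_spend[0]
    yields = marginal_yields(cumulative_spend, cumulative_return)
    for k, marginal in enumerate(yields, start=1):
        if marginal < hurdle:
            break
        keep = cumulative_spend[k]
    return keep


# Hypothetical channels (thousands of dollars), not the client's data.
spend_a, sales_a = [0, 100, 200, 300, 400], [0, 400, 650, 750, 700]
spend_b, sales_b = [0, 100, 200], [0, -20, -50]

print(marginal_yields(spend_a, sales_a))      # [4.0, 2.5, 1.0, -0.5]
print(spend_worth_keeping(spend_a, sales_a))  # 300
print(spend_worth_keeping(spend_b, sales_b))  # 0
```

Channel A pays back until its curve turns over, while channel B never clears the hurdle, which is the "spend less on the negative ones" case.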

Reallocating Spend with Optimization and Constraints

Here are all the different value features I'm looking at and all the different social channels, and we can see which channels did well and which did poorly. Qualitatively, I can say spend less on the crappy channels and more on the good channels. Then we can run an optimization: set a minimum constraint of $150,000 per channel and set a maximum per channel. They were spending about 25 million in the past, and we were able to reduce the spend to 22 million and maximize the results.

Execution: Using GenAI After the Targeting Logic

So now we've identified where the value is and what the gaps are in frequency, product, spend, and behavior. We know demographically who they are. If you've done any surveys or focus groups, we know psychographically their attitudes and interests. Then we can go over to a tool like studio verse if we want to create videos, or customized emails, using GenAI. This is where we hit the why. Why do we think this customer will buy this product they're not buying today? How do we get them to come two or three more times a month? Are they price conscious? Do they value brand? Are they high-quality shoppers? Do they want to be recognized? Do they want special perks outside of my store? Can I offer all of these things? These are all the levers: understanding who the customer is and what their demographics and characteristics are.

That will guide the GenAI to create meaningful content to execute on what the math and science surfaced, which was product gaps, frequency gaps, and spend levels, so we can maximize future potential.

And so we've done this with customers and successfully grown top-line sales by more than 5% in grocery, more than doubling the net income of large grocers.

Measurement: Migration Scorecards and Quarterly Refresh

So the final step is to build a migration scorecard. Your original segments are on the vertical axis: platinum, gold, silver, all the segments we created.

We want to give marketing three or six months to work. We don't re-segment the data every day; it's too noisy. You want to see whether your marketing activity is working, so you compare against the new segments three months later and ask how those migrations happened. Green means they stayed the same. Light blue means that they moved up one segment. Dark blue, they moved up two segments.

Yellow, they went down one. Red, they went down two.
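The scorecard is a cross-tabulation of each customer's segment at the start of the period against their segment at the end. A minimal sketch follows; the tier order, the "left" bucket for customers missing from the later snapshot, and the sample customers are assumptions for illustration.

```python
from collections import Counter

TIERS = ["platinum", "gold", "silver", "bronze", "clay"]  # best to worst


def migration_scorecard(before, after):
    """Count moves per starting tier: +1 is up one tier, -1 is down one tier."""
    rank = {tier: position for position, tier in enumerate(TIERS)}
    card = {tier: Counter() for tier in TIERS}
    for customer_id, old_tier in before.items():
        new_tier = after.get(customer_id)
        if new_tier is None:
            card[old_tier]["left"] += 1  # no longer in the later snapshot
        else:
            card[old_tier][rank[old_tier] - rank[new_tier]] += 1
    return card


def retention_rate(card, tier):
    """Fraction of a starting tier that stayed in the same tier."""
    total = sum(card[tier].values())
    return card[tier][0] / total if total else 0.0


# Hypothetical snapshots three months apart.
before = {"a": "platinum", "b": "platinum", "c": "platinum", "d": "platinum",
          "e": "gold", "f": "silver"}
after = {"a": "platinum", "b": "platinum", "c": "platinum", "d": "gold",
         "e": "platinum"}

card = migration_scorecard(before, after)
print(retention_rate(card, "platinum"))  # 0.75
```

The colour coding on the slide maps onto the move counts: 0 is green, +1 light blue, +2 dark blue, -1 yellow, and -2 red.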

So it's really exciting that I was able to retain 76% of my platinum customers over two consecutive three-month periods. So now I know that next quarter, I want to do better.

I want to set the target to 80%. And I want to see how effective my campaigns were, so I do campaign promotional analysis.

I see I lost a number of customers. It looks like probably 17% dropped down one segment, from platinum to gold, so I want to understand why that happened. And then the remaining 10 or so percent actually dropped down more than one segment. So this is a scorecard of how well my marketing is doing. I would refresh this every quarter, reset my objectives, and do this over and over again.

Conclusion: Automating the Loop for Ongoing Results

Set this up all automated. You build these, you build the workflows, and you can wrap an agentic framework around this and execute marketing campaigns and content creation every day and generate incredible financial results.

So that's a high-speed walkthrough, 33 slides. I'm happy to answer any questions.

Q&A Highlights

So the Maple Leafs are looking for a data expert for their general manager. Are you open to that job? Because I think it's fantastic.

Take me to the test. Yeah, yeah, above the Leafs.

GenAI vs. Decision-Making AI (Reinforcement Learning)

Yeah, I mean, the data analysis is something that's not talked about. I think GenAI is talked about a little bit too much. I think GenAI will not do the hard math and science that traditional machine learning does. I'm a big believer in reinforcement learning. Reinforcement learning is the future of AI.

This is where decisions come in. GenAI doesn't make decisions. It's good at generating content and plausible answers, but it's not a great decision-making framework.

Whereas reinforcement learning or control theory is what NASA does. What these autonomous drones and fighter aircraft do, those are autonomous control systems using reinforcement learning, and that's, I think, the real branch and the science behind some of what I went through here.

Any other questions? Yeah, sure.

How to Tell Whether Clusters Are Meaningful

I don't do much unsupervised learning, but with this Condorcet clustering method, when we work on a new data set, how do we know that the clusters we come up with are meaningful in terms of representing cohorts?

Yeah, well, the cluster index, the Condorcet value, is a value between zero and one. For the global clustering, one would be a perfect clustering, where within each cluster every record has the same value on every variable, and between clusters every variable value is different.

But data is never that perfect. So normally a Condorcet value for the global clustering in the range of 0.6 to 0.8 is a good clustering. It also scores individual clusters, so you can see which clusters are more distinct than others. And you can look at individual features and individual records to see how much they contribute to the clustering as well, so you can really diagnose the clustering. The algorithm also gives you a second-choice cluster: if I hadn't put the record in this cluster, I would have put it in this other one. So it allows you to really diagnose whether the clustering is doing well.

In my experience, Condorcet clustering has been the best algorithm. K-means is terrible, and k-means plus neural clustering is not very good. This Condorcet one has been very successful for me personally.
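To make the zero-to-one score concrete, here is one plausible normalisation. It is not necessarily the exact formula in the implementation the speaker used. It counts the fraction of pairwise variable comparisons that are consistent with the clustering: agreeing inside a cluster, differing across clusters. The function names and records are hypothetical, and the brute-force pairwise loop is for illustration only; it is quadratic, unlike the linear algorithm described in the talk.

```python
from itertools import combinations


def clustering_quality(records, labels):
    """Fraction of pairwise variable comparisons consistent with the clustering."""
    consistent = 0
    comparisons = 0
    for i, j in combinations(range(len(records)), 2):
        same_cluster = labels[i] == labels[j]
        for a, b in zip(records[i], records[j]):
            comparisons += 1
            if (a == b) == same_cluster:
                consistent += 1
    return consistent / comparisons


def cluster_cohesion(records, labels, cluster):
    """Fraction of within-cluster variable comparisons that agree."""
    members = [r for r, label in zip(records, labels) if label == cluster]
    agree = 0
    total = 0
    for first, second in combinations(members, 2):
        for a, b in zip(first, second):
            total += 1
            agree += a == b
    return agree / total if total else 1.0


# Hypothetical categorical records: (spend band, visit cadence, spend trend).
records = [
    ("high", "weekly", "rising"),
    ("high", "weekly", "flat"),
    ("low", "monthly", "falling"),
    ("low", "monthly", "flat"),
]
labels = [0, 0, 1, 1]
perfect = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")]

print(round(clustering_quality(records, labels), 2))
print(clustering_quality(perfect, labels))
```

A perfect clustering scores 1.0 under this definition, and the per-cluster cohesion gives a rough analogue of the individual cluster scores mentioned in the answer.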

Yeah, you want to see that difference. If you recall that slide where the platinum segment was three times more than the average, there was nice grouping. On this one here, you can see that the value is monotonic from top to bottom. There are big jumps between groupings, like three down to 1.7 and 1.69. That's why I call those two gold, because they're close together.

Then there's another big jump to the next group, which is 1.04, silver; that was on its own. Then a big jump here to bronze, which is around 0.5 to 0.75 on the index, and then the bottom, clay. So you can see it's monotonic.

You might see these big jumps, that's telling you there's a big difference between the customers.

I've always seen with this clustering algorithm and customer value data that it ends up looking like this. We typically, as I said, start out with nine segments and then aggregate them up to four or five, like platinum, gold, silver, bronze, clay. If it looks like that, that's a good result, and you can then start to do gap analysis.
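Rolling nine clusters up into named tiers can be approximated by cutting wherever the index drops most sharply. The sketch below splits at the largest ratio jumps between neighbouring clusters. The index values loosely follow the ones quoted in this answer, but the middle bronze value and both clay values are made up, and splitting by ratio is a design assumption, not the speaker's stated rule.

```python
def group_by_largest_jumps(indexes, n_groups):
    """Split descending index values into n_groups at the biggest ratio drops.

    Assumes all index values are positive.
    """
    ordered = sorted(indexes, reverse=True)
    jumps = [(ordered[k] / ordered[k + 1], k) for k in range(len(ordered) - 1)]
    largest = sorted(jumps, reverse=True)[: n_groups - 1]
    cut_after = sorted(k for _, k in largest)
    groups = []
    start = 0
    for k in cut_after:
        groups.append(ordered[start : k + 1])
        start = k + 1
    groups.append(ordered[start:])
    return groups


cluster_indexes = [3.3, 1.7, 1.69, 1.04, 0.75, 0.6, 0.5, 0.2, 0.15]
tiers = group_by_largest_jumps(cluster_indexes, 5)
for name, members in zip(["platinum", "gold", "silver", "bronze", "clay"], tiers):
    print(name, members)
```

With these values the two clusters at 1.7 and 1.69 land together as gold and 1.04 stands alone as silver, matching the grouping described above.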

You say, well, what's the difference between platinum and gold? Is it frequency? Is it product? Is it size of average transaction?

In grocery, it's the more mouths you have in the household. If platinum customers are all four-person households and you have only one-person households in gold, then it's impossible to tell them to have more children to make them a better customer, right? You can't change the demographics, but you can alter behavior.

Counterintuitive Findings from Segmentation Work

I was curious if there were any interesting results in this analysis that gave you courses of action that were counterintuitive or non-obvious?

Yeah, I mean, one of the things we did: we worked for an office supplies company. We found that when you're selling laptops and technology, it's not the actual laptop or device that drives the interest in the purchase, because the margin on those is so low and every retailer has roughly the same price. It's the accessories.

We found that whoever has the best bundle of accessories drives laptop sales. That was very counterintuitive and opposite to what the retailer was doing. They were putting laptops on their flyers and on their website, promoting the prices. But if you shop between Staples and Best Buy and all the vendors, the prices aren't that different. When you looked at the accessories, they were wildly different. So we helped one of the retailers we worked with to market the accessory packages, and that really increased the overall sales of the laptops.

Now, they weren't making as much money on the devices themselves, but by selling more of them they would get better deals from the vendors of those devices. So they may be first in line to get the next version of the iPhone or the next laptop that came out.

Any other questions?

Data Size Considerations and Practical Limits

Is there a data set that's too small?

No. I mean, the largest data sets I've worked with are like Walmart and Albertsons in the U.S. They're like an $80 billion company. So I've worked with like 200 terabytes of transaction data.

It's the receipts data that's behind all this. Every little paper receipt, every item purchased on the paper receipt, is a record that goes into the database. So every line on the receipt is a record in the database. Now, you don't need that much. It's more about how many customer records and how many transactions you have; it doesn't have to be 200 terabytes. Ideally, in a product type of situation like this case, these are product businesses that have thousands of products. But say you had dozens of products. I worked for a yarn manufacturer which had about 70 million in sales. They had maybe a thousand products and pretty small numbers; their data sets were quite small. So it works as long as you have enough levers to pull. If you have one product and ten customers, okay, there aren't many levers to pull.

If you're selling Goodyear tires, you buy tires once every five years; it's not a weekly marketing campaign to sell tires. Canadian Tire, every winter, they just remind you, hey, it's time for winter tires. So that's a different game. I don't know what business you're in, but if you have a number of products and a number of customers, and your average transaction size is three to five items, then there are enough levers to pull for this kind of stuff. But you can apply this kind of thinking regardless; the 80/20 thinking is just a good way to look at marketing spend.

Optimization Techniques Used in Practice

What kind of optimization technique did you use? Was it convex optimization, linear programming, or something different?

Yeah, I mean, some problems are linear, so you can do linear programming. Some are non-linear, so you have to use non-linear approaches. We tend to use biomimicry methods like genetic algorithms and particle swarms, because they're highly parallelizable.

I can run one population member on one GPU core, so I can have millions and millions of population members. Our hardware was GPU hardware in the cloud, although we didn't need tons of cards. One or two GPU servers was enough to even service Walmart.
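For readers who have not used these methods, here is a deliberately small genetic-algorithm sketch for reallocating a channel budget under per-channel minimum and maximum constraints. It is a single-threaded illustration, not the speaker's GPU pipeline. The square-root yield curves, the budget figures, the GA settings, and all function names are assumptions; only the 150 (thousand dollars) per-channel minimum echoes a number from the talk.

```python
import math
import random


def net_return(yield_fns, allocation):
    """Total modelled return minus total spend."""
    return sum(f(s) for f, s in zip(yield_fns, allocation)) - sum(allocation)


def evolve_allocation(yield_fns, current, total_budget, min_spend, max_spend,
                      generations=200, pop_size=40, seed=0):
    """Search for a better allocation; assumes total_budget >= n * min_spend."""
    rng = random.Random(seed)
    n = len(yield_fns)

    def repair(alloc):
        # Clamp each channel, then shrink spend above the minimum if over budget.
        alloc = [min(max(s, min_spend), max_spend) for s in alloc]
        excess = sum(alloc) - total_budget
        if excess > 0:
            slack = sum(s - min_spend for s in alloc)
            scale = 1 - excess / slack
            alloc = [min_spend + (s - min_spend) * scale for s in alloc]
        return alloc

    def fitness(alloc):
        return net_return(yield_fns, alloc)

    # Start from the current allocation plus random feasible candidates.
    population = [repair(list(current))]
    while len(population) < pop_size:
        population.append(repair([rng.uniform(min_spend, max_spend) for _ in range(n)]))

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # elitism: the best always survive
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(p1, p2)]  # uniform crossover
            k = rng.randrange(n)
            child[k] += rng.gauss(0, 0.05 * (max_spend - min_spend))  # mutation
            children.append(repair(child))
        population = parents + children
    return max(population, key=fitness)


def sqrt_curve(coefficient):
    """Hypothetical concave yield curve: return = coefficient * sqrt(spend)."""
    return lambda spend: coefficient * math.sqrt(spend)


yield_fns = [sqrt_curve(60), sqrt_curve(40), sqrt_curve(10)]
current = [700, 700, 600]  # thousands of dollars, hypothetical
best = evolve_allocation(yield_fns, current, total_budget=2000,
                         min_spend=150, max_spend=1000)
print([round(s) for s in best], round(net_return(yield_fns, best)))
```

Because the objective is return net of spend and the budget is a ceiling rather than a target, the search is free to spend less than the current total, which mirrors the outcome described earlier where total spend went down while results improved.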

We never spent more than a thousand dollars a month on compute, even for a Walmart that had 200 terabytes of data. We spent more on storage than on the mathematical compute.

And what we did was make one yield segment T-bills. Once your algorithm starts investing in T-bills, you've hit the limit of your marketing budget. All these yield curves typically grow, then turn and fall off; there's an inflection point where, beyond some spend, they drop. So we put in a T-bill segment, and that would tell us the total overall marketing spend. For these problems we usually used biomimicry methods by default, even if the problem was linear, just because we had the pipeline set up. It didn't matter; non-linear or heuristic algorithms will work on linear data as well.

Evaluating Campaign Impact with Year-over-Year Controls

Sure. We were typically doing year over year. Our goal was to say, how do we make this week more profitable than the same week a year ago? Because I have every single receipt, I know what I recommended in this flyer. Let's say you have a flyer or a promotion with a hundred items. I know the hundred items I recommended, and I can see every transaction that contained at least one of those items. I know how big those transactions were, what the sales of those transactions were, what the profit was, and how many items were in them. Then I say, let's look at last year's 100 items for the same week last year, say the week before Christmas. How many transactions did they drive? What size were those transactions? What was the profit of those transactions? So I'm comparing a year-over-year, same-store transaction set and saying, look, this year's transactions were more profitable than last year's.

That's the control cell. In retail it's really year-over-year, same-store, entire-company transactions. We're looking at the core marketing channels, like the flyer for grocery or the website, so that's the largest impact. The other minor impacts are roughly the same all the time, so you don't have to correct for everything. We found that this worked, and we were able to drive five percent total company revenue in grocery, which doubled net income. For Walmart, we were helping them drive hundreds of millions to a billion dollars a year in incremental sales.

Could you use an association rule, where you find similar baskets and see how they do?

Yeah, well, we just look at, say, milk. If milk was one of the items we promoted, we look at all the transactions that had milk in them, and then we compare. Last year you had a different brand of milk. Well, this brand of milk did better, because this year it had 10,000 transactions and last year it only had 900 transactions, and this year's 10,000 transactions were worth 100,000 while last year's transactions were only worth 90,000. So there's an incremental 10,000. You measure it against the year-over-year difference for a same-store group. For 100 stores, you look at the same hundred stores year over year. Most retailers look at the same week year over year, and you holiday adjust. Easter moves, and Ramadan moves every year, so you'd compare the week before that holiday. It wouldn't be exactly the same calendar week, but it would be the same conceptual consumer week.
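The year-over-year comparison described here can be sketched directly from receipt data. The example below assumes both transaction lists have already been restricted to the same stores and the same holiday-adjusted consumer week. The record layout, item names, and amounts are hypothetical.

```python
def promoted_baskets(transactions, promoted_items):
    """Transactions that contain at least one promoted item."""
    return [t for t in transactions if promoted_items & set(t["items"])]


def year_over_year_lift(this_year, this_year_items, last_year, last_year_items):
    """Incremental transactions, sales, and profit versus last year's promotion."""
    current = promoted_baskets(this_year, this_year_items)
    prior = promoted_baskets(last_year, last_year_items)
    return {
        "transactions": len(current) - len(prior),
        "sales": sum(t["sales"] for t in current) - sum(t["sales"] for t in prior),
        "profit": sum(t["profit"] for t in current) - sum(t["profit"] for t in prior),
    }


# Hypothetical receipts for the same store group and the same consumer week.
this_year = [
    {"items": ["milk_brand_a", "bread"], "sales": 12.0, "profit": 3.0},
    {"items": ["eggs"], "sales": 5.0, "profit": 1.0},
    {"items": ["milk_brand_a"], "sales": 4.0, "profit": 1.0},
]
last_year = [
    {"items": ["milk_brand_b", "cereal"], "sales": 9.0, "profit": 2.0},
    {"items": ["bread"], "sales": 3.0, "profit": 0.5},
]

lift = year_over_year_lift(this_year, {"milk_brand_a"}, last_year, {"milk_brand_b"})
print(lift)
```

Whole baskets are compared rather than only the promoted item's line, following the speaker's point that the promoted item is valued by the transactions it drives.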
