The Status Quo: AI as Archaeology

First-Mile Podcast with Patrick Harrington, Head of AI & Machine Learning at MetaRouter

AI Doesn’t Start With Algorithms. It Starts With Data Flow.


Most AI systems today still operate in what Patrick Harrington describes as an archaeological model.

Data is collected at the edge, shipped downstream, analyzed hours or days later, and eventually used to retrain a model. Insights are surfaced after the fact, often in the form of dashboards or reports that explain what already happened.

That approach made sense ten or fifteen years ago. Infrastructure was slower, systems were batch-oriented, and digital experiences were relatively simple. Looking back was often good enough.

But archaeology has a fatal flaw: it is always backward-looking.

By the time you know what happened, the customer has already left, bought elsewhere, or moved on entirely. AI that only observes and reports cannot change outcomes; it can only narrate them after the opportunity has passed.

Why Real AI Happens in the First Mile

Patrick argues that the real inflection point for AI is not the sophistication of the model, but where learning occurs in the data flow.

Not downstream in warehouses.
Not after aggregation.
But in the moment, while intent is still forming.

This is what MetaRouter refers to as the first mile, the privileged position where events are generated, identity is resolved, consent is enforced, behaviour is observed in real time, and, crucially, actions can still be taken.

When AI operates here, something fundamental changes. Learning is no longer delayed. Inference is no longer guesswork. Intelligence stops being analytical and becomes operational.

Server-Side Data Is the Ground Truth

A major theme of the conversation is the difference between client-side noise and server-side truth.

Client-side data is increasingly compromised. It is polluted by bots, distorted by blockers, prone to latency, and often leaks valuable signals to third parties before the business itself can act on them.

Server-side data, by contrast, reflects what the system actually rendered and executed. It captures clean behavioural sequences, anchors events to durable identity, and moves fast enough to support real-time decision-making.

This is why MetaRouter sits server-side, not only for performance or privacy, but because AI cannot function without clean, timely signals. Garbage data doesn’t just break training; it breaks inference.
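To make the contrast concrete, here is a minimal, illustrative sketch (all names and fields are hypothetical, not MetaRouter's API) of what server-side event capture can look like: the event is emitted from the same code path that renders the page, so it records what was actually served, anchored to a durable identifier.

```python
# Hypothetical sketch of server-side event capture: the server records what it
# actually rendered, stamped with a durable first-party identifier, instead of
# relying on a client-side tag that bots and blockers can distort.
import time
import uuid
from dataclasses import dataclass, field


@dataclass
class ServerEvent:
    anonymous_id: str          # durable first-party identifier, not PII
    event_type: str            # e.g. "product_rendered", "add_to_cart"
    properties: dict = field(default_factory=dict)
    ts: float = field(default_factory=time.time)


def render_product_page(anonymous_id: str, sku: str, price: float) -> ServerEvent:
    """Build the page server-side and emit the ground-truth event from the same code path."""
    # ... assemble the HTML response here (omitted); the event below records what was served ...
    return ServerEvent(
        anonymous_id=anonymous_id,
        event_type="product_rendered",
        properties={"sku": sku, "price": price},
    )


event = render_product_page(anonymous_id=str(uuid.uuid4()), sku="SKU-123", price=49.99)
print(event)
```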

The Parallel Pipe: Where Learning Meets Action

One of the most important concepts Patrick introduces is what he calls the parallel pipe.

As events flow through MetaRouter, one path routes data to downstream systems such as CDPs, warehouses, and media platforms. In parallel, a second path feeds real-time computation clusters where learning and inference happen continuously.

Every interaction updates a living representation of intent: not a static score, but an evolving signal that can immediately inform action. Experiences can be adjusted, media spend suppressed or increased, offers modified, and even what the server renders next can be changed.
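As a rough illustration of the parallel-pipe idea, assuming nothing about MetaRouter's actual internals, the sketch below fans each event out to two concurrent paths: one delivers it downstream, the other updates a live intent state and can trigger an action.

```python
# Illustrative "parallel pipe" pattern: the same event feeds delivery and
# inference concurrently, so neither path blocks the other.
import asyncio


async def deliver_downstream(event: dict) -> None:
    """Path 1: forward the event to CDPs, warehouses, media platforms."""
    await asyncio.sleep(0)  # stand-in for an async delivery call
    print(f"delivered {event['type']} downstream")


async def update_intent(event: dict, intent_state: dict) -> None:
    """Path 2: update a live intent representation and decide on an action."""
    await asyncio.sleep(0)  # stand-in for model inference
    uid = event["anonymous_id"]
    intent_state[uid] = intent_state.get(uid, 0.0) + 0.1
    if intent_state[uid] > 0.5:
        print("action: adjust what the server renders next")


async def route(event: dict, intent_state: dict) -> None:
    # Both paths receive the same event at the same time.
    await asyncio.gather(deliver_downstream(event), update_intent(event, intent_state))


asyncio.run(route({"anonymous_id": "a1", "type": "page_view"}, {}))
```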

This is AI acting mid-flight, not after landing.

Why Identity Is Non-Negotiable

None of this works without identity.

Not personally identifiable information, and not fragile cookies, but durable first-party identifiers that separate one behavioural stream from another. Identity allows AI to distinguish individual journeys, track intent across sessions, and map behaviour to downstream activation without collapsing everything into averages.

Without identity, intelligence dissolves into aggregates. With it, AI can act surgically.
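A toy example (identifiers and events invented) of why this matters: keyed by a durable identifier, the same events become separable journeys; without the key, only aggregates remain.

```python
# Keying events by a durable first-party identifier keeps journeys distinct
# instead of collapsing everything into averages.
from collections import defaultdict

events = [
    {"anonymous_id": "u-1", "type": "view", "sku": "A"},
    {"anonymous_id": "u-2", "type": "view", "sku": "B"},
    {"anonymous_id": "u-1", "type": "add_to_cart", "sku": "A"},
]

journeys: dict[str, list[dict]] = defaultdict(list)
for e in events:
    journeys[e["anonymous_id"]].append(e)   # one behavioural stream per identity

# With identity: per-journey signals (u-1 is mid-funnel on SKU "A").
# Without identity: only aggregates ("3 events, 33% add-to-cart rate").
for anon_id, stream in journeys.items():
    print(anon_id, [ev["type"] for ev in stream])
```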

Bot Detection Is a Data Quality Problem, Not a Vendor Problem

One of the most eye-opening parts of the conversation is how Patrick reframes bot detection.

Most bot tools operate client-side and make a binary decision early in the journey: bot or not a bot. But modern bots no longer behave like obvious automation.

When behavior is observed server-side, over time and in sequence, bots reveal themselves naturally, not through a single signal, but through how their behavior diverges from genuine commercial intent.

The same behavioral model used to predict conversion can also detect bots, identify abuse, suppress wasted spend, and protect downstream systems. This is not a separate capability. It is a branch off the same trunk.
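A deliberately simplified sketch of that sequential framing, with invented features and thresholds: rather than a one-shot verdict, a running bot likelihood is revised with every server-side event.

```python
# Toy sketch: the bot score is updated per event instead of decided once at the
# start of the journey. Feature checks and thresholds are made up for illustration.
def update_bot_score(score: float, event: dict) -> float:
    """Nudge the running bot likelihood based on how this event looks."""
    if event.get("dwell_ms", 0) < 50:          # implausibly fast interaction
        score += 0.2
    if event.get("type") == "add_to_cart":     # genuine commercial intent looks human
        score -= 0.1
    return min(max(score, 0.0), 1.0)


score = 0.5  # start undecided, not with a binary verdict
for event in [{"type": "page_view", "dwell_ms": 10},
              {"type": "page_view", "dwell_ms": 12},
              {"type": "search", "dwell_ms": 8}]:
    score = update_bot_score(score, event)

print(f"bot likelihood after the sequence: {score:.2f}")
```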

The Oak Tree Model of AI

Patrick uses a powerful metaphor to explain how real AI compounds.

Think of AI as an oak tree. The trunk is a core behavioral embedding, a live, continuously updated understanding of intent. The branches are the use cases: conversion prediction, bot detection, brand affinity, SKU preference, lifetime value, return likelihood.

It is the same understanding answering different questions.

This is why AI should not be treated as a feature. It is an asset that compounds across the business.
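One way to picture the trunk-and-branches idea in code, as a schematic sketch rather than MetaRouter's actual architecture: a single behavioural embedding feeds several lightweight task heads, so each new use case is another branch, not another model built from scratch.

```python
# Schematic "trunk and branches": one behavioural embedding, many task heads.
import torch
import torch.nn as nn


class TrunkAndBranches(nn.Module):
    def __init__(self, n_features: int = 16, embed_dim: int = 32):
        super().__init__()
        # Trunk: a live behavioural embedding of the event sequence.
        self.trunk = nn.GRU(n_features, embed_dim, batch_first=True)
        # Branches: each use case is just another head on the same embedding.
        self.conversion_head = nn.Linear(embed_dim, 1)
        self.bot_head = nn.Linear(embed_dim, 1)
        self.ltv_head = nn.Linear(embed_dim, 1)

    def forward(self, events: torch.Tensor) -> dict[str, torch.Tensor]:
        _, h = self.trunk(events)              # h: (1, batch, embed_dim)
        embedding = h[-1]                      # shared behavioural representation
        return {
            "p_convert": torch.sigmoid(self.conversion_head(embedding)),
            "p_bot": torch.sigmoid(self.bot_head(embedding)),
            "expected_ltv": self.ltv_head(embedding),
        }


model = TrunkAndBranches()
batch = torch.randn(4, 10, 16)                 # 4 journeys, 10 events, 16 features each
print({k: v.shape for k, v in model(batch).items()})
```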

From Models to an AI “General Manager”

Perhaps the biggest shift Patrick describes is moving beyond the traditional model lifecycle of collect, train, deploy, and wait.

Instead, MetaRouter supports something closer to a real-time AI general manager: continuously learning, validating decisions against live traffic, comparing outcomes to benchmarks, acting cautiously, and promoting changes only when they are statistically better.

No archaeology.
No long delays.
No blind deployment.

Just intelligence that improves while the business is running.
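A minimal sketch of what such a promotion gate could look like, using a one-sided two-proportion z-test on made-up live-traffic numbers; the function, thresholds, and figures are illustrative assumptions, not MetaRouter's implementation.

```python
# Sketch of a promotion gate in the spirit of the "AI general manager": a
# challenger model only replaces the champion when its live conversion rate
# is statistically better.
from math import sqrt
from statistics import NormalDist


def should_promote(champ_conv: int, champ_n: int,
                   chall_conv: int, chall_n: int,
                   alpha: float = 0.05) -> bool:
    p1, p2 = champ_conv / champ_n, chall_conv / chall_n
    pooled = (champ_conv + chall_conv) / (champ_n + chall_n)
    se = sqrt(pooled * (1 - pooled) * (1 / champ_n + 1 / chall_n))
    if se == 0:
        return False
    z = (p2 - p1) / se
    p_value = 1 - NormalDist().cdf(z)          # one-sided: challenger > champion
    return p_value < alpha


# Example: challenger converts 3.6% vs the champion's 3.0% over live traffic.
print(should_promote(champ_conv=300, champ_n=10_000, chall_conv=360, chall_n=10_000))
```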

Why MetaRouter Chose the Hard Path

Patrick is clear: this is not the easy way to do AI.

Models are becoming commodities. Infrastructure is not.

The real moat is not algorithms, but owning the first-party, real-time data flow where learning happens and action is still possible. That data is scarce, defensible, and its value only increases as everything else commoditises.

The Takeaway

AI is not magic.
It is not prompts.
And it is not dashboards.

It is systems that learn where decisions matter.

The future belongs to organizations that learn in real time, act while intent is forming, treat data flow as strategic infrastructure, and build intelligence into the first mile.

That is where AI stops being hype, and starts becoming leverage.

Read the Full Transcript:

Dom Burch: So it's a very back-to-fundamentals podcast this time. What are we talking about? Why AI doesn't begin with algorithms at all. It begins with data flow. My guest is Patrick Harrington, Head of AI and Machine Learning at MetaRouter. Now, Patrick today sits at a fascinating nexus. He's not building AI demos, he's building intelligent infrastructure. So in this conversation, we're going to talk about what it really means to have a privileged position in the data flow, why bot detection is actually a data quality problem, and how AI becomes powerful only when it compounds across use cases. So if you're tired of AI hype and interested in where real capability is being built, this one's for you. Right, let's dive in. Patrick, welcome back to the First Mile podcast.

Patrick Harrington: Great to see you as always.

Dom Burch: Now I can't believe it actually was the end of October, the last time we were together on the podcast. How much has happened in the last couple of months? But let's frame it, right? Let's frame this AI without hype. There's an enormous amount of noise around AI right now. We've just come off the back of CES. People are going to be at NRF next week. So, from your perspective, what do you think most people misunderstand about where AI capability actually comes from?

Patrick Harrington: Yeah, we're what, three and a half, four years into the post-ChatGPT world, right? And with that, there's been consumer adoption percolating through, call it the early adoption curve, into practitioners, into maybe folks who are late to the stage. It's becoming much more ubiquitous. And then you have the corporate uptake, where there are more stringent policies around adoption, data sovereignty, data governance: where's my data? Is it being sent to models? Are the models coming to my data and my environment? And, you know, corporate leaders and boards have begun freeing up and diverting funds to invest in the adoption of these techniques, but also starting to see how it affects how work is done. And you alluded to the past few months, the calendar Q4, so to speak, where a lot of corporations are wrapping up their fiscal year. And on a go-forward basis, the planning is done, the ship is pointed at a point on the horizon, budgets are being deployed, and now work is beginning post-holidays, start of the new year, and so forth. And so, since October, what's really changed is, you know, the continual arms race of the Geminis of the world, ChatGPT, the Anthropics, which really comes down to productivity amplification on an employee basis. Many of MetaRouter's customers, spanning retail, retail media networks, commerce media, and just general sites who are operating digital properties to interact with their end customers, call it consumers, are trying to figure out not only how to have LLMs, but how do you develop a more dynamic experience that is learning and updating on the fly towards desired commercial outcomes? This is not something you can do in batch. It is not something that you can do offline. You need to be in the moment, in the conversation, not reading the mail after the postman has dropped it off and it was sent two weeks ago across the Atlantic between you and me. But in the moment, and only in the moment, can you impact the experience over millions of concurrent, call it consumers, on your site. How do you speak to them like you would if they were your only customer and you knew them perfectly? So that then brings us to the flow of data, the first mile, and the type of AI and machine learning capabilities that are utilized for understanding that intent in real time, personalized to each session, each journey, across any of the digital modalities. How does that journey evolve towards getting closer to, or further from, the desired commercial intent? And then how can you take the appropriate action to adjust that intent trajectory towards what you want? So that comes down to, I think, the conversation we're going to have around the data flow and why that is becoming mission-critical in this new era of AI, where algorithms are everywhere and it's easier than ever to build. When you net that out, where your data sits, the latency, and how you can intervene become pretty interesting ingredients to the dish.

Dom Burch: So just before we get into the data flow, let's just describe the status quo at the moment, right? Which, as we discussed a few weeks ago, is that the way people analyze data at the moment is a little bit like archaeology. It's already old, it's already buried over there, and you have to figuratively pick through the bones to figure out what happened in a time that's gone before to then try and predict the future, right? And there is still some science in that. There's still some clever marketing and intuition and e-commerce practice, right, that means you can do a pretty good job. What we're talking about, though, is observing the flow of signals in real time and then acting on that, using prediction based on, effectively, an LLM looking at all that information. It just might help people wrap their heads around why this is so different to what we already do. That seems to work all right, doesn't it?

Patrick Harrington: When you take that archaeology analogy to, call it, more batch, offline systems, recommendation systems that have been built by the classical SaaS internet public-company players of the past 15-ish years: they were built in an era when the distributed systems, the technology of moving data around and intervening, was more limited. It was a simpler time: classic websites only, LLMs were barely a thing. AI modalities kind of fell under the guise of chatbots, if that, at the time, and they were not that great. And that Rubicon has been crossed, a transition point of not only new classes of models, but the commoditization of models. Many of these models now are converging, they're very similar, they're widely available and approachable for practitioners, especially with the compounding effects that you get from these AI developer tools that are, you know, evangelizing knowledge. It's easier to build a model, and therefore it's not necessarily a differentiator. The differentiator becomes the data and the speed of that data to then derive value. And so the archaeology example is like the rear-view mirror: trying to reconstruct, you know, the trajectory of what happened. You can't necessarily act on what's in front of you. You're looking backwards. And so I think there's a good analog there, like you alluded to. And a lot of the capabilities that have been built to power LLMs are perfect for understanding a sequential consumer journey as they interact with a brand, a retailer, someone selling goods and services: to understand that sequence of events, like a sequence of words, and to understand not what sentence should come next, but how this affects the likelihood to convert or to purchase a particular SKU in a large catalog. And then what do you do with that knowledge to intervene and affect that experience while the plane is mid-flight, not when it's landed? When it's mid-flight.

Dom Burch: Now, you've said AI doesn't start with models, it starts with data flow, right? So what do you mean by that in practical terms? And what changes when you sit server-side instead of client-side?

Patrick Harrington: Yeah, in this era of the litany of digital properties that occupy the internet, the website, so to speak, there's the client side, the front end, and the back end. The client side is effectively what the consumer is interacting with, or what a bot would see when interacting with the digital property. But you're limited. There's a lot of noise that gets thrown into that interaction profile and the subsequent data being thrown off from it. The server side is really the back end. It's really the ground truth: what the actual software of a retailer is rendering to the client. And that ground truth is where MetaRouter plays ball. It allows you to get rid of the bloat, the classic third-party tags leaking behavioral data to Facebook, to Twitter, to TikTok, this and that, which slows your website. But more importantly, as it pertains to this conversation, that is a very clean, very fast-moving data set anchored by identity. And those are critical ingredients not only to train machine learning models on the fly, we'll call that continuous training, but for what's called in the industry, as it becomes more ubiquitous in the popular press, inference. Inference is when you query a model. And garbage in, garbage out doesn't only apply to training these models, whether it be an offline or an online training paradigm; it's also critically important for inference. And that is where you can update in real time, if you have high-quality data, what is going on. How does that commercial intent, that likelihood to convert, update itself in real time and still be accurate? And so that first mile, that flow of data, the quality of the data on the server side, is really where MetaRouter plays ball, in what we have coined the first mile, because it's upstream of where classical data assets sit, whether it be a Databricks, which is downstream, last mile, a Snowflake, last mile, a Salesforce Commerce Cloud, you name it. A lot of those are either offline or pseudo-online; they're still secondary to the first mile. And the first mile is where you can also then take action to affect the site experience in sub-10 milliseconds. And what we know about anyone who's been operating a website, e-commerce, retail: time is of the essence, to use that phrase. Every uptick of 50, 100 milliseconds leads to a degradation of observe, report, and then intervene, because someone's already moved to the next page or their intent has shifted. And you're trying to triangulate a moving target, versus, in the moment, tracking it correctly, perfectly, and then intervening with that more accurate perspective. In aggregate, over your millions of concurrent visitors every day, 365 days a year, that is where incremental growth happens. And that is what we are doing with our MetaRouter first-mile AI thesis: using it to support moving commercial outcomes in the moment. Not observe and report, not just modeling, but using that to then affect the experience towards incremental business outcomes.

Dom Burch: And again, let's just put this in the perspective of what currently happens with retailers. Someone's doing a load of stuff online, the retailer is collecting a load of information, that goes somewhere else. Somebody analyzes that information and says, oh, Dom was just about to buy that thing until he got distracted and didn't buy it and left the site or changed page or whatever. And he might have gone onto Facebook or Instagram. He might have gone onto another retailer's site. Oh, we can find his cookie somewhere, right? But oh, we'd better send him an email then. Now, what they don't know is I've already bought that item off Amazon, or I've moved on, right, in the moment. So what you're describing, I think you've used the phrase, is a privileged position in the data flow. That's a strong phrase, right? So what makes that position so powerful from a machine learning point of view, and why does identity matter so much here? What signals do you get that others can't?

Patrick Harrington: You know, it's the flow of data, the latency and speed of data generated by an individual, a human or a non-human operating on behalf of an individual, and what I mean there is an AI agent. The privileged position happens to be that server-side data. And let's just step back really quick so there's a concrete understanding of what we mean here. The server side is telling the client side what to do: what items to show, what the add-to-cart button should look like, powering the search engine on the website, all of that stuff. And all of the data associated with every single thing that was done, shown, and then how it was interacted with, is the behavioral data stream flowing through this privileged position, because we are seeing it in the moment. We are upstream of publishing it into some Spark topology that Databricks is using seconds or minutes later, or coalescing into an aggregated Snowflake, or some Google Cloud storage bucket. In the moment, we can then go back to that server and say, hey, you need to render this, you need to do X, Y, and Z. Those X, Y, and Z actions are developed from learning what's called a policy, which takes the understanding of that commercial intent representation of the journey in the moment, what they're going to do, and maps it to the optimal actions we should take. Discount: this person is possibly a movable individual in terms of their conversion profile, you might want to offer a little discount, that'll get them over the edge. Or these folks have incredibly high intent; you don't need to burn margin by discounting, and we know that in the moment. Or incredibly low intent, either bot or casual; we can quantify that. That leads to what we then do. So this privileged position is really in the moment, as the server is communicating with the client side back and forth. But you mentioned identity. The identity of the individual, that numerical representation of Dom, separate from Patrick, allows us to differentiate, call it, the behavioral streams, the journeys that are currently underway. And that identity is not only relevant to understanding what we know about this prospect or existing customer, et cetera; that identity then maps to the corresponding identity in Facebook, so they can be found or suppressed, in Google, in TikTok, et cetera. That's MetaRouter's Sync Injector, the identity graph in a first-party context, which allows us to separate, with high coverage: oh, we actually know these individuals. And that identity, anchoring the data they're generating, is a really critical ingredient in machine learning and statistics and AI, because it allows you to not work in aggregate, lumping data together. You can now understand, oh, this stream of data is from Dom, separate from Patrick, and it's not human-readable PII; these are arbitrary identifiers. But in the moment, in that privileged position of the first mile on the server side, we can now model intent, use that real-time representation that is being updated with every dwell, every interaction we can see, and tell the server side what to do to affect that experience optimally. And so this is called reinforcement learning in the industry. It's what the self-driving cars are doing. It's a lot of what Gemini and ChatGPT and other things are doing: they're learning on the fly and updating what that policy is. And it's hard to do.
And in the context of affecting a dynamic experience towards a desired objective function, here being commercial outcomes, or a reduction of churn, a real-time intervention to save the ball, so to speak: not trying to call them after they've already churned and deleted their account or unsubscribed, but in the moment. This is how you really affect your business in aggregate: your gross profit, your top-line revenue growth, the lifetime value. And it's a real privileged position, because we move from observe-and-report in the rear-view mirror to really having our hands on the wheel and affecting where you're driving.
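As a rough, hypothetical sketch of the policy idea Patrick describes here, with invented thresholds and action names, the mapping from a live intent estimate to an action might look like this:

```python
# Rough sketch of an intent-to-action mapping ("policy"). In practice such a
# policy would be learned and continuously updated; thresholds here are made up.
def choose_action(p_convert: float, p_bot: float) -> str:
    if p_bot > 0.8:
        return "suppress"                  # likely non-human: no spend, no discount
    if p_convert > 0.7:
        return "hold_price"                # high intent: no need to discount
    if 0.3 <= p_convert <= 0.7:
        return "offer_discount"            # movable: a nudge may get them over the edge
    return "do_nothing"                    # low intent: save the budget


print(choose_action(p_convert=0.45, p_bot=0.05))   # -> "offer_discount"
```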

Dom Burch: Let's get a bit more concrete then. When an event flows through MetaRouter, what's actually happening before it ever becomes AI?

Patrick Harrington: That's a really good point. Let's go back to an example of someone on some retail website, some e-commerce website. They land on the website, they may have clicked through some campaign, whether it be a social Instagram ad, a Google search, or they just know the brand already, they typed it into their browser and went to the homepage, and the journey begins. There are, effectively, pipes, so to speak, that MetaRouter provides our customers, in this example the retailer, where every single event that individual is doing, from where they came from to what they start doing, is a stream of messages, almost like sending an SMS or a WhatsApp message back and forth in real time through the pipes. And those messages are enriched with our identity capabilities, which allow us to support our customers, in this example the retailer, to understand what those identities are, in the context of finding them on Instagram, on TikTok, et cetera, without having to share the behavioral stream of that SMS conversation between you and me, so to speak. Those messages are flying through in a queue, so to speak, in that back-and-forth, like if you look at your iMessage, et cetera. Those messages are also passed into a parallel pipe where the AI happens, where we read each message, keyed by identity, and update: it goes into a model. There's a bunch of fancy math taking the actual messages, the device type, iOS versus Android, the IP addresses, what they do, the timestamps, all the fancy stuff, and using mathematical techniques we transform this human-readable data object into a mathematical signal that then goes through these models and outputs in real time, with every new message, an updated propensity to convert or an affinity to a brand. In the context of retail media, that could get routed to the brand-owned Facebook account, or it could lead to: this person has purchased, or is highly likely to purchase but didn't and bounced; that's where we need to bid aggressively, or suppress. And so those messages flowing through that concurrent parallel pipe allow us to update the model that then informs what the optimal action to take is, whether it be an on-site experience or increasing your return on ad spend off-site. Instead of just sending everything, let's do it intelligently, let's do it surgically. The same way we're talking about site optimization in the moment towards a desired commercial outcome, let's apply the same technique as we route that data for advertising and paid media purposes elsewhere: commerce media, retail media, et cetera. So this parallel pipe is all happening in what's called a computational cluster that MetaRouter utilizes; it's a bunch of horizontal computation happening. And adding that parallel pipe alongside the existing movement of messages allows us to leverage the computational assets of every large cloud, from Google Cloud Platform, GCP, to Amazon's offering to Microsoft's offering, you name it, whether it be large-scale CPUs or GPUs. We have the arsenal, so to speak, at our disposal, in the moment, in that privileged position, to deploy whatever technique we need to accurately understand that commercial outcome, and then take the optimal action and learn over time what works and what doesn't in that parallel pipe.
And that leads to incremental commercial outcomes, whether it be a higher return on ad spend, a higher conversion rate, suppression, return reduction, you name it. And when you think of the operating mechanics of these businesses with a digital storefront, this is effectively how the business is run: customer acquisition, retention, conversion optimization. And if you can optimize that in the moment, in this self-learning system, in this privileged position, that's a pretty exciting new development in the industry right now.

Dom Burch: Now, I want to come to something else you said to me. Normally when you say stuff, right, I'm in meetings, sometimes sat there in the background, I might not even have my camera on. And then, Patrick, you start talking, and then somebody else jumps in, and then I'm like, oh my God, this conversation is just blowing my head off. So something else that stood out to me yesterday was that you've got the same underlying learning that can support multiple use cases. We talked a little bit about bot detection. You've talked about conversion understanding and prediction. Why is that reuse so important? And maybe let's even go to bot detection, because, without giving away the crown jewels, there are some smart ways now to kind of go, oh, hang on a minute. You can start to learn what looks like a bot and what looks like a human.

Patrick Harrington: Going back to the client side and how it's noisy: increasingly, bots are better at looking like humans and better at suppressing a lot of the classical bot detection software packages that tend to run client-side. What does this mean? I don't want to get too technical here, but going back to this clean, privileged behavioral position on the back end, the server side: we've talked about these models on the concurrent message pipe that are updating and learning how behavior effectively adjusts a likelihood of conversion, of intent. Let's start there, and let's call that the trunk of the tree. I'm looking in my backyard right now, I'm in Colorado, a giant Colorado blue spruce: there's the tree trunk, which is modeling behavior as it pertains to conversion. Well, that behavior signal, this mathematical artifact that is updated in real time, let's call it an embedding, is really valuable and capable of being reutilized for other tasks. It could be understanding, well, does this behavior correspond to a bot or not a bot? Or does this behavior align with affinity for different brands or categories or individual SKUs in a large retailer's catalog? How does it correlate with lifetime value, a 12-month running, call it cumulative, gross profit, or the likelihood to return some item? These n number of tasks you can think of as branches off the tree. And that core main vein, that tree trunk, feeds into all these other disparate models as a critical input. And so you have this tier of, call it, primary and secondary models in this privileged position. They're all in the same privileged position, and they can all be utilized: where should I be sending that data? How do I affect that experience? Is this a bot or not a bot? Bot detection being an example of a particular task. And we're able to leverage the behavioral stream: how they dwell, the temporal signals, the differences in how quickly they're moving through, or that it doesn't look like they're behaving like a real searcher, and that this has nothing to do with conversion patterns. And what we found is it's actually very accurate in detecting this, working in tandem with the classical techniques, which are under siege from increasingly sophisticated AI bots. And so that's one example of how this behavioral, commercial-intent trunk feeds into other concurrent models running simultaneously on the same message stream, which allows a customer to take actions with surgical precision, whatever it is: whether it be ad suppression, that could be a model, bot detection is another, a price elasticity model, you name it. All of these are derived from the tree trunk; they're all growing off it and continuously updating and learning in real time. And that's great, and models are very cool, I'm, you know, a classically trained statistician, but when you then take the optimal action to affect your desired commercial outcome, and you're continuously learning what that, quote unquote, policy is, that is where the rubber hits the road. You adjust the experience in that privileged sub-10-millisecond first mile, and in doing so you guide that journey towards an incrementally better customer experience and the subsequent desirable financial uptick.

Dom Burch: I like the conversation around bot detection because people can wrap their head around it, right? All these little robots, and these robots are kind of like they're just I mean, they're like a plague, they've gone across the world, they've just scooped up everything they can see, you know, they're learning, learning, learning, and they're getting better at hiding now. They're not, they don't even look like bots, they look more like a human than they ever did before, right? So bot detection is usually sold as a tool or a vendor problem, but the way you frame it is more of a data quality and behavior problem. Why is that?

Patrick Harrington: That's very well said. When you go back to those messages flowing through the conversation pipe back and forth, it's what the person's doing, what we're showing, and what they do next. That behavior, you know, we see: do folks convert, do they not, this and that. We see the timing differences between events, the mouse movement or app usage or whatever digital modality MetaRouter is implemented in. We see every single event in a very clean manner because of how we're instrumented in the back end, on the server side. That behavioral stream doesn't deal with ad blockers or sophisticated bots that are able to suppress the third-party calls out to classical bot-detection-type companies instrumented in that manner. We see the mask removed, so to speak, in terms of the raw behavioral data. And we're able to utilize that, cross-referencing known bot patterns with this commercial-intent behavior, conversion versus non-conversion, and use that to really sift through the weeds and accurately separate it. And what does that mean if you're the customer? It means that you can better preserve budget and suppress ad spend towards these likely bots, which are otherwise, you know, the ghost in the machine looking at itself: it doesn't matter, it's just an ad. And you can divert those resources to actual, very likely humans, or affect the experience and the discounts accordingly. So you're able to adjust the experience mid-flight. I think that's the key. You're able to leverage the sequential journey of the bot, or of the human, and revise your understanding of what this is, not just from the first time they hit the site, but compounding over that journey, which is otherwise very hard to do.

Dom Burch: I'm just imagining it, so I love it. Because you're not just doing it once at the start; it's not binary at the start, right? This is a bot or it's not a bot, and you've let through a load of bots because you've already decided it's not a bot. And then everything that happens from then on is like, well, it's a human, right? So we're gonna treat it like a human. We're gonna give it a discount, we're gonna try and get it over the line. And then also, in our archaeology mode, we're gonna analyze that after the fact and go, why are some of our customers not converting? And it was never a customer in the first place.

Patrick Harrington: Yeah, I mean, it would be like literally being a greeter at a physical store, and you walk in, Dom, and the greeter's like, Dom looks like a human; or C-3PO walks in and, like, this is a robot, right? It's that visual cue, you know.

Dom Burch: Yeah, yeah, I like that. Okay, now the other thing you talked about is validating your models against industry benchmarks rather than replacing them, right? So, one, why is that important? And also, what do you learn when your model disagrees with an established system?

Patrick Harrington: That is a really good question. Normally, what you tend to do as a practitioner is you want to understand when these models are behaving similarly and differently, and you're really focused on: okay, why are they behaving differently? What is it capturing? What type of signal, what type of statistical effect or pattern of behavior is causing that divergence, right? And over time, when you are validating a model, the classical, call it, life cycle of a data scientist, of a machine learning practitioner, circa 2020 to 2022, has been: a bunch of that data in the last mile coalesces in some database, I mentioned a Snowflake, and that data is offline. Those sessions are in the past, and they are retraining a statistical model, a machine learning model. They are looking at those differences and similarities, maybe iterating on new inputs to the model to try to capture some effect that maybe isn't being modeled or accounted for. And then they train the model, they do the testing and the validation, and this and that. And if it looks good, that model, meaning all the parameters, all the weight values, all the different components that comprise these large statistical AI models, gets promoted to some server for inference, and you send messages at that model, you get some predicted value, and then what do you do with that? What's interesting is the ability to affect and readjust that model continuously within this privileged position, such that it is not an offline, days-later paradigm where you're always lagging. You're able to have almost an LLM supporting the commercial intent model in tandem, without a human in the loop, that is also making continual adjustments to the model and only promoting a model when it is robust, sound, and statistically better relative to live, concurrent traffic. And that is a really interesting shift, because it allows one to effectively have a real-time digital general manager, an AI general manager, that is managing the state of that model and improving it, and only deploying the new thing when it is robustly better and all the guardrails are in place. But it's not days later. And this is a real, profound shift in the industry of what modern AI and LLM-orchestrated workflows are doing. They're doing, you know, this and that and this and that, but there's a bit more handling of curveballs and, you know, judgment and reasoning in the middle. And it's only going to improve. And so if you have that, call it, AI GM, and you maintain an on-ramp to the next best foundational model from OpenAI or Claude, you really don't care; it's a commodity. But you want an on-ramp such that it is always improving, in conjunction with the behavioral model that is always improving, and as a result your business is improving. That is the desired goal.

Dom Burch: Okay, so this makes sense to me now. You said AI shouldn't be treated as a feature, but as something that compounds, and I think what you've just described unpacks that for me. So thank you, because the penny has dropped. I know it takes time with me, Patrick, but eventually the penny drops. I like that. Um, I want to finish up on this: there's a hard way to do this and an easy way, right? There's clearly an easy way to do AI, in inverted commas, and a harder way. Why have you and MetaRouter chosen the harder path? And where do you think shortcuts tend to show up later?

Patrick Harrington: I've been doing this for 20 years, and my doctorate is in the mathematical framing of AI. And so there's a very deep understanding of the moving parts associated with AI and machine learning problems: how they manifest in infrastructure, the architecture of that infrastructure and systems, the product that manifests from it, and then ultimately how they all come together. I've been doing this for a long time, and it's something I'm deeply invested in intellectually. The commoditization of model development is a real thing. It used to be more siloed, and therefore you'd have to go hire some MIT grad to get the best of the best. The barrier to quality is getting lower, and the time it takes to get to quality is shrinking. And with all of the LLMs I mentioned in the on-ramp analogy, you just want the best; it's becoming more of a commodity. And so when you remove that part of the equation, what else is a critical ingredient to getting to that desired business outcome? It becomes the data. And when all these AIs are trained on book corpuses and everything that's on the public internet, everything they can get their hands on, they're not getting this. They're not getting the behavioral data that a retailer would have. They're not getting the individual, anchored by identity, and what they're doing. They may get the static content when they crawl, but they are not getting that customer journey anchored by what happens from a commercial standpoint. And so this privileged first mile, and what's really profoundly interesting from my perspective at MetaRouter, is that the first mile, with behavior, with clean identity anchoring it, is the unique value-add to the equation. And so we're able to support very interesting public retailers, you name it, by helping them understand that critical ingredient and just how valuable it is. Because the others don't have it. It's no longer a commodity; it's a scarce asset that they own, that they control within their data governance profile, to do what they want with. And others don't have it. I think that's the key. And the value of that scarce asset is only going to increase over time as everything else around it gets commoditized. I guess I'll conclude with this. We're now in 2026. If I go back to the start of the millennium, 2000, 26 years ago: early days of the internet, then commerce on the internet, data being thrown off en masse, internet giants, massive adoption, different devices to interact with the internet and with goods and services. The data that has been thrown off has fed into improving those experiences. The AI modeling capabilities have been able to leverage all of the data assets thrown off by humanity, everything published on the internet, to build what we interact with today, the ChatGPTs of the world, Google Gemini, and, you know, the Claude models, et cetera, that are effectively downstream of what humans have put on the internet. And we're a few years into that journey. We're now learning, on the other side of that hurdle of late '22, what's unique, what's the same, what's different, and then what the new modalities and interaction patterns are.
And that is really where we are supporting our existing customers and prospective customers: how to navigate that journey, and just how important real-time, in-the-moment experience optimization is, using intelligence on their unique first-party data stream to make sure they are defensible. They want to manage the chinks in the armor, but also be on the offensive, on the front foot, to really command incremental market-share acquisition and optimize customer experience, because there are new tools in the toolbox that did not exist before. And there's a different landscape of how to think through where your data is, the timescale on which it's moving around, where it's going, where it should not be going but may be going, and how you can help. And so the world is changing, and it's going to accelerate over the next few years. It's a really fun place to support very interesting businesses, because we see the proof of the pudding, we see the incremental value we're able to offer. And it's not just, oh, this is a cool model, it does X, Y, and Z. It's a cool model, it's accurate, and then it does something to affect the experience towards better commercial outcomes. And that is what's changing.

Dom Burch: Brilliant. Patrick, it's always an honor and a privilege to hang out with you and just to hear you make sense of this world. And also, you know, we started this podcast by saying there's a lot of AI hype out there, and just to sort of ground it in reality, right, has been fantastic. But for the time being, thank you so much for coming on the First Mile podcast.

Patrick Harrington: It's always a pleasure to see you.

Dom Burch: So there we go. As Patrick just said, the world is changing and it's a fun place to be right now, supporting our customers through this new era. We talked a little bit about how the old way of working is a bit like archaeology, right? Digging through the bones of what's already happened, batch by batch, hoping the past can tell us something useful about the future. And that makes sense. I mean, most of the SaaS tools we rely on were built 10 or 15 years ago for a very different era: batch data, long delays, models trained offline, promoted to production, and then, well, then you wait, right? And what Patrick is describing is something fundamentally different: a privileged position in the data flow where learning happens as intent is revealed, not days later, and where AI isn't just predicting, but mapping understanding directly to action. Do we offer a discount? Do we suppress an ad? Do we reroute a journey, or do we do nothing at all? And what makes that possible is this idea of a parallel pipe: data flowing in real time to computation clusters that can tap into the full arsenal of modern compute, whether that's Google, Amazon, Microsoft, CPUs, GPUs, whatever's needed in the moment. And at the heart of it is a really powerful metaphor that Patrick used, the oak tree, right? The trunk is the core behavioral model, the embedding. That's where understanding lives. And from that trunk, branches grow: bot detection, brand affinity, SKU preference, likelihood to return an item, lifetime value. Same understanding, different questions. And that matters, especially as bots get better and better at being more human-like, more deceptive, harder to spot. When you treat bot detection as a vendor problem, you're always reacting. When you treat it as a data quality and behavior problem, you can act surgically: suppress here, divert there, adapt mid-flight, and continuously update what you believe to be true. And maybe the biggest shift of all is this. We're moving beyond the old life cycle of collect, warehouse, retrain, deploy, wait, into something much closer to a real-time AI general manager. I love that concept. One that learns continuously, acts cautiously, and only promotes decisions when they're statistically reliable. So no lag, no archaeology, just learning in the first mile, where intent is formed and outcomes can still be shaped. As ever, Patrick, thank you for such a clear, grounded take on where AI is actually heading. And thank you for listening. If you've enjoyed The First Mile, please leave us a review, share it with a colleague, and come back.