How Meta Builds Its AI Tells You How to Run Your Ads

What is happening

Meta published a detailed breakdown of how it scales the building and testing of its most advanced AI models. The piece was written for researchers and engineers, but its core argument describes a problem that performance marketers face at a much smaller scale every day.

The finding: at the scale Meta operates, the way you test a system matters as much as the system itself. Getting the evaluation right, catching failures fast, and iterating without breaking what already works are together as hard as building the model in the first place. The teams that move fastest are not the ones with the biggest models. They are the ones with the most rigorous testing infrastructure.

What I learned from this

Most ad accounts are run the opposite way. Creative is produced and launched. Results come back weeks later. Changes are made based on incomplete data. The feedback loop is too slow to learn anything reliable, and because multiple things change at once, the team never really knows why something worked.

I have run accounts with weekly creative tests and a defined process for what counts as a result worth acting on. Those accounts consistently outperform accounts without a testing culture, not because the creative is always better but because the learning is faster. After two years of weekly tests, the accumulated knowledge about what works for a specific audience is worth more than any individual campaign insight.

What Meta’s engineers taught me is to think about testing as infrastructure, not as an activity. The question is not “should we test this creative” but “do we have a reliable system for generating and acting on learning every week?” The second question is harder to answer yes to. Most teams do not have that system.

A real test requires three things: a hypothesis written before you run it, one variable changed at a time, and a definition of what result would cause you to change your approach. A hypothesis is not “let’s try video” but “we believe video outperforms static for this audience at awareness stage because the consideration is high.” Most teams skip all three and read results selectively based on what they hoped to see.

What I recommend for your business

Set up a weekly creative testing cycle. It does not need to be complicated. Test one new creative element each week against the current control, over a defined time period and with a defined threshold for what counts as a win.
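
To make "a defined threshold for what counts as a win" concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the names `Variant` and `judge`, the 5,000-impression minimum, and the 10% relative lift are not from Meta's article or from any ad platform, and the rule deliberately ignores statistical significance. The point is only that the rule is written down before the results arrive.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """One arm of a weekly test (hypothetical structure)."""
    name: str
    impressions: int
    conversions: int

    @property
    def rate(self) -> float:
        # Conversion rate; 0.0 when there is no traffic yet.
        return self.conversions / self.impressions if self.impressions else 0.0


def judge(control: Variant, challenger: Variant,
          min_impressions: int = 5000,
          min_relative_lift: float = 0.10) -> str:
    """Apply a pre-agreed decision rule. Both thresholds are illustrative."""
    # Not enough data on either arm: do not read the result yet.
    if min(control.impressions, challenger.impressions) < min_impressions:
        return "keep running"
    # Avoid dividing by zero when the control has no conversions.
    if control.rate == 0:
        return "win" if challenger.rate > 0 else "no win"
    lift = (challenger.rate - control.rate) / control.rate
    return "win" if lift >= min_relative_lift else "no win"


control = Variant("static control", impressions=10_000, conversions=200)
challenger = Variant("video", impressions=10_000, conversions=230)
print(judge(control, challenger))
```

The useful part is not the arithmetic but the fact that `min_impressions` and `min_relative_lift` are fixed before launch, so the result cannot be reinterpreted afterwards.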

Write the hypothesis before you launch. It takes five minutes and it forces you to be honest about what you are actually trying to learn rather than what you are hoping to confirm.

Change one thing at a time. If you change the creative, the audience, and the landing page simultaneously and results improve, you cannot tell which change caused the improvement. You have not learned anything. You have just added noise.
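
The one-variable rule can be checked mechanically before launch. The sketch below assumes each ad setup is described as a plain dictionary; the field names (`creative`, `audience`, `landing_page`) and the helper names are hypothetical, not part of any ad platform's API.

```python
def changed_fields(control: dict, challenger: dict) -> list[str]:
    """Return the names of every field that differs between two setups."""
    keys = sorted(set(control) | set(challenger))
    return [k for k in keys if control.get(k) != challenger.get(k)]


def is_clean_test(control: dict, challenger: dict) -> bool:
    """A test is clean only when exactly one field differs."""
    return len(changed_fields(control, challenger)) == 1


control = {"creative": "static_v3", "audience": "lookalike_1pct",
           "landing_page": "/offer"}
challenger = {"creative": "video_v1", "audience": "lookalike_1pct",
              "landing_page": "/offer"}

print(changed_fields(control, challenger))
print(is_clean_test(control, challenger))
```

If `changed_fields` returns more than one name, the test is confounded and should be split into separate weeks.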

Document what you learn. Not in a report that gets filed. In a running record that anyone working on the account can refer to. After three months of doing this consistently, you will have genuine knowledge about your audience that your competitors who are testing chaotically do not have.
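
A running record does not need special tooling. As one possible shape, the sketch below appends one entry per test to a JSON Lines file that anyone on the account can read. The file name, the field names, and the functions `log_test` and `read_log` are all assumptions made for this example, and the sample entry is invented purely to show the format.

```python
import json
from datetime import date
from pathlib import Path


def log_test(path, hypothesis, variable, outcome, learning, day=None):
    """Append one test result to the shared log and return the entry."""
    entry = {
        "date": (day or date.today()).isoformat(),
        "hypothesis": hypothesis,   # written before launch
        "variable": variable,       # the single thing that changed
        "outcome": outcome,         # e.g. "win", "no win", "keep running"
        "learning": learning,       # what the team now believes
    }
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


def read_log(path):
    """Return every logged entry, oldest first; empty list if no log yet."""
    p = Path(path)
    if not p.exists():
        return []
    with p.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    # Invented sample entry, for format only.
    log_test("creative_tests.jsonl",
             hypothesis="Video outperforms static at awareness stage",
             variable="creative format",
             outcome="win",
             learning="Video lifted conversion rate for this audience")
    for entry in read_log("creative_tests.jsonl"):
        print(entry["date"], entry["variable"], entry["outcome"])
```

An append-only file keeps the history intact: past entries are never edited, so the record shows what was believed at the time, including the tests that failed.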

The compounding is real. A testing culture built over two years is a competitive asset that no budget increase can replicate.
