When AI agents shop - and leave your online store behind

Author

Klaus

Published

December 1, 2025

Dreams of the future? - No, already in use

Anyone who assumes that Lukas is a character from the distant future is very much mistaken. On September 30, OpenAI presented Instant Checkout and the Agentic Commerce Protocol. While the checkout is a specific integration with Stripe and the e-commerce giant Shopify, the protocol is aimed at all store operators.

More than 700 million people already use ChatGPT every week to get support in their everyday lives - now they can also use it to shop directly. With the new Instant Checkout, developed together with Stripe and initially only available to US users, products from Etsy merchants can be bought directly in the chat. Soon one million Shopify stores will follow, including brands such as Glossier, SKIMS and Spanx. The launch will initially include single purchases and multiple orders. Other regions with strong purchasing power, such as Germany or France, will not be long in coming - OpenAI is talking about next year.

The technical basis is the Agentic Commerce Protocol, an open standard for AI-supported commerce, which OpenAI has developed together with Stripe and retail partners and which is now available as open source. The aim is to create a seamless connection between consultation and purchase - users can switch from conversation to payment with just a few clicks, while retailers retain full control over payments and customer data.

[Figure: flow diagram showing the process of powering Instant Checkout in ChatGPT]

Machines buy differently

Customers of large platforms such as Shopify can look forward to direct integration via the e-commerce platform. Admittedly, it is not yet an AI agent that makes purchases autonomously. These are suggestions that the user has to confirm actively after a search in ChatGPT. So for the time being, this is a great test and learning scenario for the next stage in agentic commerce, the actual agents.

However, this fundamentally changes the playing field for merchants' marketing and e-commerce managers. An AI agent does not click on colorful sliders, home pages or brand worlds. For the agent, there are no campaign images, no lookbooks, no emotional headlines. Mr. Smith reads fields: price, size, color, material, delivery time, stock. It compares, sorts, matches. It has three tasks: search, understand, decide.

That sounds trivial to us humans. For machines, it is a Herculean task, because they have to process billions of data points, and for that they need order, clear rules and unambiguous structures.

This is exactly where you can see how differently stores are built. Shopify, the best-known construction kit in e-commerce, has democratized online retail. Millions of retailers worldwide, simple operation, themes, apps, checkout in no time. But Shopify was designed for people. Everything is geared towards beautiful layouts, fast clicks and smooth orders. Machines find this difficult.

When merchants needed more information, Shopify introduced metafields, and later metaobjects. This allowed them to enter additional information such as "Material: cotton", "Care: machine wash", "Package size: 6 pieces". Many used them creatively - as a CMS replacement, for storytelling, for marketing texts. It worked for humans. For machines, this is where the guesswork began.

Because retailers write what they want. One notes "rot", the next "Red", the third "red". It's clear to us: all red. The machine, however, sees three different values. Or the weight: "0.5 kg", "500 grams", "half a kilo". The same for us, three inconsistent specifications for the agent.
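
What such a clean-up step could look like on the agent's side is sketched below. It is only an illustration under stated assumptions: the synonym table, the unit list and the function names are invented for this example and do not come from Shopify or any agent framework.

```python
import re
from typing import Optional

# Hypothetical synonym table: free-text spellings mapped to one canonical color.
COLOR_SYNONYMS = {"rot": "red", "red": "red"}

# Conversion factors to grams for the few units this sketch understands.
UNIT_TO_GRAMS = {"kg": 1000.0, "g": 1.0, "gram": 1.0, "grams": 1.0}


def normalize_color(raw: str) -> Optional[str]:
    """Return the canonical color, or None if the spelling is unknown."""
    return COLOR_SYNONYMS.get(raw.strip().lower())


def normalize_weight_grams(raw: str) -> Optional[float]:
    """Parse strings like '0.5 kg' or '500 grams' into grams.

    Anything that is not '<number> <unit>' (for example 'half a kilo')
    yields None: the agent would have to guess.
    """
    match = re.fullmatch(r"\s*(\d+(?:[.,]\d+)?)\s*([a-zA-Z]+)\s*", raw)
    if match is None:
        return None
    number, unit = match.groups()
    factor = UNIT_TO_GRAMS.get(unit.lower())
    if factor is None:
        return None
    return float(number.replace(",", ".")) * factor
```

With this table, "0.5 kg" and "500 grams" both resolve to 500 grams, while "half a kilo" stays unresolved. Every spelling the table does not know is a value the agent cannot compare.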

Shopify does offer ways to type metafields - that is, to specify clearly whether something is a number, a selection or a text. But in practice this discipline is rarely observed. Retailers and agencies use metafields like a tinkering box, apps create their own structures. As a rule, metafields are also used to place content on the product detail page and style it differently. The reason is that the product has only one text field for its description. Metafields make it relatively easy to structure content according to the design specifications.

The result: millions of stores, millions of different key-value pairs. Barely visible to people, a minefield for machines.

Back to Lukas' agent. He finds two stores with the T-shirts he wants. The first store says: "Color: black", "Size: L", "Material: cotton". Everything is clear. In the second store: "Color: black (dark)", "Size: large", "Material: 100% cotton - fine". For Lukas, both offers would be equally understandable. But the agent has to think. Does "large" really mean L? Is "black (dark)" the same as "black"? Can "100% cotton - fine" be classified in the same material category as "cotton"?

These interpretations cost computing time and invite errors. An agent who has to make millions of such decisions in seconds prefers stores that make his work easy. If the price is the same, the one with clearer data wins. The other one drops out - not because it is worse, but because it speaks unclearly.
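
How "clearer data wins at the same price" might translate into a ranking rule can be sketched in a few lines. The controlled vocabularies, the `Offer` type and the tie-break rule are assumptions made for this illustration. Real agents may weigh things quite differently.

```python
from dataclasses import dataclass

# Hypothetical controlled vocabularies an agent accepts without interpretation.
CANONICAL = {
    "color": {"black", "white", "red"},
    "size": {"S", "M", "L", "XL"},
    "material": {"cotton", "polyester", "wool"},
}


@dataclass
class Offer:
    store: str
    price: float
    attributes: dict


def ambiguous_fields(offer: Offer) -> int:
    """Count attributes whose value is missing or not in the vocabulary."""
    return sum(
        1
        for name, allowed in CANONICAL.items()
        if offer.attributes.get(name) not in allowed
    )


def rank(offers: list) -> list:
    """Cheapest first; at equal price, the offer with fewer ambiguous fields wins."""
    return sorted(offers, key=lambda o: (o.price, ambiguous_fields(o)))
```

In this sketch, a store that reports "black", "L" and "cotton" has zero ambiguous fields. A store that reports "black (dark)", "large" and "100% cotton - fine" has three and ends up behind it if the price is identical.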

Machines are not arbitrarily strict. They are built that way. Even the AI agents themselves are API-first. They speak interfaces, not websites. A travel assistant that books flights does not access airline websites, but the Amadeus API. A shopping agent that reorders diapers uses the store's product API, the stock levels and the shipping API. Machines talk to machines. And they expect the other side to speak the same language. Anyone who does not do this is simply ignored.

Seconds that decide on trust

A few days later, Lukas is sitting in his vacation apartment by the sea. The T-shirts have been delivered, but now there's another problem: the coffee machine. The rickety model in the apartment makes more noise than coffee and the machine at home is also getting on in years. So another order for Mr. Smith: "Get me a coffee machine, maximum 200 euros, delivery within two days."

Two stores report suitable models. Store A gives clear data: stock 12, delivery in two days, price 179.99 euros. Store B reports: "In stock". Both sound good to Lukas. For the machine, it's like night and day. "In stock" - that could mean one, or a hundred. The agent doesn't know. And he doesn't know the retailer's rules: does "in stock" really mean available, or does the store allow orders with zero stock in the hope of being able to deliver soon?

Shopify has created interfaces for this. Via the InventoryLevel API, stock can be queried per location, differentiated into "on hand" - what is physically in the warehouse - and "available", i.e. what can actually be sold after deducting reservations. Via the Fulfillment API, orders, items being processed or partial deliveries can also be tracked. Even the setting as to whether a retailer allows "overselling", i.e. accepts purchases with zero stock, can be queried via the API.
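
The difference between "on hand" and "available" can be made concrete with a small sketch. It does not call the Shopify API. `StockReport` and both functions are hypothetical names that only model the arithmetic described above, with Store A and Store B from the coffee machine example in mind.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StockReport:
    # Units physically in the warehouse; None if the store only says "in stock".
    on_hand: Optional[int]
    # Units already reserved for other orders.
    reserved: int = 0


def sellable_units(report: StockReport) -> Optional[int]:
    """'Available' in the sense used above: on hand minus reservations."""
    if report.on_hand is None:
        return None  # no number reported, so nothing to calculate
    return max(report.on_hand - report.reserved, 0)


def can_promise(report: StockReport, quantity: int) -> bool:
    """True only if the requested quantity is backed by a reported number."""
    units = sellable_units(report)
    return units is not None and units >= quantity
```

A report with 12 units on hand and no reservations backs an order for one coffee machine. A bare "in stock" has no number behind it, so this sketch treats it as a promise that cannot be verified.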

In theory, this would allow AI agents to see very precisely what is available. But practice is much more chaotic. Some retailers do not maintain their warehouse locations consistently, bundle stocks in a single warehouse or use apps that superimpose their own logic on the data. Some enter "on hand" and "available" cleanly, others only enter the minimum. In the end, the machine sees a patchwork: exact figures here, vague information there, contradictory information elsewhere.

It becomes particularly tricky with bundles - product sets such as a three-pack of T-shirts or a care range consisting of shampoo, conditioner and shower gel. Shopify has now developed its own app, which works cleanly with the inventory data: if a bundle is sold, it automatically deducts the stocks of the individual products. This is comparatively transparent for agents because the logic is directly visible in the API.

But many retailers use third-party apps. They often work with so-called shadow SKUs - placeholder products that appear in the catalog as a bundle, but are managed separately from the real inventory. The bundle product then has its own stock value, which is only loosely linked to the individual items. If a bundle is not synchronized regularly, it is considered "available" even though a part has long been missing. No drama for humans - an email saying "unfortunately sold out" is enough. For machines, however, it's fatal: the agent relies on availability, fails the customer - and rates the store lower in future.

And even that is often only half the truth. Because in many inventory systems, especially in SAP, products consist not only of "one piece", but of parts, with bills of materials for the individual parts. A shirt, for example, is made up of fabric, buttons, labels and packaging. If there are only 40 sets of buttons left, there are in fact only 40 shirts - regardless of whether 100 are displayed in the store. Invisible to people, a stumbling block for the machine.

In the warehouse, the items are usually listed under different names than in the store. Online, the shirt is simply called "Business Shirt Slim Fit"; in logistics, it consists of an anonymous number for bolts of fabric, eight button SKUs, a label and a cardboard box. As long as these worlds are not properly mapped, the calculation is never correct. The store shows 50 shirts available, but the warehouse can only deliver 40. For customers, this is an email with a delay. For a machine, however, it's a breach of trust.

API-first architectures take a more structured approach to this level. They allow relationships between store products and warehouse parts to be explicitly maintained in the interface. A product not only refers to its visible SKU, but also to the list of required parts. This creates a realistic picture for the agent: shirt = fabric + buttons + label + box. If there is a shortage of buttons, the machine knows that only 40 are really available - and decides accordingly.
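
The shirt calculation can be written down as a short sketch. The part names, quantities and function names are assumptions for illustration. The point is only that the scarcest part sets the real number.

```python
def buildable_units(bill_of_materials: dict, part_stock: dict) -> int:
    """How many finished products the parts on hand allow.

    bill_of_materials maps part name -> quantity needed per product (> 0).
    part_stock maps part name -> units in the warehouse; missing parts count as 0.
    """
    if not bill_of_materials:
        raise ValueError("bill of materials must list at least one part")
    return min(
        part_stock.get(part, 0) // per_unit
        for part, per_unit in bill_of_materials.items()
    )


def overstatement(displayed: int, bill_of_materials: dict, part_stock: dict) -> int:
    """Units the store shows but the warehouse cannot actually deliver."""
    return max(displayed - buildable_units(bill_of_materials, part_stock), 0)


# Hypothetical part list for the shirt; one button set covers one shirt.
SHIRT_PARTS = {"fabric-cut": 1, "button-set": 1, "label": 1, "box": 1}
```

With 100 fabric cuts, labels and boxes but only 40 button sets in stock, `buildable_units` comes out at 40. A store that displays 50 shirts overstates its availability by 10.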

It is precisely this clarity that is decisive when things have to be done quickly - for example, in the case of a sneaker drop, where coveted models are sold out in seconds. A person is annoyed when the shoe disappears during checkout. A machine sorts out stores that allow such uncertainties.

Invisible bouncers

The next morning, Lukas wants to explore the nearby reef with its crystal-clear water. But he doesn't have any snorkeling equipment. "Get me a size 43 snorkeling mask and fins, delivery by tomorrow." The agent understands, starts looking - and encounters another obstacle.

Platforms like Shopify do not allow an infinite number of requests per second. It would be like running into a department store and bombarding a hundred salespeople with questions at the same time. That's why there are rate limits - invisible bouncers that determine how many questions are allowed per unit of time.

These delays are hardly noticeable for humans, but they are for machines that compare millions of data points. Shopify sets clear rules here: the limits are strict but predictable. With GraphQL, several data points can be loaded in one go, which increases efficiency. The platform thus remains stable, even when hundreds of thousands of stores are running simultaneously.

Shopify also works with burst capacities - more requests may be made in the short term as long as the total volume levels off again within certain time frames. For retailers, this means short-term load peaks are possible, but not a permanently high load. From the perspective of an AI agent, this is a double-edged sword: on the one hand, it prevents a store from "closing down" completely, but on the other hand, complex queries can be slowed down in the middle of the process.
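
A common way to model "short bursts yes, permanent load no" is a token bucket. The sketch below is a generic illustration, not Shopify's implementation. The class name, the capacity and the refill rate are assumptions, and the clock is passed in explicitly so the behavior stays reproducible.

```python
class TokenBucket:
    """Allows bursts up to `capacity`, then settles at `refill_per_second`."""

    def __init__(self, capacity: float, refill_per_second: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = capacity  # start full: a burst is possible right away
        self.last = now

    def try_acquire(self, now: float, cost: float = 1.0) -> bool:
        """Spend `cost` tokens if available; return False if the caller must wait."""
        elapsed = max(now - self.last, 0.0)
        # Refill for the time that has passed, but never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With a capacity of 4 and a refill of 2 tokens per second, four requests pass at once, the fifth is turned away, and one second later two more are admitted. This is exactly the effect an agent feels when a complex comparison is slowed down halfway through.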

There is also the issue of caching. To be able to serve millions of merchants at the same time, Shopify caches many responses. For people, it doesn't matter if a stock figure is a few minutes old - nobody notices the difference. For machines that decide in real time, it means a risk. An AI agent could still "see" a snorkel mask that has in fact long since been sold out. API-first systems make a clearer distinction here: search queries can be served from fast, cached data, but for transactions there are uncached interfaces with the exact stock level. In addition, there are event streams that report changes immediately without the machine having to constantly ask.
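
The split between cached search and exact transactions might look roughly like this. All names are hypothetical, and `read_live_stock` stands in for whatever uncached interface a system offers.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CachedStock:
    units: int
    fetched_at: float  # timestamp in seconds when the value was cached


def stock_for_search(cached: CachedStock, now: float, max_age_seconds: float) -> Optional[int]:
    """For browsing, a cached value is good enough as long as it is not too old."""
    if now - cached.fetched_at <= max_age_seconds:
        return cached.units
    return None  # too stale even for search; the caller should refetch


def confirm_purchase(sku: str, quantity: int, read_live_stock: Callable[[str], int]) -> bool:
    """For the transaction itself, ignore the cache and ask the live source."""
    return read_live_stock(sku) >= quantity
```

The cache may still show three snorkel masks while the live source reports zero. The search result looks fine, but the purchase confirmation correctly fails.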

But this is precisely where the limit lies. Everyone gets the same limits - the small store around the corner as well as the big brand with sales in the millions. This is fair for the platform operator. For an AI agent who works closely with a large retailer, it is a disadvantage.

Shopify is aware of this tension and is investing: GraphQL instead of REST, stronger typing of metafields, new Catalog APIs, Knowledge Bases. Important steps. But the historical burden remains: millions of stores that have developed their own structures over the years.

API-first architectures are more flexible here. It is not the platform that has the upper hand here, but the retailer. They decide how much a partner agent is allowed to see and how often it can ask. A trustworthy agent receives generous quotas, an uncertain one less. Merchants can determine granularly: prices yes, stock yes, customer data no. Shopify also differentiates between plans, for example with Plus, but many small and medium-sized merchants remain in the corset of standard limits. In API-first systems, on the other hand, the rules are their own.
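
Such merchant-defined rules could be expressed as a small policy table. The agent classes, field names and quotas below are invented for illustration. They only show the principle of "prices yes, stock yes, customer data no".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentPolicy:
    allowed_fields: frozenset
    requests_per_minute: int


# Hypothetical policies set by the merchant, not by a platform.
POLICIES = {
    "trusted-partner": AgentPolicy(frozenset({"price", "stock"}), 600),
    "unknown": AgentPolicy(frozenset({"price"}), 30),
}


def policy_for(agent_class: str) -> AgentPolicy:
    """Agents the merchant has not classified fall back to the strictest policy."""
    return POLICIES.get(agent_class, POLICIES["unknown"])


def visible_view(record: dict, agent_class: str) -> dict:
    """Return only the fields this class of agent is allowed to see."""
    allowed = policy_for(agent_class).allowed_fields
    return {key: value for key, value in record.items() if key in allowed}
```

A trusted partner sees price and stock and gets the larger quota. An unknown agent sees only the price. Customer data appears in neither view because no policy lists it.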

This remains invisible to Lukas. The next morning, the fins and snorkeling mask are at the reception desk. But in the background, his agent has decided which store he trusts - not just because of the price, but because he knew that his questions would be answered reliably.

The suitcase and quiet reliability

The vacation is coming to an end, the sun has left its mark and Lukas is more relaxed than he has been for a long time. But while packing, he realizes that his old suitcase has finally had its day. The zipper is jammed, a wheel is squeaking and the seams look like they won't survive the next trip. So another order: "Find me a sturdy hard-shell suitcase, 70 liters capacity, four wheels, delivery by Thursday". His flight leaves on Friday.

The agent searches, compares. For Lukas, this is just another convenient service. But in the background, another question becomes crucial: can the agent rely on the data he retrieves today still being available in the same form tomorrow?

Machines are sensitive when it comes to reliability. A human recognizes when a store changes its navigation or new fields appear on a product page. We simply click differently, we adapt. A machine can't do that. It relies on fields, interfaces and meanings remaining stable. If an attribute is renamed, deleted without notice or cast in a different format, it is as if the dictionary has been knocked out of an agent's hand.

Shopify has invested in this in recent years: interface changes are versioned, developers know when old endpoints will be shut down, there are changelogs and announcements. However, the rules are the same for everyone - small retailers and global brands. Differences only arise via the plans, for example with Shopify Plus, which offers additional support and greater stability. Governance in the narrower sense - i.e. fine-grained control over which data remains accessible for how long and in what form - is not in the hands of the retailers, but in those of the platform.

The situation is different in API-first systems. There are interface contracts that retailers can actively control. They decide how versions are maintained, which fields run in parallel and for how long, and which partner agents receive early access to new structures. Governance is not platform policy, but a tool in the hands of those who own the data. For agents, this means that they can plan for the long term and rely on a "volume" field actually remaining "volume" - and not suddenly being called "size" tomorrow.
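
One way a merchant could keep a field alive while its successor is introduced is an announced overlap window. The sketch below assumes a rename from "volume" to a hypothetical "volume_liters" with an invented sunset date. None of this is taken from a real contract.

```python
from datetime import date

# Hypothetical rename plan: (old name, new name, last day the old name is served).
RENAMES = [("volume", "volume_liters", date(2026, 6, 30))]


def serialize(product: dict, today: date) -> dict:
    """Emit the new field and, until its sunset date, the old name in parallel."""
    out = dict(product)
    for old_name, new_name, sunset in RENAMES:
        if new_name in out and today <= sunset:
            out[old_name] = out[new_name]  # old consumers keep working during the overlap
    return out
```

An agent built against "volume" keeps getting its 70 liters until the announced date. After that, only "volume_liters" is served, and the agent had months of notice instead of a surprise overnight.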

This is crucial for Lukas' suitcase. The agent knows that "70 liters" means the same thing in all systems - not "medium size", not "suitable for check-in", not "suitable for a two-week vacation", but an exact number. And he knows that he can rely on it because the data model is not changed overnight.

Lukas gets the suitcase, delivered on time as promised. He is delighted with the new lightness, the wheels run smoothly, the zipper holds. He has no idea that the real achievement was invisible: the quiet reliability of the interfaces that made his agent's work possible in the first place.

Homecoming and the quiet lesson

The vacation is over, the plane lands on time, Lukas pushes his new suitcase across the shiny floor of the terminal. Once home, he folds the T-shirts he ordered, stows the sunscreen, boxer shorts and socks in the wardrobe and the snorkeling equipment in the cellar. Everything has arrived, on time, without stress.

For him, it was convenience, almost luxury. A few spoken sentences and things appeared as if by magic. For the retailers where his orders ended up, it was a competition that wasn't decided in colorful banners, but in the clarity of their data.

Because the truth is: Lukas' agent had a choice at every intersection. He was able to distinguish between stores that offered the same price, the same products and the same delivery times. It was the little things that made the difference - whether a size was marked as "L" or "large", whether a stock number was available or a vague "in stock", whether sun cream was labeled "SPF 50" or hidden in a free text between advertising promises.

AI agents themselves are API-first. They were not built to browse websites or interpret images, but to talk to interfaces. A travel agent asks the Amadeus API. A shopping agent uses the product and inventory APIs of a store. They expect the same language, the same logic, the same reliability. If this is missing, they move on - to the next store that responds more consistently. This is precisely why API-first systems have a natural advantage: they share the language of the agents.

Agencies that specialize purely in Shopify themes may claim that it doesn't matter whether the agent talks to the Shopify API or to an API-first system. An interface is an interface. However, the difference lies not in the existence of an API, but in its design. Shopify has retrofitted its API onto a system that was originally designed for humans. This leads to variance, free text, shadow SKUs and app logic. An API-first architecture, on the other hand, was designed from the outset so that machines are at home there - clearly structured fields, uniform relations, less room for interpretation errors.

This does not mean that Shopify is unsuitable. A merchant who is disciplined about typing their metafields, maintaining bundles cleanly and keeping stock consistent can be just as agent-friendly. But they have to do it consciously. API-first systems make it easier because the structure enforces discipline. For agents, this is the difference between a conversation in their native language and a conversation with someone who only speaks the language in a broken form.

Shopify has many tools for this. It strives for structure and stability, and yet the legacy of freedom remains: millions of merchants, millions of variants, millions of metafields that are filled differently from app to app. With new building blocks such as Catalog APIs and the Knowledge Base, Shopify wants to create more order. But the reality is that every retailer has developed its own conventions over the years. Invisible to humans, a daily struggle for uniformity for machines.

Lukas doesn't think about any of this as he throws his vacation laundry into the machine. He has taken it for granted that he no longer has to click, compare and filter. But in the background, Mr. Smith has quietly and secretly decided which store he trusts.

It is this silent upheaval that marketing and e-commerce managers need to take seriously. People buy stories, machines buy data. And the clearer, more consistent and reliable this data is, the greater the chance that a store will end up in the shopping cart of the future.

Five signals that AI agents follow

Consistency - Fields and values mean the same thing everywhere, without contradictions.
Timeliness - Stock levels and prices are current, not from outdated caches.
Predictability - Rules such as reservations or advance orders are clearly documented.
Governance - Interfaces do not change arbitrarily, but according to plan, with notice.
Access - Partner agents get fair access instead of failing because of rigid barriers.