Anthropic introduced the Model Context Protocol in late 2024. We shipped our production travel MCP server in 2025, the same year the ChatGPT apps surface opened, while the protocol was still being revised from one release to the next. We were not the only ones reading the spec. We were among the few running real reservation traffic through it that early.
That timing turned out to matter more than we expected. Not because of any first-mover marketing story, though that is real. It mattered because running live traffic through a still-evolving protocol forced us to make calls that pure design could not prepare us for. Retry semantics under ambiguous error codes. Rate-limit headers nobody had standardized yet. Tool schemas that changed between agent versions. Every one of those tradeoffs shaped the product we ship today.
This post is a field report. What we got right. What we got wrong. And what we would tell another team starting today.
The bet
The internal pitch was short. Travel was going to be the first commerce vertical where AI agents needed to transact. The agent layer was moving faster than any reservation API on the market. If we built the MCP surface before anyone asked us to, we would be the default when ChatGPT and Claude decided to ship travel.
The risk was real. Shipping against an unfinished spec meant guessing at details that could break on the next protocol version. Every partner we onboarded in those first few months was a bet that the MCP spec would settle close to where we guessed it would land.
"The agent layer was moving faster than any reservation API on the market. If we built the MCP surface first, we would be the default when ChatGPT and Claude shipped travel." (Internal pitch memo, September 2024)
What paid off
Three decisions aged well.
Tool granularity that matched agent behavior, not REST conventions. We had an existing REST API. The easy path would have been to wrap it in MCP. Instead we studied how agents actually called our sandbox during the first pilot weeks. Agents did not want a monolithic book_hotel call. They wanted search_properties, get_rates, check_availability, create_booking, confirm_booking, each atomic, each cheap, each interruptible. Five tools for one transaction felt wrong to anyone coming from REST. It turned out to be exactly right for how reasoning models actually plan.
Two-phase commit for reservations. Splitting create_booking from confirm_booking added a round trip. It also let the agent present the reservation details to the user for approval before money moved. That pattern is standard now. When we shipped it, it was a hunch. The first time we saw a Claude agent generate a confirmation summary, wait for the user to say yes, then fire the confirm call, we knew the extra latency was worth it.
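The create-then-confirm split can be sketched in a few lines. This is a minimal illustration, not the production implementation: the in-memory BookingStore, the Booking fields, and the summarize helper are all hypothetical names chosen for the example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Booking:
    booking_id: str
    property_id: str
    total_usd: float
    status: str = "pending"  # stays "pending" until the user approves


@dataclass
class BookingStore:
    """In-memory stand-in for a reservation backend (hypothetical)."""
    bookings: dict = field(default_factory=dict)

    def create_booking(self, property_id: str, total_usd: float) -> Booking:
        # Step 1: hold the reservation. No money moves yet.
        booking = Booking(str(uuid.uuid4()), property_id, total_usd)
        self.bookings[booking.booking_id] = booking
        return booking

    def confirm_booking(self, booking_id: str) -> Booking:
        # Step 2: commit only a booking that exists and is still pending.
        booking = self.bookings.get(booking_id)
        if booking is None:
            raise KeyError(f"unknown booking: {booking_id}")
        if booking.status != "pending":
            raise ValueError(f"booking is already {booking.status}")
        booking.status = "confirmed"
        return booking


def summarize(booking: Booking) -> str:
    """Text the agent can show the user before calling confirm_booking."""
    return (
        f"Hold {booking.booking_id[:8]}: {booking.property_id} "
        f"for ${booking.total_usd:.2f}. Confirm?"
    )


store = BookingStore()
held = store.create_booking("hotel_123", 412.50)
print(summarize(held))
user_said_yes = True  # in practice this answer comes from the user, via the agent
if user_said_yes:
    print(store.confirm_booking(held.booking_id).status)
```

Rejecting a second confirm on the same booking is one way to keep a retried confirm call from charging twice; a real server might instead return the already-confirmed booking idempotently.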
Credits refunded on outcomes, not metered on calls. Every other agent API we looked at priced per call with no refund path. That punishes agents for the exploratory behavior that makes reasoning models useful. AEC credits refunded on completed reservations inverted the incentive. Agents that close pay nothing for infrastructure. Agents that browse without converting pay for the compute they used. Partners loved it. More importantly, it shaped the way agents called our endpoint. We saw fewer ping-pong retry loops and more deliberate multi-step plans.
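As a rough sketch of the refund-on-outcome idea: charge credits per call, remember what each session was charged, and return the full amount when the session ends in a completed reservation. The CreditLedger class, the session IDs, and the per-call costs below are assumptions for illustration; actual AEC pricing is not described here.

```python
from collections import defaultdict


class CreditLedger:
    """Charges credits per call, then refunds a session's charges if it converts."""

    def __init__(self, pool: float):
        self.balance = pool
        self.charged = defaultdict(float)  # session_id -> credits charged so far

    def charge(self, session_id: str, cost: float) -> None:
        if cost > self.balance:
            raise RuntimeError("credit pool exhausted")
        self.balance -= cost
        self.charged[session_id] += cost

    def complete_reservation(self, session_id: str) -> float:
        # Outcome reached: return everything this session was charged.
        refund = self.charged.pop(session_id, 0.0)
        self.balance += refund
        return refund


ledger = CreditLedger(pool=500.0)
for cost in (0.5, 0.5, 1.0):       # a converting session: search, rates, create
    ledger.charge("session_a", cost)
ledger.charge("session_b", 0.5)    # a browsing session that never books
ledger.complete_reservation("session_a")
print(ledger.balance)              # 499.5: session_b's 0.5 is the only net spend
```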
What we got wrong
We got three things wrong that cost us real partner time.
Our first rate limiter was too aggressive. We set a ceiling of 50 requests per minute on the authenticated MCP endpoint because we were scared of agents looping infinitely. Within two weeks, our best partner was hitting the limit during normal usage because their agent was running parallel searches across destinations. We raised the limit, added a burst allowance, and published a Retry-After header that actually meant something. Lesson: agents are not human users. They do not self-throttle. They also do not loop infinitely if the tool contract is good.
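One common way to combine a steady limit with a burst allowance is a token bucket, sketched below. This is an illustration of the general technique, not a description of our limiter: the rate and burst numbers are made up, and the handle function is a hypothetical stand-in for a request handler. The Retry-After value is the number of whole seconds until the next token exists, so a client that waits that long will not be rejected again for the same reason.

```python
import math
from dataclasses import dataclass


@dataclass
class TokenBucket:
    rate_per_sec: float  # steady refill rate
    burst: int           # bucket capacity: how many calls may arrive at once
    tokens: float = 0.0
    last: float = 0.0

    def __post_init__(self):
        self.tokens = float(self.burst)  # start with a full bucket

    def allow(self, now: float):
        """Return (allowed, retry_after_seconds) for a request arriving at `now`."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True, 0
        # Seconds until one full token exists, rounded up for the header.
        wait = (1.0 - self.tokens) / self.rate_per_sec
        return False, math.ceil(wait)


def handle(bucket: TokenBucket, now: float):
    """Hypothetical handler: HTTP status plus headers for one request."""
    ok, retry_after = bucket.allow(now)
    if ok:
        return 200, {}
    return 429, {"Retry-After": str(retry_after)}


bucket = TokenBucket(rate_per_sec=2.0, burst=5)
for i in range(6):                 # six parallel searches arriving together
    print(i, handle(bucket, now=0.0))
print(handle(bucket, now=1.0))     # one second later the bucket has refilled
```

With these numbers the first five calls pass, the sixth gets a 429 with Retry-After set to 1, and a call one second later succeeds.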
We underspecified error codes. The first MCP error envelope we shipped had five codes. By month three, we had partners emailing us asking what specific errors meant. Every one of those emails was a missed signal the agent should have handled in code. We now have 15+ custom error codes under the JSON-RPC envelope, each documented with a recovery hint. Every new code we add has to ship with the recovery path an agent should take.
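A sketch of what "every code ships with a recovery path" can look like inside a JSON-RPC 2.0 error object. The envelope shape (jsonrpc, id, error with code, message, and optional data) comes from the JSON-RPC specification, which reserves -32000 to -32099 for implementation-defined server errors. Everything else here is hypothetical: the three error names, their numeric codes, and the recovery vocabulary are invented for the example and are not our published catalog.

```python
import json
from enum import Enum


class Recovery(str, Enum):
    RETRY_AFTER_DELAY = "retry_after_delay"
    REFRESH_RATES = "refresh_rates"
    ASK_USER = "ask_user"


# Hypothetical codes in the JSON-RPC implementation-defined range (-32000..-32099).
ERRORS = {
    "RATE_LIMITED": (-32001, "Too many requests", Recovery.RETRY_AFTER_DELAY),
    "RATE_EXPIRED": (-32002, "Quoted rate is no longer valid", Recovery.REFRESH_RATES),
    "SPEND_CAP_EXCEEDED": (-32003, "Total exceeds the pre-authorized cap", Recovery.ASK_USER),
}


def error_response(request_id, name: str, **details) -> dict:
    """Build a JSON-RPC 2.0 error response that carries a machine-readable recovery hint."""
    code, message, recovery = ERRORS[name]
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {
            "code": code,
            "message": message,
            "data": {"name": name, "recovery": recovery.value, **details},
        },
    }


print(json.dumps(error_response(7, "RATE_EXPIRED", rate_id="r_42"), indent=2))
```

Keeping the catalog in one table means a code cannot be added without also choosing its recovery hint, which is the rule the paragraph above describes.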
We shipped webhooks as an afterthought. We assumed agents would poll. Some do. Many rely on the calling partner's infrastructure to get async notifications about reservation state changes. Webhooks were a version-two addition. They should have been version one. Retrofitting them into existing integrations took six months longer than building them first would have.
Principle
Every error code needs a documented recovery path. Every limit needs a retry header. Every async state change needs a webhook. Agents cannot ask follow-up questions. The protocol has to answer them up front.
What we would do differently
If we were starting today, with the MCP spec stable and the ecosystem mature, the big change would be to move faster on agent identity. Our POST /agents/register endpoint shipped in year two. It should have shipped on day one. We spent our first year authenticating agents as if they were partners, which meant partners were proxying agent traffic through their own keys. That worked. It also meant we had weaker signal about what the actual agent was doing, what principal it was acting for, and whether a spend cap had been agreed.
Machine-only registration is the foundation for everything else we want to do: spend cap enforcement, principal-level audit logs, revocation on user logout. If you are building any kind of agent-facing API, get the identity layer right before you ship the first tool.
A minimal registration payload
POST /agents/register (machine only)
# The four fields that matter
{
  "agent_id": "my-agent-1.0",
  "principal": "user_abc123",
  "pre_auth": {
    "max_usd": 2500,
    "scope": ["hotel"],
    "expires_at": "2026-12-31T00:00:00Z"
  },
  "aec_pool": 500.00
}
Agent ID identifies the software. Principal identifies the human. Pre-auth is the contract between them. AEC pool is the compute budget. Four fields. Everything else can be added later without breaking existing integrations.
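To make the four fields concrete, here is a minimal sketch of how a server might check a registration payload and then test a single proposed charge against the pre-auth. The function names and validation rules are assumptions for illustration, not the behavior of the real endpoint. The check is per transaction only; tracking cumulative spend against max_usd is left out.

```python
from datetime import datetime, timezone

REQUIRED = ("agent_id", "principal", "pre_auth", "aec_pool")


def validate_registration(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload is acceptable."""
    problems = [f"missing field: {key}" for key in REQUIRED if key not in payload]
    if problems:
        return problems
    pre_auth = payload["pre_auth"]
    if pre_auth.get("max_usd", 0) <= 0:
        problems.append("pre_auth.max_usd must be positive")
    if not pre_auth.get("scope"):
        problems.append("pre_auth.scope must list at least one vertical")
    if "expires_at" not in pre_auth:
        problems.append("pre_auth.expires_at is required")
    if payload["aec_pool"] < 0:
        problems.append("aec_pool must not be negative")
    return problems


def within_pre_auth(payload: dict, amount_usd: float, vertical: str, now: datetime) -> bool:
    """True if one charge fits the cap, the scope, and the expiry of the pre-auth."""
    pre_auth = payload["pre_auth"]
    # Replace the trailing "Z" so fromisoformat also works on older Python versions.
    expires = datetime.fromisoformat(pre_auth["expires_at"].replace("Z", "+00:00"))
    return (
        amount_usd <= pre_auth["max_usd"]
        and vertical in pre_auth["scope"]
        and now < expires
    )


payload = {
    "agent_id": "my-agent-1.0",
    "principal": "user_abc123",
    "pre_auth": {
        "max_usd": 2500,
        "scope": ["hotel"],
        "expires_at": "2026-12-31T00:00:00Z",
    },
    "aec_pool": 500.00,
}
print(validate_registration(payload))
print(within_pre_auth(payload, 412.50, "hotel", datetime.now(timezone.utc)))
```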
The takeaway
Shipping against an unfinished protocol was uncomfortable. Every change from the spec committee meant a review cycle. Every partner onboarded during the early window was a relationship that required more maintenance than it should have. But we learned things nobody else could have learned at the same time, because nobody else was running production traffic through the protocol at our scale while it was still moving.
If the second wave of MCP adopters wants to beat us to the next inflection, they will not do it by shipping the same thing faster. They will do it by shipping the thing nobody else is shipping yet. For us, that meant treating agents as first-class users from the first day we took their traffic seriously. The infrastructure follows the thesis. If the thesis is right, the infrastructure compounds.
We are a year in. The ecosystem is real. The protocol keeps moving. And we are still early.