Traditional Hosting Is Broken for AI Agents
VMs, containers, serverless — none of it was designed for AI agents. Here's why every existing hosting model fails, and what agent-native infrastructure actually looks like.
The cloud was built for web apps. Stateless HTTP handlers that receive a request, query a database, return HTML or JSON, and forget everything. Thirty years of infrastructure innovation has optimized for this one pattern.
AI agents are not web apps.
They maintain state across interactions. They execute multi-step workflows that run for minutes, not milliseconds. They call external APIs mid-execution and wait for responses. They hold conversation history, tool configurations, and embeddings in memory. They're stateful, long-running, bursty, and idle most of the time.
Every mainstream hosting option was designed for something else. And when you shove an AI agent into infrastructure built for web apps, the result is predictable: you overpay, you over-engineer, and you still end up with a fragile deployment.
VMs: Paying for an Empty Room
The simplest approach — spin up an EC2 instance, run your agent. It works, but the economics are brutal.
A t3.medium costs ~$30/month. Your agent handles maybe 100 requests per day, each taking 30 seconds. That's 50 minutes of actual work out of 1,440 minutes in a day — 3.5% utilization. You're renting a room and leaving it empty more than 96% of the time.
Scale to 10 agents and it's $300/month of mostly idle infrastructure. Scale to 50 and you're spending $1,500/month — real money for what amounts to a collection of sleeping processes.
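The arithmetic above is easy to reproduce. A back-of-envelope sketch (figures taken from the text; the $30/month t3.medium price is approximate):

```python
# Utilization and effective cost of a VM-hosted agent,
# using the example numbers from the text.
requests_per_day = 100
seconds_per_request = 30

minutes_busy = requests_per_day * seconds_per_request / 60   # 50 minutes of real work
utilization = minutes_busy / (24 * 60)                       # fraction of the day busy

monthly_cost = 30.00                                         # ~t3.medium
days_per_month = 30
cost_per_useful_minute = monthly_cost / (minutes_busy * days_per_month)

print(f"busy: {minutes_busy:.0f} min/day, utilization: {utilization:.1%}")
print(f"effective cost: ${cost_per_useful_minute:.3f} per minute of actual work")
# busy: 50 min/day, utilization: 3.5%
# effective cost: $0.020 per minute of actual work
```

Two cents per minute of useful compute, on hardware billed around the clock — that gap is the whole problem.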
And you're responsible for everything: OS patching, security updates, monitoring, restarts on crash, scaling when traffic spikes. The operational overhead alone can consume a full engineer's time.
Containers and Kubernetes: Solving the Wrong Problem
Kubernetes was built for orchestrating microservices — small, stateless, horizontally scalable units. AI agents are the opposite of that.
Running agents on K8s means:
- Over-provisioned clusters — You need enough capacity for peak load, which means paying for resources that sit idle during off-peak
- Mismatched scaling model — K8s scales by adding replicas. Agents are typically single-instance and stateful. Horizontal scaling doesn't map naturally
- Complexity tax — ConfigMaps, Deployments, Services, Ingress, PersistentVolumeClaims, resource requests and limits. All of this to run a Python script that wakes up 100 times a day
- State management hacks — K8s assumes statelessness. Persistent agent state requires either external databases (adding latency and complexity) or StatefulSets (adding operational burden)
You end up fighting the platform instead of working with it. Teams report spending more time on Kubernetes YAML than on agent logic.
Serverless: Close but Fundamentally Wrong
Lambda and Cloud Functions seem like a natural fit — pay per invocation, scale to zero, no infrastructure management. But they were designed for stateless, short-lived functions. AI agents break every assumption:
- Cold starts are devastating — Loading ML libraries, model weights, and vector stores takes 10-30 seconds. Your user is staring at a spinner while PyTorch initializes
- Execution time limits — Lambda caps at 15 minutes. Complex agent workflows — research tasks, multi-step reasoning, code generation — regularly exceed this
- No persistent state — Every invocation starts from scratch. Conversation history? Gone. Tool configurations? Reloaded. Embeddings? Recomputed. You end up bolting on Redis or DynamoDB to fake statefulness
- Memory constraints — Agents with large dependencies and model weights can easily exceed Lambda's 10GB memory limit
- Cost at scale — Serverless pricing is per-millisecond of compute. A single agent workflow running for 60 seconds at 1GB memory costs ~$0.001. Sounds cheap until your agent runs thousands of times per day with large memory footprints
The serverless promise of "pay only for what you use" breaks down when your workload doesn't match the serverless execution model.
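The "bolt on Redis or DynamoDB to fake statefulness" pattern looks something like the sketch below. Everything here is illustrative — `load_state`, `save_state`, and `handle_invocation` are hypothetical names, not a real framework API, and stdlib `sqlite3` stands in for the external store:

```python
import json
import sqlite3

# Because each serverless invocation starts cold, conversation state must be
# round-tripped through an external store on every single call.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agent_state (agent_id TEXT PRIMARY KEY, state TEXT)")

def load_state(agent_id: str) -> dict:
    row = conn.execute(
        "SELECT state FROM agent_state WHERE agent_id = ?", (agent_id,)
    ).fetchone()
    return json.loads(row[0]) if row else {"history": []}

def save_state(agent_id: str, state: dict) -> None:
    conn.execute(
        "INSERT OR REPLACE INTO agent_state VALUES (?, ?)",
        (agent_id, json.dumps(state)),
    )
    conn.commit()

def handle_invocation(agent_id: str, message: str) -> dict:
    state = load_state(agent_id)      # every call pays this read...
    state["history"].append(message)  # placeholder for actual agent logic
    save_state(agent_id, state)       # ...and this write
    return state

handle_invocation("agent-1", "hello")
state = handle_invocation("agent-1", "follow-up")
print(state["history"])  # history survives only because we persisted it ourselves
```

Every invocation now pays a read and a write just to remember who it's talking to — latency and infrastructure the agent logic never asked for.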
PaaS (Heroku, Railway, Render): Better DX, Same Problems
Platform-as-a-Service providers offer great developer experience — git push to deploy, automatic SSL, managed databases. But under the hood, you're still running always-on containers:
- Always-on billing — You pay for the container around the clock. Sleep-capable tiers exist, but their wake times are 10-30 seconds — not viable for agent endpoints that need sub-second response
- No agent primitives — No built-in concept of triggers, webhooks, or invocation-based lifecycle. You build all of this yourself
- Generic compute — The same runtime serves web apps, APIs, background workers, and agents. No optimization for the agent execution pattern
- Scaling limitations — Most PaaS providers scale horizontally by adding instances. For stateful agents, this means sticky sessions and distributed state — problems you shouldn't have to solve
What Agents Actually Need
AI agents have a specific runtime profile that no existing platform addresses:
- Stateful sleep/wake — Suspend the full agent state to cold storage when idle, restore it in under 2 seconds when triggered. Not a cold start from scratch — a warm restore with memory, context, and tool state intact
- Event-driven activation — Agents should wake in response to webhooks, cron schedules, API calls, or message events. Not sit running in a loop waiting for input
- Flexible execution time — Some agent tasks take 5 seconds, others take 5 minutes. The platform should accommodate both without artificial time limits
- Isolated, secure runtimes — Each agent runs in its own container with scoped credentials, network policies, and resource limits. Compromising one agent doesn't compromise others
- Per-invocation billing — Pay for the time your agent is actually processing, not the time it's waiting for work
- Built-in observability — Structured logs for every tool call, API request, and decision. Token usage tracking. Invocation history. Without bolting on a separate monitoring stack
Agent-Native Infrastructure
This is the problem Maritime solves. Instead of adapting web infrastructure for agents, we built infrastructure specifically for the agent execution pattern.
When your agent is idle, it's checkpointed — full memory state serialized and stored. When a trigger fires (webhook, cron, API call), the checkpoint is restored in under 2 seconds. Your agent picks up exactly where it left off, processes the request, and goes back to sleep.
No Kubernetes. No cold starts. No execution time limits. No always-on billing.
```shell
maritime deploy ./my-agent
```

Your agent gets a dedicated endpoint, encrypted secrets management, structured logging, and trigger configuration. The platform handles the lifecycle. You write agent logic.
The Infrastructure Shift
Every major computing paradigm has eventually gotten infrastructure built specifically for it. Web apps got Heroku. Microservices got Kubernetes. Serverless functions got Lambda.
AI agents are the next paradigm shift, and they deserve infrastructure that understands their execution pattern — not a repurposed web server with a longer timeout.
The teams building production agents today are spending 40-60% of their time on infrastructure. That's not a DevOps problem. It's an infrastructure problem. And it's solvable.