Architecture

A 62-hour stress test of sleep/wake

One agent, one Firecracker host, 63 cron-driven wakes over 2.6 days. Zero errors. Sub-second restores from start to finish.

Maritime Team · May 24, 2026 · 5 min read

Short benchmarks tell you a system can do something once. They don't tell you what happens on the 30th cycle, or the 60th, with the host under real load and the snapshot file aging on disk. So we let one agent run unattended for 62 hours and read the logs after.

What sleep means on Firecracker

When a Maritime agent on Firecracker goes idle, three things happen:

1. Firecracker serializes the live VM (guest RAM, vCPU registers, virtio device state) into a snapshot file on local NVMe.
2. The Firecracker process exits. The kernel reclaims the memory. The host has no allocation for this agent anymore.
3. The snapshot file sits on disk. A few hundred megabytes.

When the next trigger fires, the reverse: read the snapshot, spawn a fresh Firecracker process, restore. The guest resumes at the instruction it was paused on. Open sockets stay open. Page tables stay valid. The agent does not know it stopped existing for an hour.
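
The sleep and wake steps map onto Firecracker's public snapshot API, which is served over the API Unix socket of each Firecracker process. The sketch below only builds the request sequence; it does not send anything. The directory and file names are hypothetical, and the post does not show Maritime's own orchestration code, so treat this as an illustration of the mechanism rather than the platform's implementation.

```python
import json
from pathlib import Path

# Hypothetical location; the post only says the snapshot lives on local NVMe.
SNAP_DIR = Path("/var/lib/agents/agent-1")


def sleep_plan(snap_dir: Path = SNAP_DIR):
    """Requests that pause a running microVM and write a full snapshot."""
    return [
        ("PATCH", "/vm", {"state": "Paused"}),
        ("PUT", "/snapshot/create", {
            "snapshot_type": "Full",
            "snapshot_path": str(snap_dir / "vmstate.snap"),  # vCPU and device state
            "mem_file_path": str(snap_dir / "memory.mem"),    # guest RAM
        }),
    ]


def wake_plan(snap_dir: Path = SNAP_DIR):
    """Request sent to a freshly spawned Firecracker process to restore."""
    return [
        ("PUT", "/snapshot/load", {
            "snapshot_path": str(snap_dir / "vmstate.snap"),
            "mem_backend": {
                "backend_type": "File",
                "backend_path": str(snap_dir / "memory.mem"),
            },
            "resume_vm": True,  # resume the guest as soon as the load completes
        }),
    ]


def render(plan):
    """Format a plan as one 'METHOD path json-body' line per request."""
    return [f"{method} {path} {json.dumps(body)}" for method, path, body in plan]


for line in render(sleep_plan()) + render(wake_plan()):
    print(line)
```

Between the two plans the orchestrator would terminate the old Firecracker process; the wake plan always targets a new one, which is why nothing stays allocated on the host in between.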

This is materially different from how most platforms talk about sleep. A stopped container is not sleep. It still holds an entry in the host's process table, still occupies disk for its image and writable layer, still counts against scheduling decisions. A snapshot is a file. Between cycles there is no agent, only a memfile waiting to be mapped back in.

The run

One agent on a Firecracker host in Germany. Twenty-four cron triggers, one for each hour of the day, each carrying a different short prompt. We deployed it on a Friday morning, walked away, and pulled the data 62 hours later.
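
One way to read that schedule is as 24 standard five-field cron expressions, one per hour of the day, each paired with its own prompt. The sketch below generates such a set. The prompts are placeholders and firing at minute 0 is an assumption; the post gives neither.

```python
# Placeholder prompts; the post does not list the 24 it used.
PROMPTS = [f"prompt for hour {hour:02d}" for hour in range(24)]


def hourly_triggers(prompts, minute=0):
    """Pair each prompt with a cron expression for one hour of the day."""
    if len(prompts) != 24:
        raise ValueError("expected exactly 24 prompts, one per hour")
    # Fields: minute hour day-of-month month day-of-week
    return [(f"{minute} {hour} * * *", prompt) for hour, prompt in enumerate(prompts)]


triggers = hourly_triggers(PROMPTS)
for expression, prompt in triggers[:3]:
    print(expression, "->", prompt)
```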

No restarts, no nudges, no manual sleeps. The full lifecycle was captured by the platform's normal event logging.

What the data says

Metric	Value
Wall-clock span	62.4 hours (2.6 days)
Cron-driven wakes	63
Wakes that completed cleanly	63 / 63
Errors, retries, recoveries	0
Total active compute	60.9 min
Agent running	~1.6% of wall clock
Agent on disk only	~98.4%
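
The percentages in the table follow directly from the first and fifth rows. A short check, using only figures from the table, also gives the average active time per wake:

```python
WALL_CLOCK_HOURS = 62.4
ACTIVE_MINUTES = 60.9
WAKES = 63


def duty_cycle(active_minutes, wall_clock_hours):
    """Fraction of wall-clock time the agent was actually running."""
    return active_minutes / (wall_clock_hours * 60)


running = duty_cycle(ACTIVE_MINUTES, WALL_CLOCK_HOURS)
per_wake_seconds = ACTIVE_MINUTES * 60 / WAKES

print(f"running: {running:.1%}")                      # running: 1.6%
print(f"on disk: {1 - running:.1%}")                  # on disk: 98.4%
print(f"active per wake: {per_wake_seconds:.0f} s")   # active per wake: 58 s
```

So each wake did a little under a minute of work on average before the agent went back to disk.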

From one wake to the next, the agent's restores were indistinguishable. The 63rd restore on day three took 667 ms; the first on day one took 665 ms. No drift, no warmup, no aging: the snapshot is a deterministic input and the restore path doesn't accumulate state.
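
A simple way to quantify "no drift" is to fit a least-squares line to restore latency against wake index and look at the slope. The function below is a minimal sketch of that check. The sample values are illustrative only and are not the measurements from this run; the post does not publish the per-wake data.

```python
def drift_ms_per_wake(latencies_ms):
    """Least-squares slope of restore latency against wake index (ms per wake)."""
    n = len(latencies_ms)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean_x = (n - 1) / 2
    mean_y = sum(latencies_ms) / n
    covariance = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(latencies_ms))
    variance = sum((i - mean_x) ** 2 for i in range(n))
    return covariance / variance


# Illustrative values only, NOT data from the run.
sample = [665, 690, 640, 700, 655, 680, 667]
print(f"slope: {drift_ms_per_wake(sample):+.2f} ms per wake")
```

A slope close to zero relative to the spread of the data is what a restore path that doesn't accumulate state should produce.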

Across all 63 wakes:

Median restore: 674 ms
Mean: 675 ms
Min / max (p0 / p100): 570 ms / 736 ms
Stdev: ~35 ms

Figure: restore latency across all 63 wakes, the full observed range.

The floor is set by NVMe read latency plus the first-read page-cache miss on the snapshot file; the ceiling is whatever scheduling jitter the host happens to have at restore time. Both are tight. There is no long tail.
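
For anyone repeating the measurement, the summary figures above can be computed from a list of per-wake restore times with the standard library. This is a sketch with made-up input; the post does not say whether its standard deviation is the sample or population form, so the sample form used here is an assumption.

```python
import statistics


def summarize(latencies_ms):
    """Median, mean, min, max and sample standard deviation of restore times."""
    return {
        "median": statistics.median(latencies_ms),
        "mean": statistics.mean(latencies_ms),
        "min": min(latencies_ms),
        "max": max(latencies_ms),
        "stdev": statistics.stdev(latencies_ms),  # sample (n - 1) form, an assumption
    }


# Illustrative values only, NOT data from the run.
example = [665, 690, 640, 700, 655, 680, 667]
for name, value in summarize(example).items():
    print(f"{name}: {value:.0f} ms")
```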

What this means for cost

Maritime's smart tier is a flat $1 a month per agent, with 1,000 included invocations. This experiment used 63 of them. The same agent, doing the same work, on conventional always-on infrastructure looks very different:

Model	Billed for	~30-day cost
EC2 t3.medium, on 24/7	Every hour the VM exists	~$30
Container on K8s, 1 replica	Every hour the pod is scheduled	~$40
AWS Lambda	Per-invocation compute + cold starts	Not viable (15-min cap, no in-memory state)
Maritime smart tier	Flat subscription, snapshot/restore underneath	$1

A 30-40x reduction over always-on, for the same agent doing the same work, because you stop paying for the 98% of the time the agent has nothing to do.
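
The arithmetic behind that range, using only the figures in the table, plus a check that an hourly schedule stays inside the 1,000 included invocations when the observed wake rate is extrapolated to 30 days:

```python
MARITIME_MONTHLY_USD = 1.0
INCLUDED_INVOCATIONS = 1000
# Approximate 30-day costs as quoted in the table above.
ALWAYS_ON_MONTHLY_USD = {"EC2 t3.medium": 30.0, "K8s, 1 replica": 40.0}


def reduction_factor(always_on_usd, flat_usd=MARITIME_MONTHLY_USD):
    """How many times cheaper the flat tier is than an always-on option."""
    return always_on_usd / flat_usd


def wakes_per_30_days(wakes, hours):
    """Extrapolate an observed wake rate to a 30-day month."""
    return wakes / hours * 30 * 24


for name, usd in ALWAYS_ON_MONTHLY_USD.items():
    print(f"{name}: {reduction_factor(usd):.0f}x")

projected = wakes_per_30_days(63, 62.4)
print(f"projected wakes in 30 days: {projected:.0f} of {INCLUDED_INVOCATIONS}")
```

At the rate seen in this run, a full month comes to roughly 727 wakes, which still fits within the included invocations.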

Why the experiment mattered

It's one thing to prove snapshot/restore works in a five-minute benchmark. It's another to run it for 63 consecutive cycles without touching it. The seams that fail under continuous, hands-off operation are not the ones short tests catch: scheduler drift, snapshot file fragmentation, log back-pressure, the kind of bugs that only emerge somewhere around the 30th cycle and then never go away.

We ran the experiment to find those seams. We didn't find any.

Sixty-three wakes, no drift, no errors, no manual touch. The snapshot path stayed boring for two and a half days, which is exactly what you want from infrastructure that has to keep its promises for a long time.

If you're running agents that sit idle most of the time and paying for them as if they don't, give Maritime a try.