

Amazon put a filesystem on S3; I showed up with a test suite and bad intentions

(2026/04/09)


I've spent over a decade telling anyone who'd listen that S3 is not a filesystem, which in retrospect was a really weird way to start some conversations. So when AWS [1]launched S3 Files on Tuesday – which lets you mount an S3 bucket as an NFS share – I did what any reasonable person would do: I spun up an EC2 instance and started trying to break it.

I had about four hours before getting on the phone with Andy Warfield, the VP/Distinguished Engineer/preternaturally patient man who leads S3's engineering whether he admits it or not, and a subset of the S3 team. I wanted to show up with data, not opinions. Opinions are cheap. Opinions about storage are dangerous. My opinions about storage are hilarious.

The good news: the core product is solid. I threw ten deliberate conflicts at it – writing to the same key from the NFS mount and the S3 API simultaneously – and S3 won every single one, converging in under two seconds with zero split-brain states. For anyone who's had the misfortune of relying on community FUSE drivers like s3fs-fuse or goofys, where "conflict resolution" historically meant "data corruption or a shrug," this is genuinely good engineering.
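If you want to reproduce the convergence check yourself, the shape of the test is simple: hammer the same key from both sides, then poll both views until they agree. The polling harness below is my sketch (the callables, timeout, and interval are mine); the under-two-seconds convergence is what I measured.

```python
import time


def converged(read_nfs, read_s3, timeout=5.0, interval=0.25):
    """Poll two read callables for the same key -- e.g. one reading the
    file off the NFS mount, one doing an S3 GET -- until they return
    identical bytes.

    Returns the seconds it took them to agree, or None on timeout.
    """
    start = time.monotonic()
    while True:
        if read_nfs() == read_s3():
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(interval)
```

Wire the two callables up to your mount and to boto3 respectively, fire conflicting writes, and time the result.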


It's built on EFS infrastructure, charges the same rates ($0.30/GB storage, $0.03/GB reads, $0.06/GB writes), and the pricing match is deliberate. "It would be unusual to have more favorable economics on one versus the other," the team told me. The trick is that you only pay those rates on the small, hot fraction of data that actually lands on the filesystem. Everything else stays in S3 at $0.023/GB. Mount a petabyte bucket, actively use a terabyte of it, pay accordingly.
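For the math-averse, the split works out like this. A toy calculation using the rates quoted above, ignoring the per-GB read/write charges for simplicity – not AWS's actual bill:

```python
S3_STANDARD = 0.023   # $/GB-month, S3 Standard
FILES_STORAGE = 0.30  # $/GB-month, the S3 Files (and EFS) rate cited above


def monthly_storage_cost(total_gb, hot_gb):
    """Storage cost under the split described above: only the hot
    working set pays filesystem rates; everything else stays in S3.
    """
    return hot_gb * FILES_STORAGE + (total_gb - hot_gb) * S3_STANDARD


# A petabyte bucket (1,048,576 GB) with a 1 TB (1,024 GB) hot set:
split = monthly_storage_cost(1_048_576, 1_024)          # ~$24,401/month
all_files = monthly_storage_cost(1_048_576, 1_048_576)  # ~$314,573/month
```

The difference between those two numbers is the whole pitch.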


The team spent months trying to make the boundary between files and objects invisible before realizing the boundary itself was the right design. "We were building to the lowest common denominator across the two," one of the engineers told me. Filesystem clients mutating objects every 10 milliseconds? That's normal for NFS. It's terrifying for an S3 bucket. So they kept the two worlds separate with automatic syncing between them. S3 stays the authoritative data store. The filesystem is a view, not a copy.

Three speeds, one product

I measured three very different sync speeds, which tells you something about the architecture. Writes from the filesystem aggregate over a fixed 60-second window before committing to S3 as single PUTs. New files created through the S3 API appear on the NFS mount in about 30 seconds. Updates to files the filesystem already knows about propagate in 1.8 seconds – over 16 times faster than new file creation.

When I walked the team through these numbers, they confirmed the 60-second window is fixed today, but left the door open to the possibility of this becoming adaptive (if this matters to you, harangue your AWS account team about it). The 30-second figure is just S3 event propagation delay. The 1.8-second update speed is the filesystem invalidating an inode it already has cached, which is a much faster path.

Reads above 128 KB (by default; you can configure this as low as 0. Not that you should. But you definitely could...) bypass the filesystem entirely and stream from S3 for free – no S3 Files charge at all, in a move suspiciously reminiscent of the customer-obsessed Amazonian heyday. The bypass does parallel GETs at about 3 GB/s per client today.
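The routing rule itself is one comparison. A toy model of the behavior described above – the function name and shape are mine, not an AWS API:

```python
DEFAULT_BYPASS_BYTES = 128 * 1024  # the default threshold; tunable down to 0


def read_route(request_bytes, threshold=DEFAULT_BYPASS_BYTES):
    """Toy model of the read routing: requests above the threshold
    stream straight from S3 with no S3 Files read charge; smaller
    reads are served through the filesystem (and billed as such).
    """
    return "s3_bypass" if request_bytes > threshold else "filesystem"
```

Set the threshold to 0 and every read bypasses the filesystem, which is presumably why you definitely could but probably shouldn't.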


Then I started getting "creative."

I created ten S3 objects with edge-case key names. Trailing slashes. Double slashes. Path traversal patterns. 256-character path components. Keys named just "." and "..". Emoji. The EICAR string, because why the hell not. I then mounted the bucket and ran ls.

Six of them had vanished. No error on the client, no log entry. They're still in S3 – you just can't see them from the filesystem.
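AWS hasn't published the exact import rules, but the classic POSIX path constraints are a reasonable guess at the kind of predicate in play – any NFS view has to respect them. This checker is my hedged reconstruction, not AWS's actual logic:

```python
NAME_MAX = 255  # typical POSIX limit on a single filename, in bytes


def posix_representable(key: str) -> bool:
    """Rough check: can this S3 key appear as a file path in a POSIX
    view? NOT S3 Files' actual import rule -- just the constraints any
    filesystem interface must enforce.
    """
    if not key or key.endswith("/"):
        return False  # the final path component would be empty
    for component in key.split("/"):
        if component in ("", ".", ".."):
            return False  # empty, current-dir, or parent-dir component
        if len(component.encode("utf-8")) > NAME_MAX:
            return False  # exceeds the usual filename length limit
    return True
```

Run your bucket's key listing through something like this before mounting and you'll at least know which objects are about to go missing. Emoji, for what it's worth, are fine; it's the structural offenders that vanish.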


I was initially mistaken about it, but it turns out that a CloudWatch metric does exist for this: ImportFailures in the AWS/S3/Files namespace, dimensioned by FileSystemId. It fired correctly for all of my incompatible keys. But there's no client-side indication whatsoever – no error from ls, no log on the instance, nothing in the NFS response. You have to know to go looking for a specific CloudWatch metric in a namespace you've never heard of, for a service that just launched. Better instrumentation is on the roadmap, including CloudWatch logs pointing to the exact objects that weren't imported. For now, if you mount a bucket that's accumulated "creative" key names over the years, some of your objects will be invisible and your only signal is a counter in CloudWatch that nobody will think to check.
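Since nobody will think to check it, here's how to check it. The namespace, metric name, and dimension come from the paragraph above; the period and statistic choices are mine. Pass the result to boto3's `cloudwatch.get_metric_statistics(**query)`:

```python
from datetime import datetime, timedelta, timezone


def import_failure_query(filesystem_id, hours=1):
    """Build the CloudWatch query for S3 Files import failures over the
    last N hours. Feed the returned dict to
    boto3.client("cloudwatch").get_metric_statistics(**query).
    """
    now = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/S3/Files",
        "MetricName": "ImportFailures",
        "Dimensions": [{"Name": "FileSystemId", "Value": filesystem_id}],
        "StartTime": now - timedelta(hours=hours),
        "EndTime": now,
        "Period": 300,          # 5-minute buckets
        "Statistics": ["Sum"],  # total failed imports per bucket
    }
```

A nonzero Sum means some of your objects silently didn't make the trip.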


Delete propagation produced a genuinely weird result: files deleted via S3 remained readable on the NFS mount for either 6 seconds or 18 seconds, with nothing in between – a perfectly clean bimodal distribution. "That's actually interesting and I wouldn't have expected that," Warfield said. The team suspects an S3 internal delete notification artifact. Practically, this means you can read valid, complete content from a deleted file for up to 18 seconds. Not great, not catastrophic, but definitely worth knowing about.
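The measurement is just a stopwatch and a loop. This is my harness (path, timeout, and poll interval are mine); run it against a file on the mount right after deleting the backing object via the API, collect a few dozen samples, and the two modes fall out:

```python
import os
import time


def seconds_until_gone(path, timeout=30.0, interval=0.2):
    """Poll a path on the NFS mount after deleting the backing S3
    object; return seconds until the file stops being visible, or
    None if it's still there at timeout.
    """
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        if not os.path.exists(path):
            return time.monotonic() - start
        time.sleep(interval)
    return None
```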

There's a sharp edge when S3 wins a conflict on a file you're accessing through an access point. Because I am who I am, I went blundering into this full speed. S3 objects created via the API don't carry POSIX ownership metadata, so the imported file defaults to root:root with mode 0644. If your access point enforces a different UID, you can read the file but can no longer write to it – the permissions don't match. "That's actually not the behavior that I would expect," one of the engineers told me. Neither system is wrong individually; it's the combination that bites. The team is looking at it.
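The mechanics of the bite are textbook POSIX, which is exactly why neither system is wrong. A simplified permission check (ignoring the root override, supplementary groups, and ACLs) shows why root:root 0644 plus a pinned access-point UID means read-yes, write-no:

```python
def may_write(file_uid, file_gid, mode, uid, gid):
    """Simplified POSIX write check: owner, then group, then other."""
    if uid == file_uid:
        return bool(mode & 0o200)  # owner write bit
    if gid == file_gid:
        return bool(mode & 0o020)  # group write bit
    return bool(mode & 0o002)      # other write bit
```

An imported object lands as uid=0, gid=0, mode 0644; an access point enforcing UID/GID 1001 falls through to the "other" bits, which grant read but not write. Hence the one-way mirror.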

Separately, the docs say conflicting filesystem versions go to a lost+found directory. They do – it's called .s3files-lost+found-<filesystem-id> and it lives at the real filesystem root. If you mount through an access point scoped to a subdirectory, you can't see it. That's how access points work: they restrict your view, but it means your conflict artifacts are invisible from the same mount that created them. The team agreed the docs need to call this out, which may be done by the time you read this.

Mountpoint for S3 – AWS's open source FUSE driver – isn't dead; it's being positioned as a different tool, for a different audience. Mountpoint is for large-file throughput workloads where unsupported operations fail fast by design. S3 Files is for everything that wants a real NFS API. The read bypass technology actually came from lessons learned building Mountpoint, which is a nice bit of engineering lineage.

Ed Naim, AWS's GM of File and Object Storage Services, sketched a more interesting vision than the launch. He sees S3 Files evolving into ephemeral filesystem views for data pipelines – spin up a file view of your S3 data for the duration of a task, do your work, sync specific changes back, tear it down. API-driven sync control instead of the current automatic 60-second push is on the roadmap. That's a meaningfully different product from "mount your bucket as a NAS," and as an old-school sysadmin I'm unreasonably angry about what I see as a new and scary use case.

S3 now does objects, files, tables, vectors, and high-performance computing. I asked what comes next. I ignored everything they said in response, and just wrote down "Database" in my notes, surrounded by doodled hearts. Everything is a database if you hold it wrong.

I asked if I needed to stop saying S3 is not a filesystem. "No," he said. "S3 is not a filesystem. But S3 Files gives you a file interface on top of it."

I've been saying "S3 is not a filesystem" for over a decade. Turns out I was right the whole time. AWS just decided to stop fighting it and put a real one in front. ®




[1] https://aws.amazon.com/blogs/aws/launching-s3-files-making-s3-buckets-accessible-as-file-systems/






S3 is not a filesystem

Philip Storry

But enough people were trying to use it as one that we stopped trying to hold back the tide.

Come on in, the water's... freezing. It's all those Glaciers nearby...

Re: S3 is not a filesystem

khjohansen

Glaciers?? I just get the faint smell of boiling frog!

In a nutshell

VoiceOfTruth

>> The filesystem is a view, not a copy.

I get that 100%.

Everything is a database if you hold it wrong.

Aladdin Sane

He's out of line, but he's right.

I understood more of that than I expected

Martin an gof

Well written, enjoyable article about a subject I have absolutely no experience of.

I mean, I'm still wondering if "object" storage would be a better way to store the family archive, and I still don't know, let alone have a clue how I could implement it, but nevertheless, find this stuff fascinating.

Keep up the good work.

M.

I always wanted to build this for SMB3.

Jeremy Allison

Obvious next step IMHO if they have the semantics worked out.

Makes it a great competitor for Azure.

A new koan:
If you have some ice cream, I will give it to you.
If you have no ice cream, I will take it away from you.
It is an ice cream koan.