Fuse is 95% cheaper and 10x faster than NFS

47 points by agcat 2 days ago

positisop 2 days ago

Please do not make decisions based on this article. It is a poorly written blog with typos and a lack of technical depth. The blog puts Goofys in the same bucket as JuiceFS and Alluxio. A local NVMe populated via a high-throughput Object Store will give you the best performance. This blog does not go into the system architecture involved that prohibits static models from being pre-populated or the variations in the "FUSE" choices. I can see why AI startups need large amounts of money when the depth of engineering is this shallow.

ChocolateGod 2 days ago

I feel like the author of the article doesn't actually know what FUSE is and that article is AI generated as the comparison tables smell of LLM hallucination.

If you don't care about acceptable latency, metadata operations, indexing, finding files without already knowing the full path, proper synchronisation between clients, then sure mounting S3 over FUSE is nice, heck I even use it myself, but it's not a replacement for NFS.

You could use S3 object storage with something like JuiceFS/SeaweedFS to make metadata operations acceptably fast (in case of Redis backed JuiceFS, lightning fast), but you're no longer just using object storage and now have a critical database in your infrastructure to maintain.

> Speed: Matching NVMe performance (5-10 GB/s) through kernel bypass and parallelization.

Say wha? Not sure how a userland application is supposed to 1) create a tcp connection to connect to s3 or 2) respond to fopen without going through the kernel.

They're in for a shock when they find out you can do NFS via FUSE too.

SahAssar 2 days ago

(Posting while the title is "Fuse is 95% cheaper and 10x faster than NFS", I'm guessing that will get changed based on the HN rules)

This is not at all about NFS vs FUSE, this is about specific NFS providers vs specific FUSE with some specific object store backends.

FUSE is just a way to have a filesystem not implemented in the kernel. I can have a FUSE driver that implements storage based on rat trained to push a button in reaction to lights turning on, or basically anything else.

NFS is a specific networked filesystem.

matrss 2 days ago

> NFS is a specific networked filesystem.
NFS is a set of protocols for networked filesystems. You can just as well implement an NFS server that "implements storage based on rat trained to push a button in reaction to lights turning on". Some people even argue it is a better way to do it than FUSE, because you get robust clients on most platforms with included caching out of the box. E.g. this is a library for building such a NFS server: https://github.com/xetdata/nfsserve
eklitzke 2 days ago

NFS can be super fast. In a past life I had to work a lot with a large distributed system of NetApp Filers (hundreds of filers located around the globe) and they have a lot of fancy logic for doing locale-aware caching and clustering.
That said, all of the open source NFS implementations are either missing this stuff or you'd have to implement it yourself which would be a lot of work. NetApp Filers are crazy expensive and really annoying to administer. I'm not really surprised that the cloud NFS solutions are all expensive and slow because truly *needing* NFS is a very niche thing (like do you really need `flock(2)` to work in a distributed way).
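
For anyone unsure what `flock(2)` does for an application, here is a minimal Python sketch of taking an exclusive advisory lock. The helper names `try_lock` and `unlock` are made up for illustration. Whether two different hosts actually see each other's lock depends on the filesystem behind the path, which is the "distributed" part being questioned above.

```python
import fcntl
import os


def try_lock(path):
    """Try to take an exclusive, non-blocking flock(2) on path.

    Returns the open file descriptor on success, or None if another
    open file description already holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Someone else holds the lock; give up instead of waiting.
        os.close(fd)
        return None
    return fd


def unlock(fd):
    """Release the lock and close the descriptor."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```

If the workload never calls something like this across machines, it probably does not need the locking semantics that a shared filesystem provides.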
- throw0101c 2 days ago
  
  > NFS can be super fast
  Modern day NFS also has RDMA transports available with some vendors. Plus perhaps have it over IB for extra speed.
  - eklitzke 2 days ago
    
    Yeah if you were really trying to make things fast you'd have the compute and NFS server in the same rack connected this way. But you aren't going to get this from any cloud providers.
    For read-only data (the original article is about serving model weights) you can also use iSCSI. This is how packages/binaries are served to nearly all Borg hosts at Google (most Borg hosts don't have any local disk whatsoever, when they need to run a given binary they mount the software image using iSCSI and then I believe mlock nearly all of the ELF sections).

looperhacks 2 days ago

This article is a random collection of claims without sources or even explanations how the author came to the conclusions.

- NFS has the "pro" of being POSIX compliant, but I can't see how a FUSE device is different in this regard
- FUSE allegedly supports local caching and lazy loading, but why can't I cache or lazy load with an NFS share?
- NFS apparently has high infrastructure costs, but FUSE comes for free? Then the author compares cloud offerings, which should make the infrastructure concerns moot?
- The cost calculations don't even mention which provider is used (though you can guess which one) and seemingly don't include transfer costs

There's even more I can't be bothered to mention. Stay away from this post

positisop 2 days ago

NFS is its own spec, that is somewhat compliant with POSIX, and arguably FUSE is POSIX and can be used to implement a POSIX-compliant filesystem.

Spivak 2 days ago

I'm no NFS stan but lordy the comparison table is a hit piece. NFS isn't that bad to administer, there are managed NFS services on every major cloud provider, and for on-prem every RHCE ought to know how to set up and deploy a many-reader multi-writer replicated cluster.

dekhn 2 days ago

This article is garbage on so many levels it's actually impressive.

tomasGiden 2 days ago

I did some benchmarking on BlobFuse2 vs NFS vs azcopy on Azure for a CT imaging reconstruction a year back or so. As I remember it, it was not clear if FUSE (copy on demand) or azcopy (copy all necessary data before starting the workload) was the winner. The use case and specific application access pattern really mattered A LOT:

* Reading full files favored azcopy (even if reading parts just when they were needed).
* If the application closed and opened each file multiple times, it favored azcopy.
* If only a small part of many files was read, it favored FUSE.

Also, the 3rd party library we were calling to do the reconstruction had a limit on the number of threads reading in parallel when preloading projection image data (optimized for what was reasonable on local storage) so that favored azcopy.

Don’t remember that NFS ever came out ahead.

So, benchmark, benchmark, benchmark and see what possibilities you have in adapting the preloading behavior before choosing.
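
A rough Python sketch of the kind of access-pattern benchmark described above: time a full sequential read versus a small partial read over the same set of files. The function names, chunk size, and byte count are illustrative assumptions, not what was used in the original measurements.

```python
import os
import time


def list_files(root):
    """Collect regular files under root in a stable order."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.append(os.path.join(dirpath, name))
    return sorted(found)


def time_full_read(paths, chunk_size=1 << 20):
    """Read every file front to back; return (seconds, bytes_read)."""
    start = time.perf_counter()
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
    return time.perf_counter() - start, total


def time_partial_read(paths, nbytes=4096):
    """Read only the first nbytes of every file; return (seconds, bytes_read)."""
    start = time.perf_counter()
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            total += len(f.read(nbytes))
    return time.perf_counter() - start, total
```

Point `list_files` at each candidate (a FUSE mount, an NFS mount, a local directory filled by a bulk copy) and compare the timings. Page cache and any client-side cache will skew repeat runs, so treat the first cold run as the meaningful one.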

nickaggarwal 2 days ago

With FUSE you can make it transparent to the application: it just exposes the mount with all the files. When your application reads them, they are pulled from object storage, while azcopy is a utility that copies them to your disk.
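
To make the "pulled on read" idea concrete, here is an application-level Python sketch of a read-through cache. It is not how any particular FUSE driver is implemented; the class name `ReadThroughCache` and the caller-supplied `fetch` function (standing in for an object-store download) are assumptions for illustration.

```python
import os


class ReadThroughCache:
    """Fetch an object on first read, then serve it from local disk.

    fetch is a caller-supplied function (hypothetical here) that takes
    an object key and returns its bytes.
    """

    def __init__(self, cache_dir, fetch):
        self.cache_dir = cache_dir
        self.fetch = fetch
        os.makedirs(cache_dir, exist_ok=True)

    def _local_path(self, key):
        # Simplistic flattening of the key into one file name;
        # a real cache would need a collision-free mapping.
        return os.path.join(self.cache_dir, key.replace("/", "__"))

    def read(self, key):
        path = self._local_path(key)
        if not os.path.exists(path):
            data = self.fetch(key)
            tmp = path + ".tmp"
            with open(tmp, "wb") as f:
                f.write(data)
            # Rename into place so readers never see a partial file.
            os.replace(tmp, path)
        with open(path, "rb") as f:
            return f.read()
```

The first `read` of a key pays the remote fetch; later reads hit local disk. A bulk copy tool does the same transfer up front for every file instead.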

c0l0 2 days ago

I've been in this business for a while now, and I continue to be surprised by the extent of how cloud customers are being milked by cloud platform providers. And, of course, their seemingly limitless tolerance for it.

Spooky23 2 days ago

It is amazing. I just left a discussion where the protagonist is moving a legacy workload to a hyperscaler to avoid some software licensing costs. Re-implemented with cloud in mind, it would probably cost $10-15k/year to run. As it stands as a lift and shift, likely something like $250k. The total value of the software licensing is <$30k.
Math isn't mathing, but the salesperson implanted the idea. lol
nickaggarwal 2 days ago

I agree, if you go with the wrong solutions, it can inflate the costs

GauntletWizard 2 days ago

Anyone who knows filesystems would have said "No Duh". Caching on NVMe will always be significantly faster than remote, simply because of network latency hops - Even at microseconds per! There's really not a huge difference between modern PCIe architecture and modern networking - but the length of the cables matters a lot at these latencies.

All that said - There's still a ton of room for NFS to be the backing store, but more importantly there's room for distributed filesystems with intelligent caching to displace all of this.

dmoy 2 days ago

Would be interested to see a comparison with other not-NFS things (Lustre, daos, etc).

User space filesystem is not the first thing that comes to my mind when trying to get faster performance than NFS

fh973 2 days ago

At Quobyte (https://www.quobyte.com) we use FUSE for the client for parallel file system access.
You can get dozens of GB/s out of FUSE nowadays. This will even improve in the near future as FUSE is adding io_uring support for communication with the kernel (instead of a pipe).
- dmoy 2 days ago
  
  Yea I don't doubt that FUSE can go past NFS
  But, as sibling is pointing out, e.g. Amazon FSx for Lustre throughput is like 1GB/s per TB of storage, so presumably hundreds of GB/s at scale.
nickaggarwal 2 days ago

Yes, there is AWS FSx with Lustre in the blog. That might be worth checking out.
agcat 2 days ago

This is a great idea
- bayindirh 2 days ago
  
  You could also try JuiceFS & Weka (if you can have access to a cluster).
  A well configured and distributed Lustre will be very fast, BTW.
  JuiceFS: https://juicefs.com/en/
  Weka: https://www.weka.io/
  - nickaggarwal 2 days ago
    
    Will test out Weka, Thanks for sharing

Mave83 2 days ago

For AI, you want DAOS storage. It runs in userspace, you can use FUSE and it's the fastest storage on the planet when it comes to bandwidth (see io500). There are good companies supporting it like croit, and with their software it's easy to manage as well.

MertsA 2 days ago

While DAOS looks cool, from their roadmap it looks like they still don't have a fault domain larger than a server... Their erasure coding profiles also look pretty thin. I'm ex-Meta, our infra had vastly different availability and reliability requirements but that looks like it'd be painful to support at scale.

krupan 2 days ago

I'm confused, is this FUSE as in Filesystem in User space?

sudobash1 2 days ago

The title is confusing since FUSE is not a network filesystem. It can be used as a "frontend" for multiple different network filesystems (as used in sshfs and smbfuse). There is even a fuse-nfs project to allow you to run a NFS client using fuse.
But if you scroll down, the article lists a few specific network filesystems using FUSE that were tested (JuiceFS, goofys, etc...).
I don't follow all of the reasoning, but I am not surprised at the conclusion. The newer FUSE-based network filesystems are built for modern cloud purposes, so they are more specific for the task.
agcat 2 days ago

Yes

gjvc 2 days ago

horseshit

pstuart 2 days ago

Is there really a need for a filesystem? Just pull from a bucket and it's done. Push updates to the bucket and...it's done.

I see the need for "sharing" in giving access to the data, but not to have it represented on the filesystem (other than giving the illusion of local dev)

jacobsenscott 2 days ago

What if all your code is already written to use a filesystem, and you want to change the backing store from nfs to object store? Or what if you want to abstract away the specific blob store?
- nickaggarwal 2 days ago
  
  This was our exact use case: changing the backing store from NFS to an object store. You can use cloud-specific mount providers and use a thin client in the middle.
bayindirh 2 days ago

When you want to feed your GPUs with what they need (model, weights, data), those kinds of hoops slow you down exponentially. You need to be able to stream data to pinned memory, or to the GPU directly (hence GPUDirect) to keep your cards saturated.
This is why systems like Weka exist, and why Lustre is still being developed and polished. These systems reach tremendous speeds. This is not an exaggeration when you connect terabits of bandwidth to these storage systems.
- topspin 2 days ago
  
  I haven't had to feed a building full of GPUs, but I do wonder: does Ceph ever appear on your radar for this sort of work? And if not, why? Seems like it would fit rather well. Yet it never seems to get mentioned.
nickaggarwal 2 days ago

It helps if you want to decouple model loading and distribution. If you do it in the application, pulling from the bucket at the moment the application needs the data can be slow.