Hacker News

One of the main complaints about NFS, one of the original distributed filesystems, is that client machines hang when the server is unavailable. The problem is that the (Unix) filesystem layer assumes disks are reliable (spoiler alert: they're not), and NFS stretches disk access across a network.

The concept of a "soft mount" with a timeout was introduced to NFS, but it's almost never recommended, because client programs have no idea how to handle a timeout from the filesystem. This article shows how every HTTP client has to be configured to handle failures. Imagine if every program that accesses a file, from /bin/cat on up, had to carry error-handling code for timeouts and retries. Waiting indefinitely is a sane choice when there's nothing more intelligent you can do.
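For reference, the soft/hard distinction is selected with mount options. A minimal sketch, assuming a hypothetical server name and mount point (per nfs(5), `timeo` is in tenths of a second and `retrans` is the retry count before a soft mount gives up):

```shell
# Soft mount: after retrans=3 retries of timeo=100 (10 s) each, the
# client stops waiting and returns an error (typically EIO) to the
# application instead of hanging. server:/export is illustrative.
mount -t nfs -o soft,timeo=100,retrans=3 server:/export /mnt/nfs

# The default hard mount retries forever; this is what produces the
# classic indefinite hang when the server goes away.
mount -t nfs -o hard server:/export /mnt/nfs
```

The trade-off the comment describes follows directly: with `soft`, every application above the mount point now sees I/O errors it may not be written to handle; with `hard`, no application needs new code, but everything blocks.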



"NFS server xyz not responding, still trying" is still in my head, despite not having used NFS for probably a decade.


Most software expects an error of some sort (e.g. file not found, permission denied) and does nothing with it except print it. That would be sane behavior for an NFS timeout: let the user retry rather than hanging forever.

Consider a startup script that hangs forever. Better that it fail than hang. Or ls hanging forever, when the filesystem could instead fail the operation after 30 seconds.
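The behavior asked for here, a bounded wait that surfaces as an ordinary failure, can be sketched with coreutils' timeout(1). In this illustration, `sleep 5` stands in for a filesystem call against an unreachable server:

```shell
# timeout(1) bounds any command; on expiry it kills the command and
# exits with status 124. "sleep 5" stands in for an operation (say,
# ls on a dead NFS mount) that would otherwise block indefinitely.
timeout 2 sleep 5
status=$?
if [ "$status" -ne 0 ]; then
    # The caller sees a plain error and can retry or give up,
    # instead of hanging forever.
    echo "operation failed (status $status); retry or give up" >&2
fi
```

This is exactly the "print it and let the user retry" behavior the parent describes: the failure becomes visible and actionable instead of an indefinite hang.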




