Why does C have the best file API
- seba_dos1 - 5979 seconds ago
mmap is not a C feature but a POSIX one. There are C platforms that don't provide mmap, and on those that do, you can use mmap from other languages (there's an mmap module in Python's standard library, for example).
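To illustrate the point about Python's standard `mmap` module, here is a minimal sketch, not taken from the article. The helper names `uppercase_in_place` and `demo` and the scratch file are hypothetical and exist only for this example.

```python
import mmap
import os
import tempfile


def uppercase_in_place(path):
    """Map the file at `path` into memory and uppercase its bytes in place."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; mapping an empty file raises ValueError.
        with mmap.mmap(f.fileno(), 0) as mm:
            # Slice assignment must keep the length unchanged.
            mm[:] = mm[:].upper()
            # Ask the OS to write the modified pages back to the file.
            mm.flush()


def demo():
    # Hypothetical scratch file, created only for this illustration.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"hello mmap")
        uppercase_in_place(path)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```

Calling `demo()` writes a small file, edits it through the mapping, and reads it back with ordinary file I/O.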
- okanat - 1501 seconds ago
I guess the author hasn't used that many other programming languages or OSes. You can do the same even in garbage-collected languages like Java and C#, and on Windows too.
https://docs.oracle.com/javase/8/docs/api/java/nio/MappedByt...
https://learn.microsoft.com/en-us/dotnet/api/system.io.memor...
https://learn.microsoft.com/en-us/windows/win32/memory/creat...
Memory mapping is very common.
- Dwedit - 4473 seconds ago
Using mmap means that you need to be able to handle memory access exceptions when a disk read or write fails. Examples of disk access that can fail include reading from a file on a network drive over Wi-Fi, a USB device whose connection drops when the cable is jiggled, or even a removable USB drive where all disk reads fail after it sees one bad sector. If you're not prepared to handle a memory access exception when you access the mapped file, don't use mmap.
- Const-me - 4763 seconds ago
I think the C# standard library is better. You can write the same unsafe code as in C: call the SafeBuffer.AcquirePointer method, then access the memory directly. Or you can take the safer and slightly slower route by calling the Read or Write methods of MemoryMappedViewAccessor.
All these methods are in the standard library, i.e. they work on all platforms. The C code is specific to POSIX; Windows supports memory mapped files too but the APIs are quite different.
- zahlman - 4286 seconds ago
Aside from what https://news.ycombinator.com/item?id=47210893 said, mmap() is a low-level design that makes it easier to work with files that don't fit in memory and fundamentally represent a single homogeneous array of some structure. But it turns out that files commonly do fit in memory (nowadays you commonly have on the order of ~100x as much disk as memory, but millions of files); you very often want to read them in order, because that's the easiest way to make sense of them (and tape is not at all the only storage medium historically that had a much easier time with linear access than random access); and you need to parse them, because they don't represent any such array.
When I was first taught C formally, they definitely walked us through all the standard FILE* manipulators and didn't mention mmap() at all. And when I first heard about mmap() I couldn't imagine personally having a reason to use it.
- nickelpro - 4264 seconds ago
> Why does C have the best file API
> Look inside
> Platform APIs
Ok.
I agree platform APIs are better than most generic language APIs at least. I disagree on mmap being the "best".
- chuckadams - 3228 seconds ago
It has the best API for the author, that's for sure. One size does not fit all: believe it or not, different files have different uses. One does not mmap a pipe or /dev/urandom.
- castral - 4558 seconds ago
I think OP and I have very divergent opinions on what makes a file API "best". This may have been the best 30 years ago. The world has moved on.
- ibejoeb - 2370 seconds ago
The article only touches on `open` and `close` and doesn't deal with any of the realities of file access. Not a particularly compelling write-up.
- mmastrac - 5660 seconds ago
"best file API" and the man page for the O_ flags disagree.
- alanfranz - 3800 seconds ago
Well...
I'm not sure what the author really wants to say. mmap is available in many languages (e.g. Python) on Linux (and many other *nix I suppose). C provides you with raw memory access, so using mmap is sort-of-convenient for this use case.
But if you use Python then, yes, you'll need a bytearray, because Python doesn't give you raw access to such memory - and I'm not sure you'd want to mmap a PyObject anyway?
Then, writing and reading this kind of raw memory can be kind of dangerous and non-portable - I'm not really sure that the pickle analogy even makes sense. I very much suppose (I've never tried) that if you mmap-read malicious data in C, a vulnerability would be _quite_ easy to exploit.
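As a hedged illustration of reading raw records from a mapped file in Python without trusting the file's contents, the sketch below uses `struct` over a read-only mapping and checks bounds explicitly. The record layout (`<Id`: a little-endian 32-bit id followed by a 64-bit float) and the helper names are assumptions made up for this example, not anything from the article.

```python
import mmap
import struct

# Hypothetical record layout for illustration:
# little-endian uint32 id followed by a float64 value, no padding (12 bytes).
RECORD = struct.Struct("<Id")


def write_records(path, records):
    """Write (id, value) pairs to `path` using ordinary sequential I/O."""
    with open(path, "wb") as f:
        for rec_id, value in records:
            f.write(RECORD.pack(rec_id, value))


def read_record(path, index):
    """Return record number `index` from a read-only mapping of `path`."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            offset = index * RECORD.size
            # Validate against the real file size instead of trusting the caller
            # or any length field stored inside the file.
            if index < 0 or offset + RECORD.size > len(mm):
                raise IndexError("record index out of range")
            return RECORD.unpack_from(mm, offset)
```

Unlike casting a pointer in C, `unpack_from` copies the fields into Python objects, so a truncated or hostile file results in an exception here, not an out-of-bounds read.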
- userbinator - 3988 seconds ago
> It still works if the file doesn't fit in RAM
No, it doesn't. If you have a file that's 2^36 bytes and your address space is only 2^32, it won't work.
On a related digression, I've seen so many cases of programs that could've handled infinitely long input in constant space instead implemented as some form of "read the whole input into memory", which unnecessarily puts a limit on the input length.
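A minimal sketch of the constant-space style described above: the input is consumed in fixed-size chunks, so memory use does not grow with input length. The function name `count_lines` and the 64 KiB default chunk size are arbitrary choices for this illustration.

```python
import io


def count_lines(stream, chunk_size=64 * 1024):
    """Count newline bytes in a binary stream without loading it all into memory."""
    count = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of input
            break
        count += chunk.count(b"\n")
    return count


if __name__ == "__main__":
    # Works the same on a file, a pipe, or an in-memory stream.
    print(count_lines(io.BytesIO(b"a\nb\nc\n"), chunk_size=2))
```

The same function accepts `sys.stdin.buffer` or a file opened in binary mode, which also covers inputs such as pipes that cannot be memory-mapped.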
- FrankWilhoit - 7431 seconds ago
A file API is not the same thing as a filesystem API. The holy grail is still a universal but high(-enough)-level filesystem API.
- andersmurphy - 2213 seconds ago
mmap is nice. But I find SQLite is a better filesystem API [1]. If you are going to use mmap, why not take it further and use LMDB? Both have bindings for most languages.
- srean - 5606 seconds ago
> However, in other most languages, you have to read() in tiny chunks, parse, process, serialize and finally write() back to the disk. This works, but is verbose and needlessly limited
C has those too, and I am glad that it does. This is what allows one to do other things while the buffer gets filled, without the need for multithreading.
Yes, easier, standardized, portable async interfaces would have been nice; I'm not sure how well supported they are.
- charcircuit - 4253 seconds ago
C's API does not include mmap, nor does it contain any API to deal with file paths, nor does it contain any support for opening up a file picker. This, paired with C's bad string support, results in it being one of the worst file APIs.
Also, using mmap is not as simple as the article lays out. For example, what happens when another process modifies the file and your process's mapped memory now consists of parts of two different versions of the file at the same time? You also need to build a way to grow the mapping if you run out of room. You also want to be able to handle failures to read or write. This means you will pretty much need to reimplement fread and fwrite, going back to the approach the author didn't like: "This works, but is verbose and needlessly limited to sequential access." So it turns out "It ends up being just a nicer way to call read() and write()" is only true if you ignore the edge cases.
- koakuma-chan - 4720 seconds ago
How do you handle read/write errors with mmap?
- jccx70 - 1227 seconds ago
[dead]