Bugs Rust won't catch
- collinfunk - 24383 seconds ago
Hi, I am one of the maintainers of GNU Coreutils. Thanks for the article, it covers some interesting topics. In the little Rust that I have used, I have felt that it is far too easy to write TOCTOU races using std::fs. I hope the standard library gets an API similar to openat eventually.
I just want to mention that I disagree with the section titled "Rule: Resolve Paths Before Comparing Them". Generally, it is better to make calls to fstat and compare the st_dev and st_ino. However, that was mentioned in the article. A side effect that seems less often considered is the performance impact. Here is an example in practice:
    $ mkdir -p $(yes a/ | head -n $((32 * 1024)) | tr -d '\n')
    $ while cd $(yes a/ | head -n 1024 | tr -d '\n'); do :; done 2>/dev/null
    $ echo a > file
    $ time cp file copy
    real 0m0.010s
    user 0m0.002s
    sys 0m0.003s
    $ time uu_cp file copy
    real 0m12.857s
    user 0m0.064s
    sys 0m12.702s
I know people are very unlikely to do something like that in real life. However, GNU software tends to work very hard to avoid arbitrary limits [1].
Also, the larger point still stands, but the article says "The Rust rewrite has shipped zero of these [memory safety bugs], over a comparable window of activity." However, this is not true [2]. :)
[1] https://www.gnu.org/prep/standards/standards.html#Semantics
[2] https://github.com/advisories/GHSA-w9vv-q986-vj7x
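For readers unfamiliar with the fstat approach mentioned above, here is a minimal Rust sketch of comparing st_dev and st_ino instead of resolving and comparing paths. The function name `same_file` is hypothetical and this is not taken from uutils or GNU coreutils. It assumes a Unix target, where `File::metadata` queries the already-opened handle rather than the path.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Returns true when both paths currently refer to the same inode on the
/// same device. Each path is opened once, and the metadata is read from the
/// open handle, so no path canonicalization is needed.
fn same_file(a: &Path, b: &Path) -> io::Result<bool> {
    let fa = File::open(a)?;
    let fb = File::open(b)?;
    let (ma, mb) = (fa.metadata()?, fb.metadata()?);
    // st_dev and st_ino together identify a file on a Unix system.
    Ok(ma.dev() == mb.dev() && ma.ino() == mb.ino())
}
```

Because nothing walks the path component by component in user space, the cost does not depend on how deep the current directory is, which is the performance point made in the comment.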
- wahern - 25007 seconds ago
> What’s notable is that all of these bugs landed in a production Rust codebase, written by people who knew what they were doing
They knew how to write Rust, but clearly weren't sufficiently experienced with Unix APIs, semantics, and pitfalls. Most of those mistakes are exceedingly amateur from the perspective of long-time GNU coreutils (or BSD or Solaris base) developers, issues that were identified and largely hashed out decades ago, notwithstanding the continued long tail of fixes--mostly just a trickle these days--to the old codebases.
- lionkor - 7931 seconds ago
I struggle to find anything in this post that wouldn't be caught by some kind of unit test or manual review, especially when comparing with the GNU source for the coreutils. The whole coreutils rewrite is a terrible idea[1] and clearly being done in the wrong way (without the knowledge gained from the previous software).
If you do a rewrite, you should fully understand and learn from the predecessor, otherwise you're bound to repeat all the mistakes. Embarrassing.
To be clear; I love Rust, I use it for various projects, and it's great. It doesn't save you from bad engineering.
[1]: https://www.joelonsoftware.com/2000/04/06/things-you-should-...
- hombre_fatal - 21943 seconds ago
One thing that's hard about rewriting code is that the original code was transformed incrementally over time in response to real world issues only found in production.
The code gets silently encumbered with those lessons, and unless they are documented, there's a lot of hidden work that needs to be done before you actually reach parity.
TFA is a good list of this exact sort of thing.
Before you call people amateur for it, also consider it's one of the most softwarey things about writing software. It was bound to happen unless coreutils had really good technical docs and included tests for these cases that they ignored.
- Joker_vD - 20406 seconds ago
> The pattern is always the same. You do one syscall to check something about a path, then another syscall to act on the same path. Between those two calls, an attacker with write access to a parent directory can swap the path component for a symbolic link. The kernel re-resolves the path from scratch on the second call, and the privileged action lands on the attacker’s chosen target.
It's actually somewhat worse than that, because an attacker with write access to a parent directory can mess with hard links as well. Sure, that only affects regular files, but there are basically no mitigations. See e.g. [0] and other posts on the site.
[0] https://michael.orlitzky.com/articles/posix_hardlink_heartac...
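The check-then-act pattern quoted above can be avoided by resolving the path once and doing both the check and the action through the resulting handle. The following is a minimal sketch with a hypothetical function name, `truncate_if_regular`, not code from uutils. It assumes a Unix target and does not address the hard link problem raised in this comment, since `open` still follows links.

```rust
use std::fs::{File, OpenOptions};
use std::io;
use std::path::Path;

/// Truncates `path` only if it is a regular file.
/// The path is resolved exactly once, by `open`. The type check and the
/// truncation both go through the same handle, so swapping the path
/// between the two steps cannot redirect the truncation.
fn truncate_if_regular(path: &Path) -> io::Result<bool> {
    let file: File = OpenOptions::new().write(true).open(path)?;
    // Metadata of the opened file, not of whatever the path names now.
    if !file.metadata()?.is_file() {
        return Ok(false);
    }
    file.set_len(0)?;
    Ok(true)
}
```

The racy variant would call `std::fs::metadata(path)` first and then `File::create(path)`, which resolves the path twice.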
- tdiff - 11797 seconds ago
It's fine if some Rust people with no Linux experience rewrite coreutils, but how come Ubuntu accepted it into its mainline?
- alkonaut - 13521 seconds ago
> What’s notable is that all of these bugs landed in a production Rust codebase, written by people who knew what they were doing
So does this mean that the original utils had no test harness, and the rewrite didn't start by creating one either?
Sure there are many edge cases, but surely the OS and FS can just be abstracted away and you can verify that "rm .//" actually ends up doing what is expected (Such as not deleting the current directory)?
This doesn't seem like sloppy coding, nor a critique of the language, it's just the same old "Oh, this is systems programming, we don't do tests"?
Alternatively: if the original utils _did_ have tests, and there were this many holes in the tests, then maybe there is a massive lack in the original utils test suite?
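As an illustration of how a case like "rm .//" can be checked without touching a real filesystem, here is a small sketch of a pure function plus the kind of inputs a unit test would feed it. The helper name `names_dot_or_dotdot` is hypothetical and this is not how GNU rm or uutils implement the check. It assumes a Unix target and works on raw bytes so that non-UTF-8 names are handled too.

```rust
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt;

/// True when the final component of `path`, ignoring trailing slashes,
/// is "." or "..".
fn names_dot_or_dotdot(path: &OsStr) -> bool {
    let bytes = path.as_bytes();
    // Drop trailing slashes: ".//" becomes ".".
    let end = bytes.iter().rposition(|&b| b != b'/').map_or(0, |i| i + 1);
    let trimmed = &bytes[..end];
    // Keep only what follows the last remaining slash.
    let start = trimmed.iter().rposition(|&b| b == b'/').map_or(0, |i| i + 1);
    let last = &trimmed[start..];
    last == b"." || last == b".."
}
```

A removal tool could call such a helper before acting and refuse the operand when it returns true. Inputs such as ".//", "a/..", "/" and "..." are the edge cases worth pinning down in tests.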
- marcosscriven - 9209 seconds ago
That’s a great article, and indeed a very good blog. Just spent ages reading lots of their other articles.
Of the bugs mentioned I think the most unforgivable one is the lossy UTF conversion. The mind boggles at that one!
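To make the lossy conversion problem concrete, here is a minimal sketch contrasting a byte-preserving operation on a file name with its lossy counterpart. Both function names are hypothetical and neither is taken from uutils. It assumes a Unix target, where file names are arbitrary byte strings.

```rust
use std::ffi::{OsStr, OsString};
use std::os::unix::ffi::{OsStrExt, OsStringExt};

/// Appends a suffix by working on raw bytes, so non-UTF-8 names survive
/// unchanged.
fn with_suffix(name: &OsStr, suffix: &str) -> OsString {
    let mut bytes = name.as_bytes().to_vec();
    bytes.extend_from_slice(suffix.as_bytes());
    OsString::from_vec(bytes)
}

/// Lossy variant, shown only for contrast: every invalid byte sequence is
/// replaced with U+FFFD, so the result may name a different file.
fn with_suffix_lossy(name: &OsStr, suffix: &str) -> String {
    format!("{}{}", name.to_string_lossy(), suffix)
}
```

For a Latin-1 encoded name such as the bytes `caf\xE9`, the first function keeps the original byte while the second silently substitutes a replacement character.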
- oconnor663 - 19733 seconds ago
> The trap is that get_user_by_name ends up loading shared libraries from the new root filesystem to resolve the username.
That's kind of horrifying. Is there a reliable list somewhere of all the functions that do that? Is that list considered stable?
- misja111 - 15648 seconds ago
The root cause of some of the bugs seems to be the opaque nature of some of the Unix APIs. E.g.
> The trap is that get_user_by_name ends up loading shared libraries from the new root filesystem to resolve the username. An attacker who can plant a file in the chroot gets to run code as uid 0.
To me such a get_user_by_name function is like a booby trap, an accident that is waiting to happen. You need to have user data, you have this get_user_by_name function, and then it goes and starts loading shared libraries. This smells like mixing of concerns to me. I'd say, either split getting the user data and loading any shared libraries in two separate functions, or somehow make it clear in the function name what it is doing.
- bayindirh - 1952 seconds ago
> This is the largest cluster of bugs in the audit. It’s also the reason cp, mv, and rm are still GNU in Ubuntu 26.04 LTS. :(
This is what grinds my gears. Why all the hate against GNU?
Honestly, this is why I don't learn Rust, and why I didn't bother to read the rest of the article.
- PunchyHamster - 821 seconds ago
Seems like the typical pattern:
* Let's rewrite thing in X, it is better
* Let's not look at existing code, X is better so writing it from scratch will look nicer
* Whoops, existing code was written like this for a reason
* Whoops, we re-introduce decade+ old problems that original already fixed at some point
- z3t4 - 6460 seconds ago
To be fair, these are mostly gotchas with Linux and not Rust itself, but I guess the std in Rust could handle some of these issues, in that a std should not allow you to shoot yourself in the foot by default.
- fschuett - 22084 seconds ago
Thanks for the list. I like these lists, so I can put them into a .md file, then launch "one agent per file" on my codebase and see if they can find anything similar to the mentioned CVEs.
Rust won't catch it, but now the agents will.
Edit: https://gist.github.com/fschutt/cc585703d52a9e1da8a06f9ef93c... for anyone who needs to copy this
- eb08a167 - 8212 seconds ago
I'm totally fine with people experimenting and making amateur attempts at what adult people do. After all, that's how we grow. What I'm actually curious about is how the decision-making chain at Ubuntu got so messed up that this made it into production.
- 9fwfj9r - 22103 seconds ago
So it's basically failing on
* necessary atomicity for filesystem operations
* annoying path & string encoding
* inertia for historical behaviors
- osmsucks - 10267 seconds ago
I feel like one of the takeaways here is that Rust protects your code as long as what your code is doing stays predictably in-process. Touching the filesystem is always rife with runtime failures that your programming language just can't protect you from. (Or maybe it also suggests the `std::fs` API needs to be reworked to make some of these occurrences, if not impossible, at least harder.)
On a separate note: I have a private "coretools" reimplementation in Zig (not aiming to replace anything, just for fun), and I'm striving to keep it 100% Zig with no libc calls anywhere. Which may or may not turn out to be possible, we'll see. However, cross-checking uutils I noticed it does have a bunch of unsafe blocks that call into libc, e.g. https://github.com/uutils/coreutils/blob/77302dbc87bcc7caf87.... Thankfully they're pretty minimal, but every such block can reduce the safety provided by a Rust rewrite.
- jolt42 - 24795 seconds ago
I wonder if Rust becomes more popular with AI as Rust can help catch what AI misses, but then if that's the case then what about Haskell, or Lean, or?
- einpoklum - 9497 seconds ago
Note:
TOCTOU means "Time-of-check to time-of-use"
See also: https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use
- r2vcap - 8190 seconds ago
Just use Fedora :)
- micheles - 22411 seconds ago
> uutils now runs the upstream GNU coreutils test suite against itself in CI. That’s the right scale of defense for this class of bug.
That's the minimum, it is absurd that they did not start from that!
- timcobb - 15887 seconds ago
The title of this article should be "Rust can't stop you from not giving a fuck" or "Rust can't give a fuck for you."
---
> What’s notable is that all of these bugs landed in a production Rust codebase, written by people who knew what they were doing
...
[List of bugs a diligent person would be mindful of, unix expert or not]
---
Only conclusion I can make is, unfortunately, the people writing these tools are not good software developers, certainly not sufficiently good for this line of work.
For comparison, I am neither a unix neckbeard nor a rust expert, but with the magic of LLMs I am using rust to write a music player. The amount of tokens I've sunk into watching for undesirable panics or dropped errors is pretty substantial. Why? Because I don't want my music player to suck! Simple as that. If you don't think about panics or errors, your software is going to be erratic, unpredictable and confusing.
Now, coreutils isn't my hobby music player, it's fundamental Internet infrastructure! I hate sounding like a Breitbart commenter but it is quite shocking to see the lack of basic thought going into writing what is meant to be critical infrastructure. Wow, honestly pathetic. Sorry to be so negative and for this word choice, but "shock" and "disappointment" are mild terms here for me.
Anyway, thanks to the author of this post! This is a red flag that should be distributed far and wide.
- immanuwell - 20872 seconds ago
Rust promised you memory safety and delivered, but it turns out the filesystem doesn't care about your borrow checker, and these 44 CVEs are the receipt
- rvz - 22609 seconds ago
This is what happens when many people hype a technology that solves a specific class of vulnerabilities but is not designed to prevent others, such as logic errors caused by human or AI mistakes.
Granted, the uutils authors are well experienced in Rust, but it is not enough for a large-scale rewrite like this and you can't assume that it's "secure" because of memory safety.
In this case, this post tells us that Unix itself has thousands of gotchas and that re-implementing the coreutils in Rust is not a silver bullet. Even the bugs that Unix (and even the POSIX standard) has are part of the specification, and can later be revealed as vulnerabilities in reality.
- Analemma_ - 24245 seconds ago
I know nobody's perfect and I'm not asking for perfection, but these bugs are pretty alarming? It seems like these supposed coreutils replacements are being written by people who don't know anything about Unix, and also didn't even bother looking at the GNU tools they are trying to replace. Or at least didn't have any curiosity about why the GNU tools work the way they do. Otherwise they might've wondered about why things operate on bytes and file descriptors instead of strings and paths.
I hate to armchair general, but I clicked on this article expecting subtle race conditions or tricky ambiguous corners of the POSIX standard, and instead found that it seems to be amateur hour in uutils.
- slopinthebag - 22375 seconds ago
I find it interesting how people will criticise Rust for not preventing all bugs, when the alternative languages don't prevent those same bugs nor the bugs Rust does catch. If you're comparing Rust to a perfect language that doesn't exist, you should probably also compare your alternative to that perfect language as well, right?
I'd be interested in a comparison of the number of bugs and CVEs in GNU coreutils at the start of its lifetime with the number in this rewrite. Same with the number of memory bugs that are impossible in (safe) Rust.
Don't just downvote me, tell me how I'm wrong.
- melodyogonna - 3233 seconds ago
TL;DR: Rust can't catch logic bugs