Hi, I am one of the maintainers of GNU Coreutils. Thanks for the article, it covers some interesting topics. In the little Rust that I have used, I have felt that it is far too easy to write TOCTOU races using std::fs. I hope the standard library gets an API similar to openat eventually.
I just want to mention that I disagree with the section titled "Rule: Resolve Paths Before Comparing Them". Generally, it is better to make calls to fstat and compare the st_dev and st_ino fields, though the article does mention that. A side effect that seems less often considered is the performance impact of resolving paths. Here is an example in practice:
$ mkdir -p $(yes a/ | head -n $((32 * 1024)) | tr -d '\n')
$ while cd $(yes a/ | head -n 1024 | tr -d '\n'); do :; done 2>/dev/null
$ echo a > file
$ time cp file copy
real 0m0.010s
user 0m0.002s
sys 0m0.003s
$ time uu_cp file copy
real 0m12.857s
user 0m0.064s
sys 0m12.702s
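(For reference, the fstat-based check I have in mind is short. A sketch in Rust, with a made-up helper name:)

    use std::fs::File;
    use std::io;
    use std::os::unix::fs::MetadataExt;
    use std::path::Path;

    // Open both files first, then compare (st_dev, st_ino) from fstat
    // on the handles, so the answer describes the files we actually
    // opened rather than whatever the paths resolve to later.
    fn same_file(a: &Path, b: &Path) -> io::Result<bool> {
        let (fa, fb) = (File::open(a)?, File::open(b)?);
        let (ma, mb) = (fa.metadata()?, fb.metadata()?);
        Ok(ma.dev() == mb.dev() && ma.ino() == mb.ino())
    }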
I know people are very unlikely to do something like that in real life. However, GNU software tends to work very hard to avoid arbitrary limits [1].
Also, the larger point still stands, but the article says "The Rust rewrite has shipped zero of these [memory safety bugs], over a comparable window of activity." However, this is not true [2]. :)
No need to apologize at all. Doing it in one cd invocation would fail since the file name is longer than PATH_MAX. In that case passing it to a system call would fail with errno set to ENAMETOOLONG.
You could probably make the loop more efficient, but it works well enough. Also, some shells don't allow you to enter directories that deep at all. It doesn't work on mksh, for example.
Yes? The quote says "tends to", and you still can cd into that directory, albeit not in a single invocation. Windows has similar limitations [0], it's just that their MAX_PATH is 260 so it's somewhat more noticeable... and IIRC the hard limit of 32 K for paths is non-negotiable.
Isn’t "cd" a unix syscall, since it changes the process's working directory? There was something written somewhere that it cannot be a unix utility for this very reason, but has to be a shell built-in. The syscall is a "single operation" from the point of view of a single-threaded process.
Yes, it’s a shell builtin that makes the shell execute a chdir() syscall. Therefore it isn’t subject to argument length limits imposed by the kernel when executing processes. But it is still subject to path length limits imposed by the kernel’s implementation of chdir() itself. While the shell may be a GNU project (bash), the kernel generally is not (unless you are running Hurd), so this isn’t GNU’s fault per se.
However, the shell could theoretically chunk long cd arguments into multiple calls to chdir(), splitting on slashes. I believe this would be fully semantically correct: you are not losing any atomicity guarantees because the kernel doesn’t provide such guarantees in the first place for lookups involving multiple path components. I’m not surprised that bash doesn’t bother implementing this, and I don’t know if I’d call that an “arbitrary limitation” on bash’s part (as opposed to a lack of workaround for another component’s arbitrary limitation). But it would be possible.
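A minimal sketch of that chunking idea, in Rust for concreteness (cd_in_chunks is a made-up helper; it ignores corner cases like "..", and splitting on every slash makes more syscalls than strictly necessary, since a real implementation could batch components into chunks just under PATH_MAX):

    use std::env;
    use std::io;

    // Change directory one component at a time, so no single chdir()
    // call ever sees a string longer than PATH_MAX.
    fn cd_in_chunks(path: &str) -> io::Result<()> {
        if path.starts_with('/') {
            env::set_current_dir("/")?; // handle an absolute path first
        }
        for component in path.split('/').filter(|c| !c.is_empty()) {
            env::set_current_dir(component)?;
        }
        Ok(())
    }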
Nothing; you just missed some other considerations. For instance, Linux generally follows POSIX. That's what the 2004 version has to say about chdir's errors:
ERRORS
The chdir() function shall fail if:
...
[ENAMETOOLONG]
The length of the path argument exceeds {PATH_MAX} or a pathname component is longer than {NAME_MAX}.
...
The chdir() function may fail if:
...
[ENAMETOOLONG]
As a result of encountering a symbolic link in resolution of the path argument, the length of the substituted pathname string exceeded {PATH_MAX}.
However, later versions of POSIX moved the "length of the path argument exceeds {PATH_MAX}" error into the optional ("may fail") section.
First of all, thank you for presenting a succinct take on this viewpoint from the other side of the fence from where I am at.
So how can I learn from this? (Asking very aggressively, especially for Internet writing, to make the contrast unmistakable. And contrast helps with perceiving differences and mistakes.) (You also don’t owe me any of your time or mental bandwidth, whatsoever.)
So here goes:
Question 1:
How come "speed", "performance", race conditions and st_ino keep getting brought up?
Speed (latency), physically writing things out to storage (sequentially, atomically (ACID), on any of HDD, NVMe SSD, ODD, FDD, tape; "haskell monad", event horizons, the finite speed of light and information, whatever), and race conditions all seem to boil down to the same thing. For reliable systems like accounting, the path seems to be ACID or the highway. And "unreliable" systems forget fast enough that computers don't seem to really make a difference there.
Question 2:
Does throughput really matter more than latency in everyday applications?
Question 3 (explanation first, this time):
The focus on inode numbers is at least understandable with regards to the history of C and unix-like operating systems and GNU coreutils.
What about this basic example? Just make a USB thumb drive "work" for storing files (ignoring NAND flash decay and the USB transport itself). Without getting tripped up in libc IO buffering, fflush, kernel buffering (Hurd, if you prefer it over Linux or FreeBSD), or more than one application running on a multi-core and/or time-sliced system (to really weed out single-core CPUs running only a single user-land binary with blocking IO).
> Does throughput really matter more than latency in everyday applications?
In my experience latency and throughput are intrinsically linked unless you have the buffer-space to handle the throughput you want. Which you can't guarantee on all the systems where GNU Coreutils run.
> Does throughput really matter more than latency in everyday applications?
IME as a user, hell yes
Getting a video I don't mind if it buffers a moment, but once it starts I need all of that data moving to my player as quickly as possible
OTOH if there's no wait, but the data is restricted (the amount coming to my player is less than the player needs to fully render the images), the video is "unwatchable"
I don't mean to nitpick, but absolute values for both of these matter much less than how much it is compared to "enough". As long as the throughput is enough to prevent the video from stuttering, it doesn't matter if the data is moved to your video player program at 1 GB/s or 1 TB/s. Conversely, you say you don't mind if a video buffers for a moment but I'm willing to bet there's some value of "a moment" where it becomes "too long". Nobody is willing to wait an hour buffering before their video starts.
The perception of speed in using a computer is almost entirely latency driven these days. Compare using `rg` or `git` vs loading up your banking website.
Linux desktop (and the kernel) felt awful for such a long time because everyone was optimizing for server and workstation workloads. It's the reason CachyOS (and before that Linux Zen and... Liquorix?) are a thing.
For good UX, you heavily prioritize latency over throughput. No one cares if copying a file stalls for a moment or takes 2 seconds longer if that ensures no hitches in alt tabbing, scrolling or mouse movement.
When Con Kolivas introduced a scheduler optimized for desktop latency, about 15 years ago, the amount of abuse he got from Linux developers was astonishing, and he ended up quitting for good. I remember compiling it on my laptop and noticing how it made a huge improvement in the usability of X and the desktop environment.
This isn't what prioritizing throughput actually looks like in most scenarios.
In the example you gave the amount of read speed the user needs to keep up with a video is meager and greater read speed is meaningless beyond maintaining a small buffer.
You in fact notice it more if your process is sometimes starved of CPU, IO, or memory, or is waiting on swap, etc. Conversely, you would in most cases not notice nearly so much if the entire thing got slower, even much slower, as long as its meager resources were quickly available to the thing you are doing right now.
When I download a 25GB game I care about throughput for the download to a certain extent, but that is probably mainly ISP bound rather than local system bound. I don't care if the download takes 10 or 11 minutes as long as I can still use my system with zero delays meanwhile. And whether it takes 11 minutes or 3 hours depends on my ISP mostly. But being responsive to me while it downloads is local latency bound.
Not necessarily. Most race conditions violate the `A` in ACID, but the finicky thing about atomicity is that a sequence of N > 1 actions, each atomic in and of itself, is not atomic as a whole. So any atomic store can be misused if you can compose multiple atomic operations on it.
In addition ACID isn't always provided by the floor beneath your programs but by designing the programs on top to uphold it and/or not require it, allowing you to relax the constraints from your lower level interfaces for performance reasons.
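A minimal sketch of that composition pitfall (hypothetical balance example; every individual operation below is atomic, but the check-then-act pair is not):

    use std::sync::atomic::{AtomicU64, Ordering};

    static BALANCE: AtomicU64 = AtomicU64::new(100);

    // Each load and store is individually atomic, but the sequence is
    // not: another thread can run between them, so two threads can
    // both "successfully" withdraw the last 100 units.
    fn withdraw_racy(amount: u64) -> bool {
        let current = BALANCE.load(Ordering::SeqCst);
        if current >= amount {
            BALANCE.store(current - amount, Ordering::SeqCst);
            true
        } else {
            false
        }
    }

    // The composed operation has to be made atomic itself, e.g. with
    // a compare-and-swap loop.
    fn withdraw(amount: u64) -> bool {
        BALANCE
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |cur| {
                cur.checked_sub(amount)
            })
            .is_ok()
    }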
Firstly, atomicity and/or thread-safety not composing is where the Consistency and Isolation come in.
The "application layer" always has to enforce its own consistency guarantees. If the lower layers are total garbage, then the system is garbage. And the "speed" of the lower layers can be infinitely fast and it doesn’t matter, if the application has a latency floor. So optimize it all you want.
Coreutils are not only used in interactive contexts. They are the primitives that make up the countless shell scripts which glue systems together. Any edge case will be encountered and the resulting poor performance will impact somebody, somewhere.
Here's a related example of what happens when you change a shell primitive's behavior - even interactively. Back in the 2000s, Linux distributions started adding color output to the ls command via a default alias ls='/bin/ls --color=auto'. You know: make directories blue, symlinks cyan, executables purple; that kind of thing. Somebody thought it would be a nice user experience upgrade.
I was working at a NAS (NFS remote box) vendor in tech support. We frequently got calls from folks who had just switched to Linux from Solaris, or had just moved their home directories from local disk to NFS. They would complain that listing a directory with a lot of files would hang. If it came back at all, it would be in minutes or hours! The fix? "unalias ls". Because calling "/bin/ls" would execute a single READDIR (the NFS RPC), which was 1 round-trip to the server and only a few network packets; but calling "/bin/ls --color=auto" would add a STAT call for every single file in the directory to figure out what color it should be - sequentially, one-by-one, confirming the success of each before the next iteration. If you had 30,000 files with a round-trip time of 1ms that's 30 seconds. If you had millions...well, either you waited for hours or you power-cycled the box. (This was eventually fixed with NFSv3's READDIRPLUS.)
Now I'm sure whoever changed that alias did not intend it, but they caused thousands of people thousands of hours of lost productivity. I was just one guy in one org's tech support group, and I saw at least a dozen such cases, not all of which were lucky enough to land in the queue of somebody who'd already seen the problem.
So I really appreciate GNU coreutils' commitment to sane behavior even at the edges. If you do systems work long enough, you will ride those edges, and a tool which stays steady in your hand - or script - is invaluable.
NFS is more annoying on Linux than just using Samba though, at least for the NAS use case. With Samba on my server I can just browse to it in KDE's file manager Dolphin, and samba configuration is a relatively straight forward ini style file on the server. A pair of ports also need to be opened in the host firewall.
Contrast that with NFS, which last I looked needed several config files, matching account IDs between hosts, mounting as root, and would hang processes if the connection was lost. At least I hear rpcbind is gone these days.
I don't think anyone sane uses NFS on Linux either these days. And it is rather funny that the protocol Microsoft invented is what stuck and became practical between Linux hosts.
First thing I have heard about NetApp. Seems to be some enterprise focused company, with more than one product. Not sure which product of theirs you refer to.
Synology, TrueNAS and Proxmox probably also have NFS support I would assume, and they definitely have Samba. Those are more relevant to me personally.
I just run a normal headless Linux distro on my NAS computer, I don't see the point of a specialised NAS distro. It too could have NFS if I wanted it, but it currently has Samba, because it is easier and works better.
So in conclusion, I'm not sure what your point is? Doesn't NetApp support anything except NFS?
For read-only access there could be way better caching, especially for common use cases like listing the contents of a filesystem directory. But stuff like this was excluded on purpose.
NFS is really stupid.
NFS made the assumption that a distributed system with over 100 times the latency of a local system could be treated like a local system in every single way.
I am not sure why "NFS is really stupid" follows from users assuming that a distributed file system can be treated just like a local one. That it provides the same interface is what makes NFS extremely useful.
To be even fair-er, it wasn't actually memory unsafety, it was "just" unsoundness: there was a type where, IF you gave it a weird io reader implementation, that implementation could see uninit data or expose uninit data elsewhere. But the only readers actually used were well-behaved readers.
Indeed, and it doesn't need to be deprecated, because it's an API explicitly designed to give you low-level control where you need it, and because it is appropriately defined as an `unsafe` function with documented safety invariants that must be manually upheld in order for usage to be memory-safe. The documentation also suggests several other (safe) functions that should be used instead when possible, and provides correct usage examples: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.set... .
> and because it is appropriately defined as an `unsafe` function with documented safety invariants that must be manually upheld in order for usage to be memory-safe.
Didn't we learn from C, and isn't the entire raison d'être of Rust, that coders cannot be trusted to follow rules like this?
If coders could "(document) safety invariants that must be manually upheld in order for usage to be memory-safe," there'd be no need for Rust.
No, this is mistaken. Rust provides `unsafe` functions for operations where memory-safety invariants must be manually upheld, and then forces callers to use `unsafe` blocks in order to call those functions, and then provides tooling for auditing unsafe blocks. Want to keep unsafe code out of your codebase? Then add `#![forbid(unsafe_code)]` to your crate root, and all unsafe code becomes a compiler error. Or you could add a check in your CI that prevents anyone from merging code that touches an unsafe block without sign-off from a senior maintainer. And/or you can add unit tests for any code that uses unsafe blocks and then run those tests under Miri, which will loudly complain if you perform any memory-unsafe operations. And you can enable the `undocumented_unsafe_blocks` lint in Clippy so that you'll never forget to document an unsafe block. Rust's culture is that unsafe blocks should be reserved for leaf nodes in the call graph, wrapped in safe APIs whose usage does not impose manual invariant management on downstream callers. Internally, those APIs represent a relatively minuscule portion of the codebase upon which all your verification can be focused. So you don't need to "trust" that coders will remember not to call unsafe functions needlessly, because the tooling is there to have your back.
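To illustrate the shape of that culture, a toy example (names mine): the unsafe operation lives in one leaf function, carries a SAFETY comment (which the Clippy lint above can enforce), and is wrapped in an API that is safe to call:

    #![deny(clippy::undocumented_unsafe_blocks)]

    // A safe wrapper around an unsafe operation: the invariant is
    // checked before the unsafe block runs, so callers can't misuse it.
    fn first_byte(bytes: &[u8]) -> Option<u8> {
        if bytes.is_empty() {
            return None;
        }
        // SAFETY: `bytes` is non-empty, so index 0 is in bounds.
        Some(unsafe { *bytes.get_unchecked(0) })
    }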
> And how is this feasible for a systems language? Rust becomes too impotent for its main use case if you only use safe rust.
No, this is completely incorrect, and one of the most interesting and surprising results of Rust as an experiment in language design. An enormous proportion of Rust codebases need not have any unsafe code of their own whatsoever, and even those that do tend to have unsafe blocks in an extreme minority of files. Rust's hypothesis that unsafe code can be successfully encapsulated behind safe APIs suitable for the vast majority of uses has been experimentally proven in practice. Ironically, the average unsafe block in practice is a result of needing to call a function written in C, which is a symptom of not yet having enough alternatives written in Rust. I have worked on both freestanding OSes and embedded applications written in Rust--both domains where you would expect copious usage of unsafe--where I estimate less than 5% of the files actually contained unsafe blocks, meaning a 20x reduction in the effort needed to verify them (in Fred Brooks units, that's two silver bullets worth).
> Coders historically cannot be trusted to manually manage memory, unless they're rust coders apparently
Most Rust coders are not manually managing memory on the regular, or doing anything else that requires unsafe code. I'm not exaggerating when I say that it's entirely possible to have spent your entire career writing Rust code without ever having been forced to write an `unsafe` block, in the same way that Java programmers can go their entire career without using JNI.
> By definition, it isn't possible for a tool to reason about unsafe code, otherwise the rust compiler would do it
Of course it is. The Rust compiler reasons about unsafe code all the time. What it can't do is definitively prove many properties of unsafe code, which is why the compiler conservatively requires the annotation. But there are dozens of built-in warnings and Clippy lints that analyze unsafe blocks and attempt to flag issues early. In addition, Miri provides an interpreter in which to run unsafe code, giving you dynamic rather than static analysis.
Show me system-level rust code that only uses safe then... You can't, because it's impossible. It doesn't matter that it's a minority of files (!), the simple fact is you can't program systems without using unsafe. Rewrite the C dependencies in rust and the amount of unsafe code increases massively
> Most Rust coders are not manually managing memory on the regular
Another sidestep. If coders in general cannot be trusted to manage memory, why can a rust coder be trusted all of a sudden?
> . But there are dozens of built-in warnings and Clippy lints that analyze unsafe blocks and attempts to flag issues early.
We already had that, it wasn't enough, hence..... rust, remember?
You are missing the forest for the trees here. The goal of Rust's `unsafe` isn't to prevent you from writing unsafe code. It's to prevent you from writing unsafe code by accident. That was always the goal. If you reread the comments through that lens, I'm sure they'll make more sense.
I think you’re deliberately being obtuse here, and if you don’t see why, you should probably reflect on your reasoning.
I’ve been using Rust for about 12 years now, and the only times I’ve had to reach for `unsafe` was to do FFI stuff. That’s it. Maybe others might have more unsafe code and for good reasons, but from my perspective, I don’t know wtf you’re talking about.
The issue with C is that every single use of a pointer needs to come with safety invariants (at its most basic: when you pass a pointer to my function, do I take ownership of your pointer or not?). You cannot legitimately expect people to be that alert 100% of the time.
Inversely, you can write whole applications in rust without ever touching `unsafe` directly, so that keyword by itself signals the need for attention (both to the programmer and the reviewer or auditor). An unsafe block without a safety comment next to it is a very easy red flag to catch.
>when you a pass a pointer to my function, do I take ownership of your pointer or not?
It's honestly frustrating how prevalent this is in C. The docs often don't even tell you, and if you guess that it takes ownership, make a copy for it, and you were wrong, now you've leaked memory; if you guessed the other way, now you have the potential for a double free, a use after free, or the data being mutated behind your back.
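For contrast, in Rust the ownership question is answered by the function signature itself, a trivial sketch:

    // The signature says who owns what, and the compiler enforces it.
    fn takes_ownership(buf: Vec<u8>) {
        // `buf` is dropped here unless it is returned or moved on.
        drop(buf);
    }

    fn borrows(buf: &[u8]) {
        // The caller keeps ownership; we can only read through the borrow.
        let _first = buf.first();
    }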
Rust has never been about outright eliminating unsafe code, it's about encapsulating that unsafe code within a safe externally usable API.
When creating a dynamic sized array type, it's much simpler to reason about its invariants when you assume only its public methods have access to its size and length fields, rather than trust the user to remember to update those fields themselves.
The above is an analogy which is obviously fixed by using opaque accessor functions, but Rust takes it further by encapsulating raw pointer usage itself.
The whole ethos of unsafe Rust is that you encapsulate usages of things like raw pointers and mutable static variables in smaller, more easily verifiable modules rather than having everyone deal with them directly.
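A toy sketch of that ethos (not real uutils or std code; names and invariants are mine). The raw pointer and the length/capacity invariant live behind private fields, so only this small module has to be audited:

    use std::alloc::{alloc, dealloc, Layout};

    // Invariant: the first `len` bytes of the allocation are
    // initialized and `len <= cap`. The fields are private, so only
    // the methods below can break the invariant.
    pub struct TinyBuf {
        ptr: *mut u8,
        len: usize,
        cap: usize,
    }

    impl TinyBuf {
        pub fn with_capacity(cap: usize) -> Self {
            assert!(cap > 0);
            let layout = Layout::array::<u8>(cap).unwrap();
            // SAFETY: `layout` has non-zero size because `cap > 0`.
            let ptr = unsafe { alloc(layout) };
            assert!(!ptr.is_null(), "allocation failed");
            TinyBuf { ptr, len: 0, cap }
        }

        pub fn push(&mut self, byte: u8) {
            assert!(self.len < self.cap, "buffer full");
            // SAFETY: `len < cap`, so the write is in bounds.
            unsafe { self.ptr.add(self.len).write(byte) };
            self.len += 1;
        }

        pub fn as_slice(&self) -> &[u8] {
            // SAFETY: the first `len` bytes are initialized (invariant).
            unsafe { std::slice::from_raw_parts(self.ptr, self.len) }
        }
    }

    impl Drop for TinyBuf {
        fn drop(&mut self) {
            let layout = Layout::array::<u8>(self.cap).unwrap();
            // SAFETY: `ptr` was allocated with this exact layout.
            unsafe { dealloc(self.ptr, layout) };
        }
    }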
Canonical's usage of uutils is likely for marketing. But the codebase itself was developed for fun, as an excuse for people to have a hands-on way to learn Rust back before Rust was even released, with a minor justification as being cross-platform. From the original README in 2013:
Why?
----
Many GNU, linux and other utils are pretty awesome, and obviously some effort has been spent in the past to port them to windows. However those projects are either old, abandonned, hosted on CVS, written in platform-specific C, etc.
Rust provides a good platform-agnostic way of writing systems utils that are easy to compile anywhere, and this is as good a way as any to try and learn it.
These things were caught and basically all of them weren't covered by any test suite (not even GNU coreutils'). It's a bit bold to claim that it's actively worsening it when it's not an LTS.
Isn't this how Kernighan and the late Ritchie (K&R) ended up with unix and C?
Honestly, brilliant guys.
When C got its own standards committee they even rejected Ritchie's proposal to add fat pointers to C before it was too late to add them. Instead, we got the C abstract machine.
Thomas Jefferson famously said that "A coreutils rewrite every now and again is a good thing". Or something like that.
When I was a beta tester for System Vr2 Unix, I collected as many bug reports as possible from Usenet (I used the name "the shell answer man". Looking back I conclude that arrogance is generally inversely proportional to age) and sent a patch for each one I could verify. Something like 100 patches.
So if this rust rewrite cleans up some issues, it's a good thing.
At the current moment I would be against it. The language and library are changing too fast. Also, Rust has some other things that make it hard to use for coreutils. For example, Rust programs always call signal(SIGPIPE, SIG_IGN) or equivalent code before main(). There is no stable way to get the longstanding behavior of inheriting the signal action from the parent process [1]. This is quite annoying, but not unique to Rust [2].
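For the record, the usual workaround today goes through the libc crate, and it can only reset the disposition rather than inherit it, which is exactly the limitation described above:

    // Reset SIGPIPE to its default disposition at the top of main().
    // This forces SIG_DFL unconditionally instead of inheriting the
    // parent's signal action, so it is a workaround, not a fix.
    fn main() {
        unsafe {
            libc::signal(libc::SIGPIPE, libc::SIG_DFL);
        }
        // ... the actual program ...
    }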
I think the concern is that the writing may be on the wall for (the current memory-unsafe version of) Coreutils. Despite the bugs and incompatibilities, Canonical seems to have decided that the memory safety of uutils is worth it. And those two downsides, the bugs and incompatibilities, will likely attenuate quickly, compelling the other distros to follow suit in adopting uutils before long.
So the continued popularity of Coreutils might, I think, depend on Coreutils' near-term publicly announced and actual memory safety strategy. As I suggested in my other comment, there are (somewhat nascent) options for memory safety that do not require a rewrite of the code base. (For linux x86_64 platforms, depending on your requirements, that might include the "fanatically compatible" Fil-C.) And given the high profile of Coreutils, there are likely people willing to work with the Coreutils team to help in the deployment of those memory safety options.
I don't know if you're aware, but there is a demonstration of wget (a fellow "gnu utility", right?) being auto-translated to a memory-safe subset of C++ [1]. Because the translation essentially does a one-for-one substitution of potentially unsafe C elements with safe C++ counterparts that mirror the behavior, the translation should be much less susceptible to the introduction of new bugs and behaviors in the way a rewrite would be.
With a little cleaning-up of the original code, the code translation ends up being fully automatic and so can be used as a build step to produce (slightly slower) memory-safe executables from the original C source.
Filesystem access is mostly treated by users as serialized ACID transactions on "files in directories."
"Managing this resource centrally" is where unix syscalls came from. An OS kernel can be used like a specialized library for ACID transactions on hardware singletons.
People then got fancy with virtual memory, interrupts, signals, time-slicing, re-entrancy, thread-safety, and injectivity.
It doesn’t matter whether you call the "kernel library" from C, C++, Fortran, BASIC, Golang, bash, Rust, etc.
The list of GNU CVEs in the original article included a buffer overrun in tail from 2021. So for a fair comparison, 2021 is part of the "window of activity" (the year the uu_od CVE was published).
When K&R created unix and C, there was still the option of moving changes that were better off in the "kernel" into the kernel.
Now we have "standards" that even cause headaches between Linux and BSD's.
Linux back-propagates stuff like mmap, io_uring, etc. to where it belongs. In this way it is like the original unix. And deservedly running on most servers out there.
> What’s notable is that all of these bugs landed in a production Rust codebase, written by people who knew what they were doing
They knew how to write Rust, but clearly weren't sufficiently experienced with Unix APIs, semantics, and pitfalls. Most of those mistakes are exceedingly amateur from the perspective of long-time GNU coreutils (or BSD or Solaris base) developers, issues that were identified and largely hashed out decades ago, notwithstanding the continued long tail of fixes--mostly just a trickle these days--to the old codebases.
More than that: it seems that Rust stdlib nudges the developer towards using neat APIs at an incorrect level of abstraction, like path-based instead of handle-based file operations. I hope I'm wrong.
Nearly every available filesystem API in Rust's stdlib maps one-to-one with a Unix syscall (see Rust's std::fs module [0] for reference -- for example, the `File` struct is just a wrapper around a file descriptor, and its associated methods are essentially just the syscalls you can perform on file descriptors). The only exceptions are a few helper functions like `read_to_string` or `create_dir_all` that perform slightly higher-level operations.
And, yeah, the Unix syscalls are very prone to mistakes like this. For example, Unix's `rename` syscall takes two paths as arguments; you can't rename a file by handle; and so Rust has a `rename` function that takes two paths rather than an associated function on a `File`. Rust exposes path-based APIs where Unix exposes path-based APIs, and file-handle-based APIs where Unix exposes file-handle-based APIs.
So I agree that Rust's stdlib is somewhat mistake-prone; not so much because it's being opinionated and "nudg[ing] the developer towards using neat APIs", but because it's so low-level that it's not offering much "safety" in filesystem access over raw syscalls beyond ensuring that you didn't write a buffer overflow.
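To make the mapping concrete (hypothetical file names):

    use std::fs::{self, File};
    use std::io::{self, Read};

    fn demo() -> io::Result<()> {
        // File wraps a file descriptor: open(2), then read(2) on the fd.
        let mut f = File::open("data.txt")?;
        let mut buf = String::new();
        f.read_to_string(&mut buf)?;

        // fs::rename mirrors rename(2): two paths, no handle involved,
        // so the usual path-race (TOCTOU) caveats apply.
        fs::rename("data.txt", "data.bak")?;
        Ok(())
    }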
> For example, Unix's `rename` syscall takes two paths as arguments; you can't rename a file by handle
And then there’s renameat(2) which takes two dirfd… and two paths from there, which mostly has all the same issues rename(2) does (and does not even take flags so even O_NOFOLLOW is not available).
I’m not sure what you’d need to make a safe renameat(), maybe a triplet of (dirfd, filefd, name[1]) from the source, (dirfd, name) from the target, and some sort of flag to indicate whether it is allowed to create, overwrite, or both.
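For reference, renameat(2) is reachable today through the libc crate; a sketch (minimal error handling, and note the final components are still passed as names, with all the caveats above):

    use std::ffi::CString;
    use std::fs::File;
    use std::io;
    use std::os::unix::io::AsRawFd;

    // Rename `from` to `to`, both relative to an already-open
    // directory handle (File::open works on a directory on Linux).
    fn rename_within(dir: &File, from: &str, to: &str) -> io::Result<()> {
        let (from, to) = (CString::new(from)?, CString::new(to)?);
        // SAFETY: both pointers are valid NUL-terminated strings and
        // the descriptor stays open for the duration of the call.
        let rc = unsafe {
            libc::renameat(dir.as_raw_fd(), from.as_ptr(), dir.as_raw_fd(), to.as_ptr())
        };
        if rc == 0 { Ok(()) } else { Err(io::Error::last_os_error()) }
    }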
How about fd of the file you wanna rename, dirfd of the directory you want to move it into, and name of the new file? You could then represent a "rename within the same directory" as: dfd = opendir(...); fd = openat(dfd, "a"); rename2(fd, dfd, "b");
I can't think of a case this API doesn't cover, but maybe there is one.
The file may have been renamed or deleted since the fd was opened, and it might have been legitimate and on purpose, but there’s no way to tell what trying to resolve the fd back to a path will give you.
And you need to do that because nothing precludes having multiple entries to the same inode in the same directory, so you need to know specifically what the source direntry is, and a direntry is just a name in the directory file.
> So I agree that Rust's stdlib is somewhat mistake-prone; not so much because it's being opinionated and "nudg[ing] the developer towards using neat APIs", but because it's so low-level that it's not offering much "safety" in filesystem access over raw syscalls beyond ensuring that you didn't write a buffer overflow.
`openat()` and the other `*at()` syscalls are also raw syscalls, which Rust's stdlib chose not to expose. While I can understand that this may not be straightforward for a cross-platform API, I have to disagree with your statement that Rust's stdlib is mistake-prone because it's so low-level. It's more mistake-prone than POSIX (in some aspects) because it is missing a whole family of low-level syscalls.
They're not missing, Rust just ships them (including openat) as part of the first-party libc crate rather than exposing them directly from libstd. You'll find all the other libc syscalls there as well: https://docs.rs/libc/0.2.186/libc/ . I agree that Rust's stdlib could use some higher-level helper functions to help head off TOCTOU, but it's not as simple as just exposing `openat`, which, in addition to being platform-specific as you say, is also error-prone in its own right.
The parent was asking for access to the C syscall, and C syscalls are unsafe, including in C. You can wrap that syscall in a safe interface if you like, and many have. And to reiterate, I'm all for supporting this pattern in Rust's stdlib itself. But openat itself is a questionable API (I have not yet seen anyone mention that openat2 exists), and if Rust wanted to provide this, it would want to design something distinct.
> Why can I easily use "*at" functions from Python's stdlib, but not Rust's?
I'm not sure you can. The supported pattern appears to involve passing the optional `opener` parameter to `os.open`, but while the example of this shown in the official documentation works on Linux, I just tried it on Windows and it throws a PermissionError exception because AFAIK you can't open directories on Windows.
You can but you have to go through the lower level API: NtCreateFile can open a directory, and you can pass in a RootDirectory handle to following calls to make them handle-relative.
I took parent's message to be asking why the standard library fs primitives don't use `at` functions under the hood, not that they wanted the `at` functions directly exposed.
> why the standard library fs primitives don't use `at` functions under the hood
In this case it wouldn't seem to make sense to use `at` functions to back the standard file opening interface that Rust presents, because it requires different parameters, so a different API would need to be designed. Someone above mentioned that such an API is being considered for inclusion in libstd in this issue: https://github.com/rust-lang/rust/issues/120426
The correct comparison is to rustix, not libc, and rustix is not first-party. And even then the rustix API does not encapsulate the operations into structs the same way std::fs and std::io do.
The correct comparison to someone asking for first-party access to a C syscall is to the first-party crate that provides direct bindings to C syscalls. If you're willing to go further afield to third-party crates, you might as well skip rustix's "POSIX-ish" APIs (to quote their documentation) and go directly to the openat crate, which provides a Rust-style API.
If I have to use unsafe just to open a file, I might as well use C. While rustix is a happy middle that is usually enough and more popular than the openat crate, libc is in the same family as the "*-sys" crates and, generally speaking, it is not intended for direct use outside other FFI crates.
I agree it is an exaggeration in that of course you could write a wrapper. The point was that if everyone had to write their own FFI wrappers, Rust wouldn't go far and openat is not an exception.
There is code available at the right level of abstraction (the rustix or openat crates), and while it's not managed by the Rust team, uutils already have many third party dependencies. Bringing up libc just because it's first party, instead, is comparing apple to oranges.
There are lots of unstable things in Rust that have been unstable for many years, and the intentional segregation of unstable features means that it's a nonstarter for most use cases, like libraries. It's unstable because there are significant enough issues that nobody wants to mark it as stable, no matter what those issues are.
As long as it's unstable it's totally fair to say Rust's stdlib does not expose them. You might as well say it's fixed because someone posted a patch on a mailing list somewhere.
There are lots of unstable things in Rust that have been unstable for many years, but this isn't one of them. openat() was added in September, and the next PR in the series implementing unlinkat() and removeat() received a code review three weeks ago and is currently waiting on the author for minor revisions.
> As long as it's unstable it's totally fair to say Rust's stdlib does not expose them. You might as well say it's fixed because someone posted a patch on a mailing list somewhere
Agreed. My comment was intended to be read as "it's planned and being worked on", not "it's available".
After reading this article, I'm inclined to think that the right thing for this project to do is write their own library that wraps the Rust stdlib with a file-handle-based API along with one method to get a file handle from a Path; rewrite the code to use that library rather than rust stdlib methods, and then add a lint check that guards against any use of the Rust standard library file methods anywhere outside of that wrapper.
If that's the right approach, then it would be useful to make that library public as a crate, because writing such hardened code is generally useful. Possibly as a step before inclusion in the rust stdlib itself.
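A sketch of what that wrapper's surface might look like (names invented; this assumes the third-party rustix crate for openat, since std doesn't expose it yet):

    use std::fs::File;
    use std::io;
    use std::path::Path;

    use rustix::fs::{openat, Mode, OFlags};

    // The one place in the codebase where a full Path is resolved.
    pub struct Dir(File);

    impl Dir {
        pub fn open(path: &Path) -> io::Result<Dir> {
            Ok(Dir(File::open(path)?))
        }

        // Everything else resolves single names relative to the
        // already-open handle, never a whole path.
        pub fn open_file(&self, name: &str) -> io::Result<File> {
            let fd = openat(
                &self.0,
                name,
                OFlags::RDONLY | OFlags::NOFOLLOW | OFlags::CLOEXEC,
                Mode::empty(),
            )?;
            Ok(File::from(fd))
        }
    }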
Yeah. The idea is, if you're consistently making mistakes because of the most convenient API at your disposal (here, the rust standard library file/directory APIs that are based around Paths), then after you fix the actual bugs you should write a better abstraction and then deliberately add friction around not using that better abstraction, to try to constrain future developers (including future-you) from using the more error-prone abstraction.
Parse, don't validate is also a principle that encourages people to use a less-error-prone abstraction (the parsed data structure or an error representing invalid input), rather than a more-error-prone one (the original untyped data with ad-hoc validations at various call sites).
If anything, I find the rust standard library defaults to Unix too much for a generic programming language. You need to think very Unixy if you want to program Rust on Windows, unless you're directly importing the Windows crate and foregoing the Rust standard library. If you're writing COBOL-style mainframe programs, things become even more forced, though I suspect the overlap between Rust programmers and mainframe programmers that don't use a Unix-like is vanishingly small.
This can also be a pain on microcontrollers sometimes, but there you're free to pretend you're on Unix if you want to.
That's the same for the C or Python standard libraries. The difference is that in C you tend to use the Win32 functions more because they're easily reached for; but Python and Rust are both just as Unixy.
If you want to support file I/O in the standard library, you have to choose _some_ API, and that either is limited to the features common to all platforms, or it covers all features but calls that cannot be supported return errors, or you pick a preferred platform and require all other platforms to mimic it as best they can.
Almost all languages/standard libraries pick the latter, and many choose UNIX or Linux as the preferred platform, even though its file system API has flaws we’ve known about for decades (example: using file paths too often) or made decisions back in 1970 we probably wouldn’t make today (examples: making file names sequences of bytes; not having a way to encode file types and, because of that, using heuristics to figure out file types. See https://man7.org/linux/man-pages/man1/file.1.html)
You have to choose something, and I'm glad they didn't go with the idiotic Go approach ("every path is a valid UTF-8 string, or we just garble the path at the standard library level"). You can usually abstract away platform weirdness at the implementation level, but programming on non-Unix environments is more like programming against cygwin.
A standard library for files and paths that lacks things like ACLs and locks is weirdly Unixy for a supposedly modern language. Most systems support ACLs now, though Windows uses them a lot more. On the other hand, the lack of file descriptors/handles is weird from all points of view.
Had Windows been an uncommon target, I would've understood this design, but Windows is still the most common PC operating system in the world by a great margin. Not even considering things like "multile filesystem roots" (drive letters) "that happen to not exist on Linux", or "case insensitive paths (Windows/macOS/some Linux systems)" is a mistake for a supposedly generic language, in my opinion.
As far as I can tell from Microsoft's documentation, WinAPI access for ACLs was added in Windows 10, which Rust 1.0 predates. And std::fs attempts to provide both minimalist and cross-platform APIs, which in practice means (for better or worse) it's the lowest common denominator between Windows and Unix, with the objective being that higher-level libraries can leverage it as a building block. From the documentation for std::fs:
"This module contains basic methods to manipulate the contents of the local filesystem. All methods in this module represent cross-platform filesystem operations. Extra platform-specific functionality can be found in the extension traits of std::os::$platform."
Following its recommendation, if we look at std::os::windows::fs we see an extension trait for setting WinAPI-specific flags, like dwDesiredAccess, dwShareMode, dwFlagsAndAttributes. I'm not a Windows dev but AFAICT we want an API to set lpSecurityAttributes. I don't see an option for that in std::os::windows::fs, likely complicated by the fact that it's a pointer, so acquiring a valid value for that parameter is more involved than just constructing a bitfield like for the aforementioned parameters. But if you think this should be simple, then please propose adding it to std::os::windows::fs; the Rust stdlib adds new APIs all the time in response to demand. (In the meantime, comprehensive Windows support is generally provided by the de-facto standard winapi crate, which provides access to the raw syscall).
I'm not sure which docs you mean but that's not true. The NT kernel has used ACLs long before rust was invented. But it's indeed true that rust adds platform-specific methods based on demand. The trouble with ACLs is it means either creating a large API surface in the standard library to handle them or else presenting a simple interface but having to manage raw pointers (likely using a wrapper type but even then it can't be made totally safe).
> the de-facto standard winapi crate, which provides access to the raw syscall
Since the official Microsoft `windows-sys` crate was released many years ago, the winapi crate has been effectively unmaintained (it accepts security patches but that's it).
You misunderstand the documentation. Microsoft doesn't provide online documentation for versions of Windows that are no longer supported. Functions like SetFileSecurity have existed since Windows NT 3.1 back in 1993.
And sure, Rust could add the entire windows crate to the standard library, but my point is that this isn't just Windows functionality: getfacl/setfacl has been with us for decades but I don't know any standard library that tries to include any kind of ACLs.
> I'm glad they didn't go with the idiotic Go approach ("every path is a valid UTF-8 string" or we just garble the path at the standard library level")
Can you expound a bit on this? I haven't been able to find any articles related to this kind of problem. It's also a bit surprising, given that Go specifically did not make the same choice as Rust to make strings be Unicode / UTF-8 (Go strings are just arrays of bytes, with one minor exception related to iteration using the range syntax).
Go's docs put it like this: Path names are UTF-8-encoded, unrooted, slash-separated sequences of path elements, like “x/y/z”. If you operate on a path that's a non-UTF-8 string, then Go will do... something to make the string work with UTF-8 when passed back to standard file methods, but it likely won't end up operating on the same file.
Rust has OsStr to represent strings like paths, with a lossy/fallible conversion step instead.
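Concretely, on the Rust side (Unix-only sketch, since it pokes raw bytes into an OsStr):

    use std::ffi::OsStr;
    use std::os::unix::ffi::OsStrExt;

    fn main() {
        // A path whose last byte is not valid UTF-8.
        let raw = OsStr::from_bytes(b"caf\xc3\xa9-\xff");
        assert!(raw.to_str().is_none()); // fallible conversion refuses
        let lossy = raw.to_string_lossy(); // lossy conversion substitutes U+FFFD
        assert!(lossy.ends_with('\u{FFFD}'));
    }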
Go's approach is fine for 99% of cases, and you're pretty screwed if your application falls for the 1% issue. Go has a lot of those decisions, often to simplify the standard library for most use cases most people usually run into (like their awful, lossy, incomplete conversion between Unix and Windows when it comes to permissions/read-only flags/etc.).
> Path names are UTF-8-encoded, unrooted, slash-separated sequences of path elements, like “x/y/z”
This is only for the "io/fs" package and its generic filesystem abstractions. The "os" package, which always operates on the real filesystem, doesn't actually specify how paths are encoded, nor does its associated helper package "path/filepath".
In practice, non-UTF-8 already wasn't an issue on Unix-like systems, where file paths are natively just byte sequences. You do need to be aware of this possibility to avoid mangling the paths yourself, though. The real problem was Windows, where paths are actually WTF-16, i.e. UTF-16 with unpaired surrogates. Go has addressed this issue by accepting WTF-8 paths since Go 1.21: https://github.com/golang/go/issues/32334#issuecomment-15500...
So, yes, Go strings are just arrays of bytes in the language, but in the standard library, they’re supposed to be UTF-8 (the documentation isn’t immediately clear on how it handles non-UTF-8 strings).
I think this may be why the OP thinks the Go approach is “every path is a valid UTF-8 string”
Unfortunately, it's not the Rust stdlib, it's nearly every stdlib, if not every one. I remember being disappointed when Go came out that it didn't base the os module on openat and friends, and that was how many years ago now? I wasn't really surprised, the *at functions aren't what people expect and probably people would have been screaming about "how weird" the file APIs were in this hypothetical Go continually up to this very day... but it's still the right thing to do. Almost every language makes it very hard to do the right thing with the wrong thing so readily available.
I'm hedging on the "almost" only because there are so many languages made by so many developers, and if you're building a language in the 2020s it is probably because you've got some sort of strong opinion, so maybe there's one out there that defaults to *at-style file handling in the standard library because some language developer has the same strong opinions about this that I do. But I don't know of one.
Openat appeared in Linux in 2006 but not in FreeBSD until 2009; go started being developed in 2007. It probably missed the opportunity by a year. It would have been the right thing to change the os module at some point in the last 18 years, however.
Someone once coined a related term, "disassembler rage". It's the idea that every mistake looks amateur when examined closely enough. Comes from people sitting in a disassembler and raging at the high-level programmers who had the gall to e.g. use conditionals instead of a switch statement inside a function call a hundred frames deep.
We're looking solely at the few things they got wrong, and not the thousands of correct lines around them.
When I read the article I came away with the impression that shipping bugs this severe in a rewrite of utils used by hundreds of millions of people daily (hourly?) isn’t ok. I don’t think brushing the bad parts off with “most of the code was really good!” is a fair way to look at this.
Cloudflare crashed a chunk of the internet with a rust app a month or so ago, deploying a bad config file iirc.
Rust isn’t a panacea, it’s a programming language. It’s ok that it’s flawed, all languages are.
I think that legitimate real-world issues in rust code should be talked about more often. Right now the language enjoys a reputation that is essentially misleading marketing. It isn't possible to create a programming language that doesn't allow bugs to happen (even with formal verification you can still prove correctness based on a wrong set of assumptions). This weird, kind of religious belief that rust leads to magically completely bug-free programs needs to be countered and brought in touch with reality IMO.
Is it possible you’ve misunderstood what Rust promises?
> It isn't possible to create a programing language that doesn't allow bugs to happen
Yes, that’s true. No one doubts this. Except you seem to think that Rust promises no bugs at all? I don’t know where you got this impression from, but it is incorrect.
Rust promises that certain kinds of bugs like use-after-free are much, much less likely. It eliminates some kinds of bugs, not all bugs altogether. It’s possible that you’ve read the claim on kinds of bugs, and misinterpreted it as all bugs.
On the other hand, there are too many less-experienced Rust fans who do claim that "Rust" promises this and that any project that does not use Rust is doomed and that any of the existing decades-old software projects should be rewritten in Rust to decrease the chances that they may have bugs.
What is described in TFA is not surprising at all, because it is exactly what has been predicted about this and other similar projects.
Anyone who desires to rewrite in Rust any old project, should certainly do it. It will be at least a good learning experience and whenever an ancient project is rewritten from scratch, the current knowledge should enable the creation of something better than the original.
Nonetheless, the rewriters should never claim that what they have just produced currently has fewer bugs than the original, because neither they nor Rust can guarantee this; only long experience with using the rewritten application can.
Such rewritten software packages should remain for years as optional alternatives to the originals. Any aggressive push to substitute the originals immediately is just stupid (and yes, I have seen people trying to promote this).
Moreover, someone who proposes the substitution of something as basic as coreutils, must first present to the world the results of a huge set of correctness tests and performance benchmarks comparing the old package with the new package, before the substitution idea is even put forward.
Where are these rust fans? Are they in the room with us right now?
You’ve constructed a strawman with no basis in reality.
You know what actual Rust fans sound like? They sound like Matthias Endler, who wrote the article we’re discussing. Matthias hosts a popular podcast, Rust in Production, where he talks with people about sharp edges and difficulties they experienced using Rust.
A true Rust advocate like him writes articles titled “Bugs Rust Won’t Catch”.
> Such rewritten software packages should remain for years as optional alternatives to the originals.
> must first present to the world the results of a huge set of correctness tests and performance benchmarks
Yeah, you can see those in https://github.com/uutils/coreutils. This project has also worked with GNU coreutils maintainers to add more tests over time. Check out the graph where the total number of tests increases over time.
> before the substitution idea is even put forward
I partly agree. But notice that these CVEs come from a thorough security audit paid for by Canonical. Canonical is paying for it because they have a plan to substitute in the immediate future.
Without a plan to substitute it’s hard to advocate for funding. Without funding it’s hard to find and fix these issues. With these issues unfixed it’s hard to plan to substitute.
Those Rust fans exist on almost all Internet forums that I have seen, including on HN.
I do not care about what they say, so I have not made a list with links to what they have posted. But even only on HN, I certainly have seen much more than one hundred such postings, more likely several hundred, even on threads that did not have any close relationship with Rust, so there was no reason to discuss Rust.
Since Sun's shameless promotion of Java with false claims, during the last years of the previous century, there has not been any other programming language affected by such a hype campaign.
I think that this is sad. Rust has introduced a few valid innovations and it is a decent programming language. Despite this, whenever someone starts mentioning Rust, my first reaction is to distrust whatever is said, until proven otherwise, because I have seen far too many ridiculous claims about Rust.
Could you find one such person on this thread? Someone making ridiculous claims about what Rust offers.
I’ll tell you what I think you’ve seen - there are hundreds of threads where you’ve seen people claim they’ve seen this everywhere. That gives you the impression that it is universal.
The comment you linked says something specific about a specific kind of bug being eliminated - memory safety bugs. And they’re not making a claim, they’re repeating the evidence gathered from the Android codebase. So that’s a fact, memory safety bugs truly did not appear in the Rust parts of Android.
The comment you linked is not claiming Rust code is bug-free. That’s a strawman I’ve seen many, many times. Haters will claim that this happens all the time, but all I see are examples of the haters claiming this. You had to go back 5 months and still couldn’t find anything similar to the strawman.
The only language I've ever seen users make that claim for is Haskell. Rust users have never made the claim, but I've seen it a lot from advocates who appear to find "hello world" a complex, hard-to-write program.
I understand the (narrow) hard guarantees that Rust gives. But there are people in the wider community who think that the guarantees are much, much broader. This is a pretty widespread misconception that should be rectified.
Nobody believes Rust programs are bug free, though. Rust never promised that. It doesn't even promise memory safety; it only promises memory safety if you restrict yourself to safe APIs, which simply isn't always possible.
Or... the NSA wants you to think that the NSA wants you to think that the NSA believes that Rust is a memory-safe language, so that everyone who distrusts the NSA keeps using C.
I have never seen a comment claiming that Rust leads to magically completely bug free programs.
Could you please link one? Because I doubt it exists, or if it does, it is probably on some obscure website or downvoted to oblivion.
On the other hand, I see comments in every Rust thread that are basically restatements of yours attacking a strawman.
The reality: Rust does not prevent all bugs. In fact, it doesn't even prevent any bugs. What it actually does is make a certain particularly common and dangerous class of bugs much more difficult to write.
I didn't downvote, but I feel the last two points show a lack of nuance. It's saying "Rust doesn't prevent 100% of the bugs, like all other programming languages", while failing to acknowledge that if a programming language prevents entire classes of bugs, it's a very significant improvement.
Nobody disputes that Rust is one of the programming languages that prevent several classes of frequent bugs, which is a valuable feature when compared with C/C++, even if that is a very low bar.
What many do not accept among the claims of the Rust fans is that rewriting a mature and very big codebase from another language into Rust is likely to reduce the number of bugs of that codebase.
For some buggier codebases, a rewrite in Rust or any other safer language may indeed help, but I agree with the opinion expressed by many other people that in most cases a rewrite from scratch is much more likely to have bugs, regardless in what programming language it is written.
If someone has the time to do it, a rewrite is useful in most cases, but it should be expected that it will take a lot of time after the completion of the project until it will have as few bugs as mature projects.
As other people have mentioned, the goal of uutils was not "let's reduce bugs in coreutils by rewriting it in Rust", it was "it's 2013 and here's a pre-1.0 language that looks neat and claims to be a credible replacement for C, let's test that hypothesis by porting coreutils, giving us an excuse to learn and play with a new language in the process". It seems worth emphasizing that its creation was neither ideologically motivated nor part of some nefarious GPL-erasure scheme, it was just some people hacking on a codebase for fun.
Whether or not it was wise for Canonical to attempt to then take that codebase and uplift it into Ubuntu is a different story altogether, but one that has no bearing on the motivations of the people behind the original port itself.
You can see an alternative approach with the authors of sudo-rs. Rather than porting all of userspace to Rust for fun, they identified a single component of a particularly security-critical nature (sudo), and then further justified their rewrite by removing legacy features, thereby producing an overall simpler tool with less surface area to attack in the first place. It was not "we're going to rewrite sudo in Rust so it has fewer bugs", it was "we're going to rewrite sudo with the goal of having fewer bugs, and as one subcomponent of that, we're going to use Rust". And of course sudo-rs has had fresh bugs of its own, as any rewrite will. But the mere existence of bugs does not invalidate their hypothesis, which is that a conscientious rewrite of a tool can result in fewer bugs overall.
But are the current uutils developers the same as the 2013 developers? At least based on GitHub's graphs, that's not the case (it looks fairly bimodal to me), and so it wouldn't be unreasonable to treat the 2013-era project differently to the 2020-era project. So judging the 2020-era project for its current and ongoing failures does not seem unreasonable.
Similarly, sudo-rs dropping "legacy" features leaves a bad taste in my mouth. There are multiple privilege escalation tools that exist (doas being the first that comes to mind), and doing something better without claiming "sudo" (rather providing a compat mode, a la podman for docker) would to me seem a better long-term path than causing more breakage (and as shown by uutils, breakage in "core" utils can very easily lead to security issues).
I personally find uutils' lack of care to be concerning, because I've been writing (as a very low-priority side project) a network utility in Rust, and while it's not aiming to be a drop-in rewrite of anything, I would much rather not attract the same drama.
doas and sudo-rs occupy different niches, specifically doas aims for extreme minimalism and deliberately sacrifices even more compatibility than sudo-rs, which represents a middle ground.
No, once you have an MIT-licensed codebase without a copyright assignment scheme, you no longer have the freedom to relicense it at will. You could attempt to have a mixed-license codebase, which is supported by the GPL, and specify that all new contributions must accept the GPL, but this is tantamount to an incompatible fork of the project from the perspective of any downstream users, and anyone who insists on contributing code under the GPL has the freedom to perform this fork themselves.
This is simply false. You can accept GPL contributions and clearly indicate the names of the contributors as required by MIT. There is no "incompatible fork".
No, GPL and MIT have significantly different compliance requirements. You cannot suddenly begin shipping code with stricter compliance requirements to downstream users without potentially exposing them to legal liability.
> It seems worth emphasizing that its creation was neither ideologically motivated nor part of some nefarious GPL-erasure scheme, it was just some people hacking on a codebase for fun.
What the motivation and intent was in 2013 is not necessarily relevant to what the motivation and intent is now.
It's even less relevant to what the effect is: the goal may be to replace $FOO software with $BAR software, but as things stand right now $FOO is "GPL" and $BAR is "MIT".
So, yeah, I don't want them to succeed at their primary goal, because that replaces pro-user software with pro-business software.
Because the bugs were caused by programmer error, not anything inherent to Rust. It was more notable because Cloudflare is a critical dependency for half the internet, but that particular issue could've happened in any language.
This kind of melodramatic reaction to rust code is fatiguing, honestly. Rust does not bill itself as some programming panacea or as a bug free language, and neither do any of the people I know using it. That's a strawman that just won't go away.
Rust applies constraints regarding memory use and that nearly eliminates a class of bugs, provided safe usage. And that's compelling to enough people that it warrants migration from other languages that don't focus on memory safety. Bugs introduced during a rewrite aren't notable. It happens, they get fixed, life moves on.
> caused by programmer error, not anything inherent to Rust
Your argument does not work as praise for Rust, because the bugs in any program are caused by programmer errors, except in the very rare cases where the bugs are in the compiler toolchain, which are caused by other programmers' errors.
The bugs in a C or C++ program are also caused by programmer errors; they are not inherent to C/C++. It is rather trivial to write C/C++ carefully enough to make any out-of-bounds access, numeric overflow, use-after-free, etc. impossible.
The problem is that many programmers are careless, especially when pressed by tight schedules, so they make some of these mistakes. For the mass production of software, it is good to use stricter programming languages, including Rust, where the compiler catches as many errors as possible, instead of relying on better programmers.
The Cloudflare bug was the equivalent of an uncaught exception caused by a malformed config file. There's no recovery from a malformed config file; the software couldn't possibly have done its job. What's salient is that they were using an alternative to exceptions, because people were told exceptions were error-prone and that using this thing instead would make it easier to write bug-free code. But don't do the equivalent of not catching them!
And then, it turned out to not really be any better than exceptions.
Most Rust evangelism is like this. "In Rust you do X and this makes your code have fewer bugs!" Well, no it doesn't. Manually propagating errors still lets the program crash, requires more typing, and doesn't emit a stack trace.
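For what it's worth, the distinction being argued over fits in a few lines. A minimal sketch, with a hypothetical one-field config (not Cloudflare's actual code): the type system lets you unwrap your way into the same crash an uncaught exception gives you, or handle the failure deliberately at the top.
```
use std::fs;

// Hypothetical config, just for illustration.
struct Config {
    limit: usize,
}

fn parse_config(text: &str) -> Result<Config, String> {
    let limit = text
        .trim()
        .parse::<usize>()
        .map_err(|e| format!("bad limit: {e}"))?; // `?` propagates the error upward
    Ok(Config { limit })
}

fn main() {
    // The footgun: morally identical to an uncaught exception.
    // let cfg = parse_config(&fs::read_to_string("app.conf").unwrap()).unwrap();

    // The deliberate version: one place decides what a broken config means.
    match fs::read_to_string("app.conf")
        .map_err(|e| e.to_string())
        .and_then(|text| parse_config(&text))
    {
        Ok(cfg) => println!("limit = {}", cfg.limit),
        Err(e) => {
            eprintln!("refusing to start with a broken config: {e}");
            std::process::exit(1);
        }
    }
}
```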
That was why I brought it up. I wasn't trying to be snarky or haughty. Thank you for filling in the gaps, I should have done that instead of the 1-liner.
The "elimination of bugs" is not synonymous with "the elimination of all bugs". The way you're presenting it, any single bug in a rewrite would be grounds to consider the the entire endeavor a failure, which is a ridiculous standard.
There are plenty of strong arguments to be made against rewriting something in Rust, but this is a pretty weak one.
Thing is, these tools are so critical that even one error may cause systems to be compromised; rewriting them should never be taken lightly.
(Actually, ideally there would be formal verification tools that could accurately test for all of the issues found in this review/audit, like the very timing-specific path changes, but that's a codebase of its own.)
Is formal verification able to find most of these issues? I'm no expert on formal analysis, but I suspect most systems can't handle many of these errors. It seems more likely that the system will assume the file doesn't change between two syscalls, which covers the majority of the issues here. Modeling that possibility at least makes the formal system much harder to build.
Seems pretty impressive that they rewrote the coreutils in a new language, with so little Unix experience, and managed to do such a good job with so few bugs or vulns. I would have expected an order of magnitude more, at least.
Shows how good Rust is, that even inexperienced Unix devs can write stuff like this and make almost no mistakes.
Yes, it's the lack of Unix experience that's terrifying. So many of the mistakes listed are rookie mistakes, like not propagating the most severe errors, or the `kill -1` thing. Why were people who apparently did not have much experience using coreutils assigned to rewrite coreutils?
> Why were people who apparently did not have much experience using coreutils assigned to rewrite coreutils?
From what I understand, "assigned" probably isn't the best way to put it. uutils started off back in 2013 as a way to learn Rust [0] way before the present kerfuffle.
Yeah, perhaps learning UNIX APIs and Rust at the same time doesn't lead to a drop-in replacement ready to be shipped in major distributions. Who would have thunk it.
Strictly speaking it doesn't preclude eventually producing a production-ready drop-in replacement either, though evidently that needs a fresh set of eyes.
Why is it even possible to represent a negative PID, let alone treat the integer -1 as a PID meaning "every process you're permitted to signal"? This seems like a mistake (if not a rookie mistake) in the Linux kernel API itself.
Pretty much all the rough edges being discussed here are design mistakes in Linux or Unix, and/or a consequence of using an unsafe language with limited abstractions and a weak type system. But because of ubiquity, this is everyone’s problem now.
You are right, but those who set themselves the goal of substituting a Linux/UNIX package must implement programs that correctly handle all the quirks of the existing Linux/POSIX specifications.
If they do not like the design mistakes, great: they should set themselves the goal of writing a new operating system, together with all its base applications, where all these mistakes are corrected.
As long as they have chosen the first goal and not the second, they are constrained by the existing interfaces and must use them correctly, no matter how inconvenient that may be.
Anyone who learns English may be frustrated by its many design mistakes, but they must still use English as the natives speak it, otherwise they will not be understood.
-1 is a special case, a way to represent a PID with all bits set in a platform-independent way. It's not very clean, and it comes from ancient times when writing some extra code and storing an extra few bytes was way more expensive.
The problem is that -DIGIT doubles as both "signal number" and process group. The right way to invoke kill for a process group however would be "kill [OPTS]... -- -PGID".
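For reference, the cases kill(2) distinguishes in its pid argument; a sketch of the POSIX semantics using the libc crate (return-value checking elided), not the uutils code:
```
// POSIX kill(2) pid-argument semantics, annotated.
fn send_signal(pid: i32, sig: i32) -> i32 {
    unsafe {
        // pid >  0  : signal exactly that process
        // pid == 0  : signal every process in the caller's process group
        // pid == -1 : signal every process the caller may signal (except init)
        // pid < -1  : signal every process in the process group -pid
        libc::kill(pid, sig)
    }
}
```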
Not necessarily, but was the reasoning sound, and have the tradeoffs been weighed? The website (https://uutils.github.io/) shows some reasonable "why"s, although I disagree with making "Rust is more appealing" a compelling reason, but that's just me (disclaimer: I don't like C and don't know Rust, so take this comment as you will). What I think is missing is how they will ensure both compatibility and security / edge-case handling, which requires deep knowledge and experience of the original code and "tribal knowledge" of deep *nix internals.
Yes, perfectly good code can have bugs. It is ridiculous thinking to scrap a codebase because it's not bug-free, only to replace it with one riddled with differences in behavior that break everything that uses it.
Understandable as GNU was founded on software freedom. I guess one could argue that the Rust rewrite is to establish some kind of higher standard for correctness.
That depends on what tests you are running. In any significant project you need a test suite so large that you wouldn't run all the tests before pushing to CI; instead you run the targeted tests for the area of code you changed, while broader "integration tests" exercise your code too and could break even though you never ran them locally.
You can also run static analysis that takes too long to run locally every time, but that once in a while will point out that "this code pattern is legal but is almost always a bug".
It is also possible to do some formal analysis of code in CI that you wouldn't always run locally; I'm not an expert on these.
That's true in general. In this case where the logic bugs are from not understanding the API being implemented (and in any similar case), tests wouldn't catch the bugs either (even integration tests) because good tests require understanding the contract of the unit being tested.
And writing comprehensive tests for this behaviour is very difficult regardless of which language you are using.
I am all for rust rewrites of things. But in this case, these are mistakes which were encouraged by the lazy design of `std::fs` and the developers' lack of relevant experience.
And to clarify, I don't blame the developers for lacking the relevant experience. Working on such a project is precisely the right place to learn stuff like this.
I think it's an absurdly dumb move by Canonical to take this project and beta-test it on normal users' machines though…
Reading that Canonical thread was jaw-dropping. Paraphrased: "Rust is more secure, security is our priority, therefore deploying this full-rewrite of core utils is an emergency. If things break that's fine, we'll fix it :)".
I would not want to run any code on my machines made by people who think like this. And I'm pro-Rust. Rust is only "more secure" all else being equal. But all else is not equal.
A rewrite necessarily has orders of magnitude more bugs and vulnerabilities than a decades-old well-maintained codebase, so the security argument was only valid for a long-term transition, not a rushed one. And the people downplaying user impact post-rollout, arguing that "this is how we'll surface bugs", and "the old coreutils didn't have proper test cases anyway" are so irresponsible. Users are not lab rats. Maintainers have a moral responsibility to not harm users' systems' reliability (I know that's a minority opinion these days). Their reasoning was flawed, and their values were wrong.
If you don't want Canonical's packages, you should probably just be using Debian rather than Ubuntu. It's not 2008 anymore, stock Debian is quite user-friendly.
Worth noting is that in Debian experimental coreutils defaults to coreutils-from-uutils [0]. This came as a big surprise and as far as I can tell there's been no discussion. A Canonical developer seems to have unilaterally overwritten the coreutils package without discussing with the maintainer. All the package renames that are in Ubuntu aren't in Debian so you can't switch to GNU utils either without deep trickery in a separate recovery environment.
I'm used to running experimental software but I wasn't ready for my computer to not boot one day because of uutils. The `-Z` flag for `cp` wasn't implemented in the 9 month old version shipped in Debian at that time so initramfs creation failed...
It's in experimental only, not unstable or testing. That said, I'm surprised it hasn't even prompted discussion on debian-devel (sans [0]). I would've thought that at least enough Debian developers run experimental to have noticed and raised the issue, but no. I thought about starting a thread myself but couldn't be bothered.
There aren't true 1:1 clones, but there's ripgrep (inspired by GNU grep) and fd (inspired by GNU find). Those two I like, though. I think they're thoughtfully designed and in ripgrep's case at least (I just haven't read posts/comments by fd's author), it was developed with some close study of other grep implementations. I still use GNU grep and GNU find as well, but rg and fd are often nice for me.
This leaves such a bad taste in my mouth. If you fucking found 44 CVEs with some relatively amateurish ones (I'm no security engineer but even I've done that exact TOCTOU mitigation before) in such a core component of your system a month before 26.04 LTS release (or a couple months if you count from their round 1), surely the response should be "we need to delay this to 28.04 LTS to give it time to mature", not "we'll ship this thing in LTS anyway but leave out the most obviously problematic parts"?
The snap BS wasn't enough to move me, since I was largely unaffected once I stripped it out, but this might finally convince me to ditch.
It's insane that this is going into an LTS. It's the kind of experiment I'd expect them to play with in a non-LTS and revert in LTSes until it's fully usable, like they did with Wayland as the default, which started in 2017.
I’ve gotta agree. Some horror stories were going around about their interview process. It seemed highly optimized to select people willing to put up with insane top-down BS.
Having panics in these is pretty amateur hour, even just at a Rust level. I could see it if they were alloc errors, which you can't handle, but expects and unwraps are inexcusable unless you are very carefully guarding them with invariants that prevent that code path from ever running.
One thing that's hard about rewriting code is that the original code was transformed incrementally over time in response to real world issues only found in production.
The code gets silently encumbered with those lessons, and unless they are documented, there's a lot of hidden work that needs to be done before you actually reach parity.
TFA is a good list of this exact sort of thing.
Before you call people amateurs for it, also consider that it's one of the most softwarey things about writing software. It was bound to happen unless coreutils had really good technical docs and included tests for these cases that they ignored.
> we cannot accept any changes based on the GNU source code [..]. It is however possible to look at other implementations under a BSD or MIT license like Apple's implementation or OpenBSD.
The wording of that clearly implies that you should not look at GNU source code in order to contribute to uutils.
good example from the article: the chroot+nss CVE. the rule that nss is dynamic and dlopens libraries from inside the chroot isn't anywhere obvious. it's encoded in 25+ years of sysadmins finding it out. clean-room rewrites end up re-learning that, usually as new CVEs. and LLM ports of the same code inherit the problem: the function signature is what they read, but the scars are what they need.
> The code gets silently encumbered with those lessons, and unless they are documented, there's a lot of hidden work that needs to be done before you actually reach parity.
It should be stressed that failure to document such lessons, or at least the bugs/vulnerabilities avoided, is poor practice. Of course one can't document the bugs/vulnerabilities one has avoided implicitly by writing decent code to begin with, but it is important to share these lessons with the future reader, even if that means "wasting" time and space on a bunch of documentation such as "In here we do foo instead of bar because when we did bar in conditions ABC then baz happens which is bad because XYZ."
I struggle to find anything in this post that wouldn't be caught by some kind of unit test or manual review, especially when comparing with the GNU source for the coreutils. The whole coreutils rewrite is a terrible idea[1] and is clearly being done in the wrong way (without the knowledge gained from the previous software).
If you do a rewrite, you should fully understand and learn from the predecessor, otherwise you're bound to repeat all the same mistakes. Embarrassing.
To be clear; I love Rust, I use it for various projects, and it's great. It doesn't save you from bad engineering.
> I struggle to find anything in this post that wouldn't be caught by some kind of unit test or manual review, especially when comparing with the GNU source for the coreutils.
> If you do a rewrite, you should fully understand and learn from the predecessor, otherwise you're bound to repeat all the same mistakes. Embarrassing.
Interestingly, the uutils project uses the GNU coreutils test suite.
EDITED to add: they also have a stated position of not allowing contributions based on reading the GPL'd source.
welcome new systems programmers: unix is broken and you must write ugly non-pedagogical workarounds and do empirical testing. this is what reliable software and good software engineering actually is... surprise!@#%
> The pattern is always the same. You do one syscall to check something about a path, then another syscall to act on the same path. Between those two calls, an attacker with write access to a parent directory can swap the path component for a symbolic link. The kernel re-resolves the path from scratch on the second call, and the privileged action lands on the attacker’s chosen target.
It's actually somewhat worse than that, because an attacker with write access to a parent directory can mess with hard links as well... sure, that only affects the regular files themselves, but there are basically no mitigations. See e.g. [0] and other posts on that site.
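To make the check-then-act shape concrete, a minimal std-only sketch; the uid test is hypothetical, it just stands in for "check something about the path, then act on it":
```
use std::fs::{self, OpenOptions};
use std::os::unix::fs::MetadataExt;

// Racy: the path is resolved twice, so an attacker who controls a parent
// directory can swap a component for a symlink between the two calls.
fn racy(path: &str) -> std::io::Result<()> {
    let meta = fs::metadata(path)?; // syscall 1: stat(path)
    if meta.uid() == 0 {
        // syscall 2: open(path) re-resolves the path from scratch
        OpenOptions::new().write(true).open(path)?.set_len(0)?;
    }
    Ok(())
}

// Safer shape: resolve once, then check and act through the same fd.
// (Still not a complete fix: the single open itself follows symlinks
// unless you add O_NOFOLLOW via OpenOptionsExt::custom_flags.)
fn safer(path: &str) -> std::io::Result<()> {
    let f = OpenOptions::new().write(true).open(path)?; // one resolution
    let meta = f.metadata()?; // fstat(fd): guaranteed to be the same file
    if meta.uid() == 0 {
        f.set_len(0)?; // ftruncate(fd): still the same file
    }
    Ok(())
}
```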
To the extent that locking exists in POSIX, it is various degrees of useless and broken. And as far as I know, while the BSDs have extensions which make some use cases workable, Linux is completely hopeless.
The root cause of some of the bugs seems to be the opaque nature of some of the Unix API.
E.g.
> The trap is that get_user_by_name ends up loading shared libraries from the new root filesystem to resolve the username. An attacker who can plant a file in the chroot gets to run code as uid 0.
To me, such a get_user_by_name function is like a booby trap, an accident waiting to happen. You need user data, you have this get_user_by_name function, and then it goes off and starts loading shared libraries.
This smells like a mixing of concerns to me. I'd say either split getting the user data and loading any shared libraries into two separate functions, or somehow make clear in the function name what it is doing.
> The root cause of some of the bugs seems to be the opaque nature of some of the Unix API.
"Seems" and "smells" are weasel words. The root cause is not thinking: why is root chrooting into a directory it does not control?
Whatever you chroot into is under the control of whoever made that chroot, and if you cannot understand this you have no business using chroot().
> To me such a get_user_by_name function is like a booby trap
> I'd say, either split getting the user data and loading any shared libraries in two separate functions, or somehow make it clear in the function name what it is doing.
You'd probably still be in the trap: there's usually very little difference between writing to newroot/etc/passwd and newroot/usr/lib/x86_64-linux-gnu/libnss_compat.so or newroot/bin/sh or anything else.
So I think there's no reason for /usr/sbin/chroot to look up the user id in the first place (toybox chroot doesn't!), and therefore the bug was doing the lookup at all.
> The root cause is not thinking: Why is root chrooting into a directory they do not control?
Because you can't call chroot(2) unless you're root. And "control a directory" is weasel wording; root technically controls everything in one sense of the word. It can also gain full control (in a slightly different sense of the word) over a directory: kill every single process owned by the owner of that directory, then don't setuid into that user in this process or in any other process that root currently executes, or will execute, until you're done with this directory. But that's just not useful in practice, is it?
Secure things should be simple to do, and potentially unsafe things should be possible.
The CVE itself uses the language "If the NEWROOT is writable by an attacker" which could refer to a shared library (as indicated in the report), or even a passwd file as would have been true since the origin of chroot()
> root technically controls everything in one sense of the word.
But not the sense we're talking about.
> Because you can't call chroot(2) unless you're root
Well you can[1], but this is /usr/sbin/chroot aka chroot(8) when used with a non-numeric --userspec, and the point is to drop root to a user that root controls with setuid(2). Something needs to map user names to the numeric userids that setuid(2) uses, and that something is typically the NSS database.
Now: Which database should be used to map a username to a userid?
- The one from before the chroot(2)?
- Or the one that you're chroot(2)ing into
If you're the author of the code in question, you chose the latter, and that is totally obvious to anyone who can read, because that's the order the code appears in; but it's also obvious that only the *first* one is under the control of root, and so only the first one could be correct.
[1]: if you're curious: unshare(CLONE_USERNS|CLONE_FS) can be used. this is part of how rootless containers work.
No, you can't; it's an entirely different syscall that does something vaguely similar. IMHO there are a few too many root-restricted operations that should not have been; but they are, so we're stuck with setuid-enabled "confused deputies". Arguably, it's root that should be prohibited from calling chroot(2).
> Now: Which database should be used to map a username to a userid? If you're the author of the code in-question, you chose the latter
That's the problem: the choice is implicit. If the author had moved the setuid/setgid calls way up in the call order, the implicit choice would also have been the safe one, but that was literally impossible.
> unshare(CLONE_USERNS|CLONE_FS) can be used
Wait, CLONE_USERNS? That's not a real flag. Did you mean CLONE_NEWUSER?
> Did you mean CLONE_NEWUSER? [...] it's an entirely different syscall that does something vaguely similar
Yes. And I agree, but it also enables chroot(2) to work without being root, which was the syscall we are talking about, and which I still maintain is not as important as reading.
> arguably, it's the root that should be prohibited from calling chroot(2).
> IMHO there are a bit too many root-restricted operations that should not have been
It's a popular opinion. It's also cheap. So what?
> so we're stuck with setuid-enabled "confused deputies"
chroot(8) is not setuid-enabled. This has nothing to do with anything.
> That's the problem: the choice is implicit. If the author moved setuid/setgid calls way up in the call order, the implicit choice would've also been the safe one but it was literally impossible.
False. The setuid/setgid calls are in the right place. The lookup of the database mapping usernames to userids is in the wrong place.
If the Rust programmer just read what they wrote, they would see this.
If you just read what they wrote, you would see this.
If the attacker can control newroot/etc/passwd they _still_ get getpwnam to return whatever userid they want. The solution is to not lookup --userspec=username:group inside the chrooted-space, but from outside.
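In code the fix is purely a question of ordering. A rough sketch with the libc crate of what chroot(8) with --userspec has to do (supplementary groups and most error handling elided; the function name is made up):
```
use std::ffi::CString;
use std::io;

unsafe fn chroot_as_user(newroot: &str, user: &str) -> io::Result<()> {
    // 1. Resolve the username with the HOST's NSS, before chroot(2)
    //    swaps /etc/passwd and the NSS modules out from under us.
    let name = CString::new(user).unwrap();
    let pw = libc::getpwnam(name.as_ptr());
    if pw.is_null() {
        return Err(io::Error::new(io::ErrorKind::NotFound, "no such user"));
    }
    let (uid, gid) = ((*pw).pw_uid, (*pw).pw_gid);

    // 2. Only now enter the new root.
    let root = CString::new(newroot).unwrap();
    if libc::chroot(root.as_ptr()) != 0 || libc::chdir(b"/\0".as_ptr().cast()) != 0 {
        return Err(io::Error::last_os_error());
    }

    // 3. Drop privileges last: gid first, then uid.
    if libc::setgid(gid) != 0 || libc::setuid(uid) != 0 {
        return Err(io::Error::last_os_error());
    }
    Ok(())
}
```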
> The root cause of some of the bugs seems to be the opaque nature of some of the Unix API.
Some, maybe, but if you've decided to rewrite coreutils from scratch, understanding the POSIX APIs is literally your entire job.
And in any case, their test for whether a path was pointing to the fs root was `file == Path::new("/")`. That's not an API problem, the problem is that whoever wrote that is uniquely unqualified to be working on this project.
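For contrast, the dev/ino identity check that a coreutils maintainer recommends elsewhere in this thread is barely longer, and std exposes everything needed; a sketch:
```
use std::fs;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

// Two paths name the same file iff their (st_dev, st_ino) pairs match,
// which no amount of string comparison on the path can establish.
fn is_fs_root(path: &Path) -> std::io::Result<bool> {
    let a = fs::metadata(path)?; // stat(2), follows symlinks
    let b = fs::metadata("/")?;
    Ok(a.dev() == b.dev() && a.ino() == b.ino())
}
```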
Interestingly, it looks like the `file == Path::new("/")` bit was basically unchanged from when it was introduced... 12 (!) years ago [0] (though back then it was `filename == "/"`). The change from comparing a filename to a path was part of a change made 8 months ago to handle non-UTF-8 filenames.
> That's not an API problem, the problem is that whoever wrote that is uniquely unqualified to be working on this project.
To be fair, uutils started out with far smaller ambitions. It was originally intended to be a way to learn Rust.
> Some, maybe, but if you've decided to rewrite coreutils from scratch, understanding the POSIX APIs is literally your entire job.
Yes, it is. But still, such traps in an API are just unacceptable. If you design an API that requires obscure knowledge to use correctly, and getting it wrong means privilege escalation, it is just... just... I have no words for it. It is beyond stupidity. You are making sure that your system will get these privilege escalations, and not just once, but multiple times.
Rather, I think that using a safe, functional-style language tricks people into thinking that the data it deals with is stateless, whereas many, many things in an operating system change all the time.
Until we have a filesystem that can present a snapshot, everything has to be checked all the time.
I.e. we need an API which gives input -> good result or failure, not input -> good result or failure or error.
Right? Canonical wanted (still wants?) to use a coreutils implementation where "rm ./" would print "invalid input" while silently deleting the directory anyway.
I don't really care that some very amateur enthusiasts wrote some bad code for fun, but how in the world did anyone who knows anything about linux take this seriously as a coreutils replacement?
I'm totally fine with people experimenting and making amateur attempts at what adult people do. After all, that's how we grow. What I'm actually curious about is how the decision-making chain at Ubuntu got so messed up that this made it into production.
> What’s notable is that all of these bugs landed in a production Rust codebase, written by people who knew what they were doing
So does this mean that the original utils didn't have any test harness, and that the process of rewriting them didn't start by creating one either?
Sure, there are many edge cases, but surely the OS and FS can just be abstracted away so you can verify that "rm .//" actually ends up doing what is expected (such as not deleting the current directory)?
This doesn't seem like sloppy coding, nor a critique of the language; it's just the same old "Oh, this is systems programming, we don't do tests"?
Alternatively: if the original utils _did_ have tests, and there were this many holes in them, then maybe there is a massive gap in the original utils' test suite?
My understanding is the uutils development process involved extensive testing against the behaviour of the original utilities, including preserving bugs.
But we still have CVEs for trivial things? I mean, just a medium-sized test suite for "rm" alone should probably be many thousands of test cases. And you'd think that deleting "." and "./" respectively would be among them? Hindsight is always 20/20, and for anything involving text input you can never be entirely covered, but still...
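And the kind of test being asked for is cheap to write. A sketch (hypothetical test; run it only in a scratch directory, since against a non-conforming rm it would actually delete the cwd's contents):
```
use std::process::Command;

// POSIX requires rm to refuse any operand whose final pathname component
// is "." or "..", so a conforming rm must fail on all of these.
#[test]
fn rm_refuses_dot() {
    for arg in [".", "./", ".//"] {
        let out = Command::new("rm")
            .args(["-rf", arg])
            .output()
            .expect("failed to spawn rm");
        assert!(!out.status.success(), "rm -rf {arg} should be refused");
    }
}
```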
> So does this mean that the original utils didn't have any test harness, and that the process of rewriting them didn't start by creating one either?
Yes.
> Sure, there are many edge cases, but surely the OS and FS can just be abstracted away so you can verify that "rm .//" actually ends up doing what is expected (such as not deleting the current directory)?
I think people have been trying that since before I was born and haven't yet been successful, so I am much less sure than you are.
For example: How do you decide how many `/` characters to try?
For a better one: can you imagine if "rm" could simply decide to refuse to delete files containing "important" as their first 9 bytes? How would you think of a test for something like that without knowing the letters in that order? What if the magic word wasn't in a dictionary?
> This doesn't seem like sloppy coding, nor a critique of the language, it's just the same old "Oh, this is systems programming, we don't do tests"?
I've never heard anyone say that except as a straw man.
I've heard people say tests don't do what people think they do.
> Sure, there are many edge cases, but surely the OS and FS can just be abstracted away so you can verify that "rm .//" actually ends up doing what is expected?
This is one reason why Windows disables symlinks by default, and it's not an abstraction but wholesale removal of a feature. Unixes can't do that without breaking decades of software that relies on their existence.
macOS does something similar; for example, the chroot() bug isn't an issue there in practice because macOS forbids chroot() by default (you need to disable System Integrity Protection).
The fundamental problem is caused by the POSIX APIs. They have sharp edges by their very nature. The "fix" is to remove them.
To be fair, these are mostly gotchas with Linux and not Rust itself, but I guess Rust's std could handle some of these issues, in that a standard library should not let you shoot yourself in the foot by default.
> These are noisy in test code where panicking on bad data is exactly what you want. The cleanest way to scope them to non-test code is to put #![cfg_attr(test, allow(clippy::unwrap_used, clippy::expect_used, clippy::panic, clippy::indexing_slicing, clippy::arithmetic_side_effects))] at the top of each crate root, or to gate #[allow(...)] on the individual #[cfg(test)] modules.
Clippy doesn't even run on unit tests by default. Honestly it doesn't seem very useful to have it do so for ordinary development, but maybe you'd want to run Clippy on your unit tests in CI just to be extra safe, in which case you could encode those allowed lints in the line of your CI config where you run `cargo clippy`, e.g. `cargo clippy --all-targets -- -A clippy::unwrap_used -A clippy::expect_used -A clippy::panic -A clippy::indexing_slicing -A clippy::arithmetic_side_effects`, if you really didn't want to have them in the source for whatever reason.
Delaying the run of clippy until CI would be annoying, because then you'd get a build failure for something that was preventable and could have been quickly addressed during development, before pushing. It just feels like a pebble in your shoe.
I have to partially disagree with applying Hyrum's law here. In the case of core utils, there's not just the common GNU version; there's also what POSIX says they should do, what the various BSDs do, plus other implementations from various vendors that we mostly forget about. If what this version of coreutils does differs from what GNU does in a way that the others also differ, breaking the behavior would be a good thing: any script relying on it is already wrong in ways that are going to matter in the real world, and may matter in the future anyway, so breaking it now is good. If your script depends on GNU's behavior, then you shouldn't be calling the standard version; you should be explicitly specifying the GNU version. That is, don't use cp, use GNU cp under whatever name it is commonly installed. Or check which version of cp you have.
But if you seek to replace coreutils (as at least is the case with Canonical it seems), rather than just be another POSIX userland implementation (e.g. busybox), then I would suggest you do need to be bug-compatible? I can apt/dnf/apk install busybox and use that for my user rather than coreutils, but given a significant amount of Linux infrastructure (including likely many personal scripts) are tied to coreutils, the bar is much higher. Given the numerous issues with quality Canonical has had, not just with Ubuntu but their other "commercial" tooling, I'm not sure any rewrite/port, written in rust or otherwise, with Canonical developing, managing, or even being associated with the project can meet the requisite bar.
As someone who prefers BSD, I would make it my goal to become something reasonably popular on Linux that isn't needlessly different, just to force less reliance on the GNUisms in its core utils. Nothing wrong with GNUisms on the command line, but there are a lot of GNU assumptions in scripts that should be portable.
Thanks for the list. I like these lists, so I can put them into a .md file, then launch "one agent per file" on my codebase and see if they can find anything similar to the mentioned CVEs.
Most (if not all) of these issues do not matter at all outside the scope GNU utils run in.
For example, using filepaths instead of FDs does not matter in most cases in controlled server environments, or in processes that will never run with elevated privilege (most apps).
> Most (if not all) of these issues do not matter at all outside the scope GNU utils run in.
I suspect that attitude is how we got ourselves into this mess.
You have to assume you ultimately don't control what scope your software runs in. Obviously you do, 99.999% of the time. The other 0.001% is when someone has found another vulnerability that lets them run your program with elevated privileges in an environment you didn't expect, and then they can use it to exploit one of these bugs. Almost all exploits use a chain of vulnerabilities, each one seemingly mostly harmless; your "no one can ever exploit this weakness in my program because I control the environment" will be just one step in the chain.
That sounds far-fetched. And it is, in the sense that it almost never happens, but systems nonetheless were and are exploited because of it. Once the solution was added in 2006 (openat() and friends), it should never have happened again. And indeed in the GNU utils it can't.
The people who built Rust's std::fs should have been aware of the problem and its solution, because it was written in 2015. std::path was written at the same time, and that is where the change has to be made. It's not a big change either: std::path would have to translate the path into an OS descriptor and use that instead of the path, but only where available. I suspect the real issue was they had the same attitude as you: they thought it affects such a small percentage of programs that it didn't really matter. That, and it's a little bit of extra work.
It was a pity they had that attitude, because the extra work would have avoided this mess.
Nope! But basically, expect anything that resolves usernames or host names to be done in userspace by NSS.
Sun engineers Thomas Maslen and Sanjay Dani were the first to design and implement the Name Service Switch. They fulfilled Solaris requirements with the nsswitch.conf file specification and the implementation choice to load database access modules as dynamically loaded libraries, which Sun was also the first to introduce.
Sun engineers' original design of the configuration file and runtime loading of name service back-end libraries has withstood the test of time as operating systems have evolved and new name services are introduced. Over the years, programmers ported the NSS configuration file with nearly identical implementations to many other operating systems including FreeBSD, NetBSD, Linux, HP-UX, IRIX and AIX.[citation needed] More than two decades after the NSS was invented, GNU libc implements it almost identically.
musl has its own approach to this: it's called nscd.
It would have avoided the "running code as root" part, but it would still allow an attacker to control the result of the function call.
I mean, the problem being solved here isn't exactly a bad problem to try to solve. You either permanently hard-code `/etc/passwd` as the user database, and `/etc/resolv.conf` as the source of DNS server information, or you allow these to be handled in a more complex way (thus allowing YellowPages, LDAP, or whatever you can imagine).
Obviously, if you tie the ability to handle those things to your filesystem layout, either by loading dynamic libraries from whatever is in /usr/lib, or by reading /etc/whatever.conf, or even by providing a whole virtual mount à la /proc, chroot'ing gives you both the ability to override the system-wide policy for yourself (pretty reasonable for DNS lookups, kinda dubious for username lookups) and the opportunity to accidentally pwn yourself.
Frankly, sometimes I feel that on Linux, root should be restricted to executing/loading only a whitelist of executables/shared objects, identified by the hash of their contents, not by file paths. But then again, you'd need an allow_for_root(1) utility to maintain this whitelist, and people absolutely will call it in their setup scripts in all kinds of dubious manners.
The "kill -1" is hilarious. I wouldn't use ubuntu for production for quite awhile while things shake out or, probably, never (since i don't use ubuntu).
Unrelated, but also in the category of bugs Rust won't catch natively: there are crates that allow C++-style contracts or, more generally, dependent typing, and they can be used to catch issues at compile time rather than at runtime. I use this one, anodized.
For core system functionality, maybe. But for most applications, Rust's slow compiler iteration speed becomes a bottleneck when the likes of TypeScript (with Bun) and Go have sub-second iteration times.
Plus, AI is also good at catching, in other languages, the errors that Rust tooling enforces: race conditions, use-after-free, buffer overflows, lifetimes, etc.
So maybe AI will become the ultimate "Rust checker" for any language.
In my experience developing different types of applications in Rust, the claims of a "slow compiler" are overstated. Sub second iteration times are definitely a thing in Rust as well, unless you're adding a new dependency for the first time or building fresh.
Our experiences clearly differ, then. And for others as well, since it's a common complaint.
Countless times I have seen other people complain as well. There are even articles about it. I can't find the YouTube link now, but recently a gamedev abandoned Rust over compilation speed alone, because iteration speed was paramount to their creative process.
Handwaving isn't going to make it any better. And thinking Go/TS compilation speeds are comparable to Rust's is a handwave and a half, to say the least.
Cargo check and friends are subpar for AI, because agents actually need to run the thing and its unit tests for efficient agentic loops.
A single loop might recompile and rerun the application/unit tests enough times that slow compilers like Rust's and Scala's become detrimental.
I think you could have left it at differing experiences and not gone further saying I'm handwaving anything. That doesn't seem productive.
I'm not saying that Rust compilation time is comparable to Go/TS, I'm saying the blanket claim that Rust iteration speed will be a bottleneck requires context.
I definitely agree with you that it is a complaint that is often repeated online, but that doesn't make it universally true. In my experience it's a claim that is often echoed without proper context.
Particularly in the case of AI, Rust recompilation times in my experience have not been the dominant cost; they are overshadowed by inference time, the agent working through different approaches, etc.
The overall productivity increase I get from not having to worry so much about whether my Rust code will work if it compiles tends to net faster iteration for me. Compile times have never bothered me.
Rust does not hate GNU, and I'm not sure why anyone would have that misconception. It would be like saying that C hates GNU because the BSDs aren't GNU. The fact that there is less GNU-licensed Rust software than MIT-licensed Rust software is attributable to the simple fact that, in general, GNU has been ceding ground to MIT for more than 20 years.
> uutils now runs the upstream GNU coreutils test suite against itself in CI. That’s the right scale of defense for this class of bug.
That's the minimum; it is absurd that they did not start from that!
I recall the last time there was a massive bug in the uutils project, it was because the coreutils tests didn't cover some crucial aspect people relied on. Running these tests is useful for compatibility and all, but it won't necessarily catch security issues.
I believe they did run them all along. Maybe it was not automated? They boasted in news posts multiple times about how many coreutils tests they were passing. I suspect those tests are useless for security; they are more about compatibility or something like that.
The title of this article should be "Rust can't stop you from not giving a fuck" or "Rust can't give a fuck for you."
---
> What’s notable is that all of these bugs landed in a production Rust codebase, written by people who knew what they were doing
...
[List of bugs a diligent person would be mindful of, unix expert or not]
---
The only conclusion I can draw is that, unfortunately, the people writing these tools are not good software developers, certainly not sufficiently good for this line of work.
For comparison, I am neither a unix neckbeard nor a Rust expert, but with the magic of LLMs I am using Rust to write a music player. The number of tokens I've sunk into watching for undesirable panics or dropped errors is pretty substantial. Why? Because I don't want my music player to suck! Simple as that. If you don't think about panics or errors, your software is going to be erratic, unpredictable and confusing.
Now, coreutils isn't my hobby music player, it's fundamental Internet infrastructure! I hate sounding like a Breitbart commenter, but it is quite shocking to see the lack of basic thought going into writing what is meant to be critical infrastructure. Wow, honestly pathetic. Sorry to be so negative and for this word choice, but "shock" and "disappointment" are mild terms here for me.
Anyway, thanks to the author of this post! This is a red flag that should be distributed far and wide.
> Pretty shocking to see the lack of basic thought going into writing what is meant to be critical infrastructure
uutils did not start off as "let's make critical infrastructure in Rust"; it started off as "coreutils are small and have tests, so we're rewriting them in Rust for fun". As a result, a bunch of cleanup work has been needed.
Okay, thanks for the context, but aren't distributions eager to adopt these? Are current GNU coreutils a common vulnerability vector?
> For fun
My idea of fun is reviewing my code and making sure I'm handling errors correctly so that my software doesn't suck. Maybe the people doing this for fun should be more aligned with that mentality?
I love Rust, but I wonder if this is an example of the idea that its excellent type system can lull some people into a false sense of security. Particularly when interfacing to low-level code like kernel APIs, which are basically minefields inadvertently designed to trick the unwary, the Rust guarantees are undermined. The extent of this may not be immediately obvious to everyone.
This seems to be the case, yes. Before reading this post I was a lot more open minded about the "rewrite it in Rust" scene but now I'm just kind of in a horrorpit wondering whether I'll be stuck on macOS forever :(.
Creative but implausible excuse. MacOS is a better OS for consumers than Windows. But if you're a developer or other technical person, nothing stops you from using Linux today.
Right but coming from macOS, how do I know that the Linux distro I pick doesn't have this god-forsaken stuff in it? Before this thread I didn't know Canonical was so... busted. What else do I not know? With macOS, I think I can be sure that this kind of stuff won't be in the core shell commands :).
When I do `man builtin` on macOS now, I get:
```
HISTORY
The builtin manual page first appeared in FreeBSD 3.4.
```
which is what I expected, and I don't expect those to be pulled out from under me and replaced with the sort of nonsense we have here today.
I don't think that is the case. I think the people that wrote this are simply bad programmers. Some of these issues are so obvious that if you've been doing any amount of programming, you should be able to anticipate them, whether you're writing C, Rust, or Java.
So yeah, their implementation of chmod checked whether a path was pointing to the root of the filesystem with `if file == Path::new("/")`.
How the f** did this sub-amateur slop end up in a big-name linux distribution? We've de-professionalized software engineering to such a degree that people don't even know what baseline competent software looks like anymore
> Rust’s standard library makes this easy to get wrong. The ergonomic APIs you reach for first (fs::metadata, File::create, fs::remove_file, fs::set_permissions) all take a path and re-resolve it every time, rather than taking a file descriptor and operating relative to that. That’s fine for a normal program, but if you’re writing a privileged tool that needs to be secure against local attackers, you have to be careful.
It's not fine even for a normal program, because operations on a large number of files will end up an order of magnitude slower, no matter what language you write your utility in.
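Concretely, the fd-relative calls exist and are reachable from Rust through the libc crate even though std::fs doesn't surface them; a sketch (error handling elided):
```
use std::ffi::CString;

// stat(2) entries relative to an already-open directory fd, so the kernel
// resolves one component per call instead of re-walking the whole
// (possibly very deep) path every time.
unsafe fn stat_entries(dirfd: libc::c_int, names: &[&str]) {
    for name in names {
        let c = CString::new(*name).unwrap();
        let mut st: libc::stat = std::mem::zeroed();
        if libc::fstatat(dirfd, c.as_ptr(), &mut st, libc::AT_SYMLINK_NOFOLLOW) == 0 {
            // ... use st.st_size, st.st_mode, ...
        }
    }
}
```
Crates like rustix and cap-std wrap the same *at-family calls in safe APIs, which is the usual way around the std::fs limitation.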
... reads the article to the end, marvels at all the problems resulting from not understanding how the OS works and missing 40 years of refinement ...
> That means, even if the tools were (and probably still are) buggy, they never had a bug that could be exploited to read arbitrary memory.
Well, that raises the question: is it worse to read arbitrary memory (which in most cases would probably be prevented by various dynamic protections [0] anyway), or to fail to prevent rm -rf /./ and kill every process in the system, etc.?
This is still a good case study of the value of the much-touted Rust rewrites. Usually they are performed by people who are domain experts in Rust but (as seen here) lack basic domain knowledge of the tool's environment.
Many cases, including as a last resort as part of shutdown, to try to trigger remaining services into a graceful exit (although these days cgroups help avoid ever being in such a situation).
I know nobody's perfect and I'm not asking for perfection, but these bugs are pretty alarming? It seems like these supposed coreutils replacements are being written by people who don't know anything about Unix, and also didn't even bother looking at the GNU tools they are trying to replace. Or at least didn't have any curiosity about why the GNU tools work the way they do. Otherwise they might've wondered about why things operate on bytes and file descriptors instead of strings and paths.
I hate to armchair general, but I clicked on this article expecting subtle race conditions or tricky ambiguous corners of the POSIX standard, and instead found that it seems to be amateur hour in uutils.
> It seems like these supposed coreutils replacements are being written by people who don't know anything about Unix, and also didn't even bother looking at the GNU tools they were supposed to be replacing.
They're a group of people who want to replace pro-user software (GPL) with pro-business software (MIT).
They are deliberately not looking at coreutils code because the Rust versions are released as MIT and they don't want the project contaminated by GPL. I am not fond of this, personally.
1. uutils as a project started back in 2013 as a way to learn Rust, by no means by knowledgeable developers or in a mature language
2. uutils didn't even consider becoming a replacement for GNU Coreutils until... roughly 2021, I think? 2021 is when they started running compliance/compatibility tests, anyway
3. The choice of licensing (made in 2013) effectively forbids them from looking at the original source
I find it interesting how people will criticise Rust for not preventing all bugs, when the alternative languages don't prevent those same bugs nor the bugs rust does catch. If you're comparing Rust to a perfect language that doesn't exist, you should probably also compare your alternative to that perfect language as well right?
I'd be interested in a comparison of the number of bugs and CVEs in GNU coreutils at the start of its lifetime with this rewrite. Same with the number of memory bugs that are impossible in (safe) Rust.
What's the point of a "rewrite in Rust" when it introduces bugs that either never existed in the original or were fixed already?
> I'd be interested in a comparison of the number of bugs and CVEs in GNU coreutils at the start of its lifetime
The point is, those bugs had been discovered and fixed decades ago. Do you want to wait decades for coreutils_rs to reach the same robustness? Why do a rewrite when the alternative is to help improve the original which is starting from a much more solid base?
And even when a complete rewrite would make sense, why not do a careful line-by-line porting of the original code instead of doing a clean-room implementation to at least carry over the bugfixes from the original? And why even use the Rust stdlib at all when it contains footguns that are not acceptable for security-critical code?
Idk, you should ask the maintainers these questions, or the Ubuntu maintainers. I'm not particularly arguing in favour of this rewrite, but the title and contents of the post are about Rust in general and the types of bugs it can and can't prevent.
Perhaps one good reason is that once the initial bugs are fixed, over time the number of security issues will be lower than in the original. If it can reach the same level of stability and robustness in months or a small number of years, the downsides aren't totally obvious. We will have to wait to judge, I suppose. Maybe it's not worth it, and that's fine, but that doesn't speak to Rust as a language.
The Rust developers have not read the original coreutils, because they want to replace the GPL license, so they want to be able to say that their code is not derived from the original coreutils.
For a project of this kind, this seems a rather stupid choice, and it is enough to make the rewritten tools hard to trust.
Even supposing that replacing the GPL license were an acceptable goal, that would make sense only for a library, not for executable applications. For executable applications, avoiding the GPL makes sense only when you want to extract parts of them and insert them into other programs.
This allows the learnings of uutils (and, by extension, GNU coreutils) to be leveraged by any other project that needs the same functionality. On a quick scan of the dependents of uucore, I noticed that other projects (like nushell) do so.
> What's the point of a "rewrite in Rust" when it introduces bugs that either never existed in the original or were fixed already?
Because you are trying to remove memory safety as a source of bugs in the future. No code is bug free, but removing entire categories of bugs from a code base is a good thing.
You're right, but it's gonna be hard to stop them from raging. In many ways people want to be justified in a "see, I told you so, Rust is useless" belief, and they're willing to take one or two questionable logical steps to get there.
"The alternative languages" - in this case you're talking about C, 99% of the time.
So let's talk about that. Well-written C code, especially for the purpose of writing and continuing to maintain the mature GNU coreutils, is not a big risk in terms of CVEs. Between an inexperienced Rust developer and an extremely experienced C developer (who's been through all the motions), I'd say the latter is likely the safer option.
qmail was at one point the second most widely deployed email server, handling the majority of online mail. It wasn't a research project; it's not obscure. Yahoo used to use it.
And what I mean by track record: more than a decade after the last published version, a theoretical attack was found, one requiring special setup uncommon for a sysadmin and impossible ten years prior.
When anyone thinks about how to build reliable secure software, I think they should be thinking of qmail because it really has no public source-available equal, except maybe djbdns.
seL4, on the other hand, makes some specious claims about a ten-year-old version of itself, and so few people have even heard of it that you thought it important to point out that it is "technically" C. qmail isn't like that at all: there is no prover, no test suite, and almost no metaprogramming of any kind. It's just C.
I do recognize sarcasm when I see it. But statistically, that could be true, considering the amount of C code running (probably far less than COBOL or FORTRAN), compared to the relatively small amount of Rust code versus the number of faults observed with it.
What an incredibly dishonest argument. Obviously "well-written C code" won't be riddled with CVEs, by definition; the problem is that programs written in C are in fact littered with CVEs, so it turns out it's really, really difficult to write C well, even for the best developers. With Rust, that entire class of problems is simply eliminated.
This is what happens when many people hype a technology that solves one specific class of vulnerabilities: it is not designed to prevent the others, such as logic errors caused by human / AI error.
Granted, the uutils authors are well experienced in Rust, but that is not enough for a large-scale rewrite like this, and you can't assume it's "secure" because of memory safety.
In this case, this post tells us that Unix itself has thousands of gotchas, that re-implementing the coreutils in Rust is not a silver bullet, and that even the bugs Unix (and the POSIX standard) has are part of the specification and can later be revealed as real vulnerabilities.
I'm not sure that they were all that experienced in Rust when most of this code was written. uutils has been a bit of a "good first rust issue" playground for a lot of its existence
Which makes it pretty unsurprising that the authors also weren't all that well versed in the details of the low-level POSIX APIs.
I feel like one of the takeaways here is that Rust protects your code as long as what your code is doing stays predictably in-process. Touching the filesystem is always rife with runtime failures that your programming language just can't protect you from. (Or maybe it also suggests the `std::fs` API needs to be reworked to make some of these occurrences, if not impossible, at least harder.)
On a separate note: I have a private "coretools" reimplementation in Zig (not aiming to replace anything, just for fun), and I'm striving to keep it 100% Zig with no libc calls anywhere. Which may or may not turn out to be possible, we'll see. However, cross-checking uutils I noticed it does have a bunch of unsafe blocks that call into libc, e.g. https://github.com/uutils/coreutils/blob/77302dbc87bcc7caf87.... Thankfully they're pretty minimal, but every such block can reduce the safety provided by a Rust rewrite.
> and I'm striving to keep it 100% Zig with no libc calls anywhere. Which may or may not turn out to be possible, we'll see.
Probably will depend on what platform(s) you're targeting and/or your appetite for dealing with breakage. You can avoid libc on Linux due to its stable syscall interface, but that's not necessarily an option on other platforms. macOS, for instance, can and does break syscall compatibility and requires you to go through libSystem instead. Go got bit by this [0]. I want to say something similar applies to Windows as well.
This Unix StackExchange answer [1] says that quite a few other kernels don't promise syscall compatibility either, though you might be able to somewhat get away with it in practice for some of them.
Since it's a personal project, Linux compatibility is the only thing I care about right now. I'm testing it under WINE as well, just because I can, but I don't have access to macOS, so I'm skipping that problem entirely for now.
Hi, I am one of the maintainers of GNU Coreutils. Thanks for the article, it covers some interesting topics. In the little Rust that I have used, I have felt that it is far too easy to write TOCTOU races using std::fs. I hope the standard library gets an API similar to openat eventually.
I just want to mention that I disagree with the section titled "Rule: Resolve Paths Before Comparing Them". Generally, it is better to make calls to fstat and compare the st_dev and st_ino. However, that was mentioned in the article. A side effect that seems less often considered is the performance impact. Here is an example in practice:
I know people are very unlikely to do something like that in real life. However, GNU software tends to work very hard to avoid arbitrary limits [1].
Also, the larger point still stands, but the article says "The Rust rewrite has shipped zero of these [memory saftey bugs], over a comparable window of activity." However, this is not true [2]. :)
[1] https://www.gnu.org/prep/standards/standards.html#Semantics [2] https://github.com/advisories/GHSA-w9vv-q986-vj7x
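For readers who haven't used the fstat approach described above: a minimal Rust sketch of the st_dev/st_ino comparison (Unix-only; `same_file` is a made-up helper name, and error handling is kept minimal):

use std::fs::File;
use std::io;
use std::os::unix::fs::MetadataExt;

// fstat(2) both handles and compare (st_dev, st_ino): two handles refer
// to the same file iff the device and inode numbers both match.
fn same_file(a: &File, b: &File) -> io::Result<bool> {
    let (ma, mb) = (a.metadata()?, b.metadata()?);
    Ok(ma.dev() == mb.dev() && ma.ino() == mb.ino())
}

Because the comparison happens on already-open handles, no path needs to be resolved a second time, which is what makes it both cheaper and less racy than canonicalizing paths.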
Sorry, complete noob here. Why didn't you just cd into $(yes a/ | head -n $((32 * 1024)) | tr -d '\n')? Why do you need to use the while loop for cd?
EDIT: got it. -bash: cd: a/a/a/....../a/a/: File name too long
No need to apologize at all. Doing it in one cd invocation would fail since the file name is longer than PATH_MAX. In that case passing it to a system call would fail with errno set to ENAMETOOLONG.
You could probably make the loop more efficient, but it works good enough. Also, some shells don't allow you to enter directories that deep entirely. It doesn't work on mksh, for example.
Facetious reply:
> However, GNU software tends to work very hard to avoid arbitrary limits [1].
Yes? The quote says "tends to", and you still can cd into that directory, albeit not in a single invocation. Windows has similar limitations [0], it's just that their MAX_PATH is just 260 so it's somewhat more noticeable... and IIRC the hard limit of 32 K for paths in non-negotiable.
[0] https://learn.microsoft.com/en-us/windows/win32/fileio/maxim...
Isn’t "cd" a unix syscall , because it changes the process's working directory? There was something written somewhere that it cannot be a unix utility for this very reason, but has to be a shell built-in. The syscall is a "single operation" from the point of a single-threaded process.
What did I get wrong there?
Side note: Missing useful output.
Yes, it’s a shell builtin that makes the shell execute a chdir() syscall. Therefore it isn’t subject to argument length limits imposed by the kernel when executing processes. But it is still subject to path length limits imposed by the kernel’s implementation of chdir() itself. While the shell may be a GNU project (bash), the kernel generally is not (unless you are running Hurd), so this isn’t GNU’s fault per se.
However, the shell could theoretically chunk long cd arguments into multiple calls to chdir(), splitting on slashes. I believe this would be fully semantically correct: you are not losing any atomicity guarantees because the kernel doesn’t provide such guarantees in the first place for lookups involving multiple path components. I’m not surprised that bash doesn’t bother implementing this, and I don’t know if I’d call that an “arbitrary limitation” on bash’s part (as opposed to a lack of workaround for another component’s arbitrary limitation). But it would be possible.
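A rough sketch of that chunking idea in Rust, assuming Unix and taking one path component per chdir(2) call (the simplest, if slowest, way to split on slashes):

// Sketch: emulate `cd` into a path deeper than PATH_MAX by issuing one
// chdir(2) per path component, so no single syscall argument is ever
// longer than one component. Error handling kept minimal.
use std::env;
use std::io;
use std::path::Path;

fn chunked_cd(path: &Path) -> io::Result<()> {
    for component in path.components() {
        // Each call maps to one chdir(2); the kernel never sees the
        // full (over-long) path in a single lookup.
        env::set_current_dir(component.as_os_str())?;
    }
    Ok(())
}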
> What did I get wrong there?
Nothing; you just missed some other considerations. For instance, Linux generally follows POSIX. That's what the 2004 version has to say about chdir's errors:
However, later versions of POSIX moved "the length of the path argument exceeds {PATH_MAX}" into the optional ("may fail") part.
Not any longer, unless you keep the default enabled for backwards compatibility with older Windows software.
It's not a GNU limit. It's in Linux: https://github.com/torvalds/linux/blob/v6.19/include/uapi/li...
First of all, thank you for presenting a succinct take on this viewpoint from the other side of the fence from mine.
So how can I learn from this? (Asking very aggressively, especially for Internet writing, to make the contrast unmistakable. And contrast helps with perceiving differences and mistakes.) (You also don’t owe me any of your time or mental bandwidth, whatsoever.)
So here goes:
Question 1:
How come "speed", "performance", race conditions and st_ino keep getting brought up?
Speed (latency), physically writing things out to storage (sequentially, atomically (ACID), on any of HDD, NVMe, SSD, ODD, FDD, tape; "Haskell monad", event horizons, the finite speed of light and information, whatever), as well as race conditions, all seem to boil down to the same thing. For reliable systems like accounting, the path seems to be ACID or the highway. And "unreliable" systems forget fast enough that computers don't seem to really make a difference there.
Question 2:
Does throughput really matter more than latency in everyday application?
Question 3 (explanation first, this time):
The focus on inode numbers is at least understandable with regards to the history of C and unix-like operating systems and GNU coreutils.
What about this basic example? Just make a USB thumb drive "work" for storing files (ignoring NAND flash decay and USB quirks), without getting tripped up by libc IO buffering, fflush, kernel buffering (Hurd, if you prefer it over Linux or FreeBSD), or more than one application running on a multi-core and/or time-sliced system (to really weed out single-core CPUs running only a single user-land binary with blocking IO).
> Does throughput really matter more than latency in everyday application?
In my experience latency and throughput are intrinsically linked unless you have the buffer-space to handle the throughput you want. Which you can't guarantee on all the systems where GNU Coreutils run.
Higher throughput increases the risk of high latency.
Low latency increases the risk of "wasted cycles", i.e. it lowers (machine) throughput. It helps with human discovery throughput, though.
The sled.rs people had a very readable take on this in their performance guide.
> Question 2:
> Does throughput really matter more than latency in everyday application?
IME as a user, hell yes
Getting a video I don't mind if it buffers a moment, but once it starts I need all of that data moving to my player as quickly as possible
OTOH if there's no wait, but the data is restricted (the amount coming to my player is less than the player needs to fully render the images), the video is "unwatchable"
I don't mean to nitpick, but absolute values for both of these matter much less than how much it is compared to "enough". As long as the throughput is enough to prevent the video from stuttering, it doesn't matter if the data is moved to your video player program at 1 GB/s or 1 TB/s. Conversely, you say you don't mind if a video buffers for a moment but I'm willing to bet there's some value of "a moment" where it becomes "too long". Nobody is willing to wait an hour buffering before their video starts.
The perception of speed in using a computer is almost entirely latency driven these days. Compare using `rg` or `git` vs loading up your banking website.
Hell no.
Linux desktop (and the kernel) felt awful for such a long time because everyone was optimizing for server and workstation workloads. It's the reason CachyOS (and before that, linux-zen and Liquorix) are a thing.
For good UX, you heavily prioritize latency over throughput. No one cares if copying a file stalls for a moment or takes 2 seconds longer if that ensures no hitches in alt-tabbing, scrolling, or mouse movement.
How many talks have you seen at USENIX that care about UNIX as desktop OS?
Exactly.
When Con Kolivas introduced a scheduler optimized for desktop latency, about 15 years ago, the amount of abuse he got from Linux developers was astonishing, and he ended up quitting for good. I remember compiling it on my laptop and noticing how it made a huge improvement in the usability of X and the desktop environment.
This also really bugs me. To be fair, the gcc people are unkind to both servers and laptops.
This isn't what prioritizing throughput actually looks like in most scenarios.
In the example you gave the amount of read speed the user needs to keep up with a video is meager and greater read speed is meaningless beyond maintaining a small buffer.
You in fact notice it more if your process is sometimes starved of CPU, IO, or memory, or is waiting on swap, etc. Conversely, you would in most cases not notice nearly as much if the entire thing got slower, even much slower, as long as its meager resources were quickly available to the thing you are doing right now.
What's every day?
Exactly, lots of different things.
When I alt-tab I care about latency.
When I ssh I care about latency.
When I download a 25GB game I care about throughput for the download, to an extent that is probably mainly ISP bound rather than local system bound. I don't care if the download takes 10 or 11 minutes as long as I can still use my system with zero delays meanwhile. And whether it takes 11 minutes or 3 hours depends mostly on my ISP. But being responsive to me while it downloads is local latency bound.
The Youtube example you have makes sense, sure.
Just want to point out that race conditions are a correctness problem, not a performance problem.
An accurate, a.k.a. "correct", implementation of ACID needs a single (central) source of truth and temporal serializability (or something close to that).
In practice this always "impacts" performance.
If I understand it correctly, then in physics this is called an event horizon.
Not necessarily. Most race conditions violate the `A` in ACID, but the finicky thing about atomicity is that a sequence of N > 1 actions, each atomic in and of itself, is not itself atomic. So any atomic store is possible to misuse if you can compose multiple atomic operations on it.
In addition, ACID isn't always provided by the floor beneath your programs; it can instead come from designing the programs on top to uphold it and/or not require it, allowing you to relax the constraints on your lower-level interfaces for performance reasons.
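A toy Rust illustration of that composition pitfall: each operation below is atomic on its own, but the check-then-store pair is not:

use std::sync::atomic::{AtomicUsize, Ordering};

static COUNTER: AtomicUsize = AtomicUsize::new(0);

// Each line below is an atomic operation, but the pair is not: another
// thread can modify COUNTER between the load and the store, losing updates.
fn racy_increment() {
    let current = COUNTER.load(Ordering::SeqCst);
    COUNTER.store(current + 1, Ordering::SeqCst);
}

// Composing correctly means expressing the whole thing as one atomic step.
fn safe_increment() {
    COUNTER.fetch_add(1, Ordering::SeqCst);
}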
Firstly, atomicity and/or thread-safety not composing is where the Consistency and Isolation come in.
The "application layer" always has to enforce its own consistency guarantees. If the lower layers are total garbage, then the system is garbage. And the "speed" of the lower layers can be infinitely fast and it doesn’t matter, if the application has a latency floor. So optimize it all you want.
Coreutils are not only used in interactive contexts. They are the primitives that make up the countless shell scripts which glue systems together. Any edge case will be encountered and the resulting poor performance will impact somebody, somewhere.
Here's a related example of what happens when you change a shell primitive's behavior - even interactively. Back in the 2000s, Linux distributions started adding color output to the ls command via a default "alias ls=/bin/ls --color=auto". You know: make directories blue, symlinks cyan, executables purple; that kind of thing. Somebody thought it would be a nice user experience upgrade.
I was working at a NAS (NFS remote box) vendor in tech support. We frequently got calls from folks who had just switched to Linux from Solaris, or had just moved their home directories from local disk to NFS. They would complain that listing a directory with a lot of files would hang. If it came back at all, it would be in minutes or hours! The fix? "unalias ls". Because calling "/bin/ls" would execute a single READDIR (the NFS RPC), which was 1 round-trip to the server and only a few network packets; but calling "/bin/ls --color=auto" would add a STAT call for every single file in the directory to figure out what color it should be - sequentially, one-by-one, confirming the success of each before the next iteration. If you had 30,000 files with a round-trip time of 1ms that's 30 seconds. If you had millions...well, either you waited for hours or you power-cycled the box. (This was eventually fixed with NFSv3's READDIRPLUS.)
Now I'm sure whoever changed that alias did not intend it, but they caused thousands of people thousands of hours of lost productivity. I was just one guy in one org's tech support group, and I saw at least a dozen such cases, not all of which were lucky enough to land in the queue of somebody who'd already seen the problem.
So I really appreciate GNU coreutils' commitment to sane behavior even at the edges. If you do systems work long enough, you will ride those edges, and a tool which stays steady in your hand - or script - is invaluable.
In short, NFS has a terrible data model and only pretends to be a file system.
Hence why even on UNIX people moved on from NFS, but on Linux it keeps being the remote filesystem many reach for.
For me it was the path of least resistance. I do use WebDAV more now, since Copyparty supports it out of the box, but I would be open to suggestions.
Samba/SMB, Network protocols like WebDAV, S3, Docker, OneDrive,....
NFS is more annoying on Linux than just using Samba though, at least for the NAS use case. With Samba on my server I can just browse to it in KDE's file manager Dolphin, and samba configuration is a relatively straight forward ini style file on the server. A pair of ports also need to be opened in the host firewall.
Contrast that with NFS, which last I looked needed several config files, matching account IDs between hosts, and mounting as root, and would hang processes if the connection was lost. At least I hear rpcbind is gone these days.
I don't think anyone sane uses NFS on Linux either these days. And it is rather funny that the protocol Microsoft invented is what stuck and became practical between Linux hosts.
NetApp has NFS support and is widely used.
That's the first I have heard of NetApp. It seems to be an enterprise-focused company with more than one product. I'm not sure which product of theirs you refer to.
Synology, TrueNAS and Proxmox probably also have NFS support I would assume, and they definitely have Samba. Those are more relevant to me personally.
I just run a normal headless Linux distro on my NAS computer, I don't see the point of a specialised NAS distro. It too could have NFS if I wanted it, but it currently has Samba, because it is easier and works better.
So in conclusion, I'm not sure what your point is? Doesn't NetApp support anything except NFS?
No, any remote system would have the same problem if one expected to use it as if it were local.
Not quite. For persistence latency, yes.
For read-only access there could be way better caching, especially for common use cases like listing the contents of a filesystem directory. But stuff like this was excluded on purpose.
NFS is really stupid.
NFS made the assumption that a distributed system with over 100 times the latency of a local system could be treated like a local system in every single way.
I am not sure why this means "NFS is really stupid" if the user assumes that a distributed file system can be treated just like a local system. That it provides the same interface is what makes NFS extremely useful.
Additional point:
The point of data storage is to be a singleton.
(Backups are desirable, anyhow.)
To be fair, the Vec::set_len bug in Rust was in 2021. And even then it had to be annotated as `unsafe`. It was then deprecated and a linter check was added: https://github.com/rust-lang/rust-clippy/issues/7681
To be even fairer, it wasn't actually memory unsafety, it was "just" unsoundness: there was a type that, IF you gave it a weird io reader implementation, would let that implementation see uninit data or expose uninit data elsewhere. But the only readers actually used were well-behaved readers.
> well behaved readers.
Around and around we go.
Vec::set_len is by no means deprecated. The lint you linked only covers a very specific unsound pattern using set_len.
Indeed, and it doesn't need to be deprecated, because it's an API explicitly designed to give you low-level control where you need it, and because it is appropriately defined as an `unsafe` function with documented safety invariants that must be manually upheld in order for usage to be memory-safe. The documentation also suggests several other (safe) functions that should be used instead when possible, and provides correct usage examples: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.set... .
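For concreteness, a sketch of the unsound pattern that lint targets, next to the sound alternative (the function name is made up):

use std::io::Read;

// UNSOUND (the pattern the lint catches), left as a comment so this compiles:
//
//     let mut buf = Vec::with_capacity(n);
//     unsafe { buf.set_len(n); }    // length grown, contents uninitialized
//     reader.read_exact(&mut buf)?; // an arbitrary Read impl now sees uninit bytes
//
// Sound alternative: initialize first, then let the reader overwrite.
fn read_n(reader: &mut impl Read, n: usize) -> std::io::Result<Vec<u8>> {
    let mut buf = vec![0u8; n];
    reader.read_exact(&mut buf)?;
    Ok(buf)
}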
> and because it is appropriately defined as an `unsafe` function with documented safety invariants that must be manually upheld in order for usage to be memory-safe.
Didn't we learn from C (and isn't it the entire raison d'être for Rust) that coders cannot be trusted to follow rules like this?
If coders could "(document) safety invariants that must be manually upheld in order for usage to be memory-safe," there'd be no need for Rust.
This is the tautology underlying Rust, as I see it.
No, this is mistaken. Rust provides `unsafe` functions for operations where memory-safety invariants must be manually upheld, and then forces callers to use `unsafe` blocks in order to call those functions, and then provides tooling for auditing unsafe blocks. Want to keep unsafe code out of your codebase? Then add `#![forbid(unsafe_code)]` to your crate root, and all unsafe code becomes a compiler error. Or you could add a check in your CI that prevents anyone from merging code that touches an unsafe block without sign-off from a senior maintainer. And/or you can add unit tests for any code that uses unsafe blocks and then run those tests under Miri, which will loudly complain if you perform any memory-unsafe operations. And you can add the `undocumented_unsafe_blocks` lint in Clippy so that you'll never forget to document an unsafe block. Rust's culture is that unsafe blocks should be reserved for leaf nodes in the call graph, wrapped in safe APIs whose usage does not impose manual invariant management on downstream callers. Internally, those APIs represent a relatively minuscule portion of the codebase upon which all your verification can be focused. So you don't need to "trust" that coders will remember not to call unsafe functions needlessly, because the tooling is there to have your back.
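A small sketch of what those knobs look like in practice, assuming the snippet sits at a crate root (the function is a made-up example):

// At the crate root: either forbid unsafe outright...
// #![forbid(unsafe_code)]
// ...or allow it, but have Clippy flag any unsafe block lacking a SAFETY comment:
#![warn(clippy::undocumented_unsafe_blocks)]

/// Safe wrapper: the invariant is checked once, here, so callers never
/// have to think about it.
pub fn first_byte(bytes: &[u8]) -> Option<u8> {
    if bytes.is_empty() {
        return None;
    }
    // SAFETY: we just checked that `bytes` is non-empty, so index 0 is in bounds.
    Some(unsafe { *bytes.get_unchecked(0) })
}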
> Want to keep unsafe code out of your codebase?
And how is this feasible for a systems language? Rust becomes too impotent for its main use case if you only use safe rust.
My original point still stands... Coders historically cannot be trusted to manually manage memory, unless they're rust coders apparently
> So you don't need to "trust" that coders will remember not to call unsafe functions needlessly, because the tooling is there to have your back.
By definition, it isn't possible for a tool to reason about unsafe code, otherwise the Rust compiler would do it.
> And how is this feasible for a systems language? Rust becomes too impotent for its main use case if you only use safe rust.
No, this is completely incorrect, and one of the most interesting and surprising results of Rust as an experiment in language design. An enormous proportion of Rust codebases need not have any unsafe code of their own whatsoever, and even those that do tend to have unsafe blocks in an extreme minority of files. Rust's hypothesis that unsafe code can be successfully encapsulated behind safe APIs suitable for the vast majority of uses has been experimentally proven in practice. Ironically, the average unsafe block in practice is a result of needing to call a function written in C, which is a symptom of not yet having enough alternatives written in Rust. I have worked on both freestanding OSes and embedded applications written in Rust--both domains where you would expect copious usage of unsafe--where I estimate less than 5% of the files actually contained unsafe blocks, meaning a 20x reduction in the effort needed to verify them (in Fred Brooks units, that's two silver bullets worth).
> Coders historically cannot be trusted to manually manage memory, unless they're rust coders apparently
Most Rust coders are not manually managing memory on the regular, or doing anything else that requires unsafe code. I'm not exaggerating when I say that it's entirely possible to have spent your entire career writing Rust code without ever having been forced to write an `unsafe` block, in the same way that Java programmers can go their entire career without using JNI.
> By definition, it isn't possible for a tool to reason about unsafe code, otherwise the rust compiler would do it
Of course it is. The Rust compiler reasons about unsafe code all the time. What it can't do is definitively prove many properties of unsafe code, which is why the compiler conservatively requires the annotation. But there are dozens of built-in warnings and Clippy lints that analyze unsafe blocks and attempt to flag issues early. In addition, Miri provides an interpreter in which to run unsafe code, providing dynamic rather than static analysis.
> No, this is completely incorrect,
Show me systems-level Rust code that only uses safe then... You can't, because it's impossible. It doesn't matter that it's a minority of files (!); the simple fact is you can't program systems without using unsafe. Rewrite the C dependencies in Rust and the amount of unsafe code increases massively.
> Most Rust coders are not manually managing memory on the regular
Another sidestep. If coders in general cannot be trusted to manage memory, why can a rust coder be trusted all of a sudden?
> . But there are dozens of built-in warnings and Clippy lints that analyze unsafe blocks and attempts to flag issues early.
We already had that, it wasn't enough, hence..... rust, remember?
You are missing the forest for the trees here. The goal of Rust's `unsafe` isn't to prevent you from writing unsafe code. It's to prevent you from writing unsafe code by accident. That was always the goal. If you reread the comments through that lens I'm sure they'll make more sense.
I think you’re deliberately being obtuse here, and if you don’t see why, you should probably reflect on your reasoning.
I've been using Rust for about 12 years now, and the only times I've had to reach for `unsafe` were to do FFI stuff. That's it. Maybe others might have more unsafe code, and for good reasons, but from my perspective, I don't know wtf you're talking about.
> Maybe others might have more unsafe code and for good reasons, but from my perspective, I don’t know wtf you’re talking about
"well I don't need to use unsafe that much so I don't know what your point is" sounds like you don't really have an answer.
The issue with C is that every single use of a pointer needs to come with safety invariants (at its most basic: when you pass a pointer to my function, do I take ownership of your pointer or not?). You cannot legitimately expect people to be that alert 100% of the time.
Inversely, you can write whole applications in rust without ever touching `unsafe` directly, so that keyword by itself signals the need for attention (both to the programmer and the reviewer or auditor). An unsafe block without a safety comment next to it is a very easy red flag to catch.
>when you a pass a pointer to my function, do I take ownership of your pointer or not?
It's honestly frustrating how prevalent this is in C. The docs often don't even tell you, and if you guess that it takes ownership and hand it a copy, and you were wrong, you've just leaked memory; if you guess the other way, you now risk a double free, a use-after-free, or having the value mutated behind your back.
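By contrast, a minimal Rust sketch showing how the signature itself answers the ownership question (types and names made up):

struct Config;

// Takes ownership: the value is dropped (freed) when this function returns.
fn consume(_cfg: Box<Config>) {}

// Borrows: the caller keeps ownership; this function can only look.
fn inspect(_cfg: &Config) {}

fn main() {
    let cfg = Box::new(Config);
    inspect(&cfg); // fine: ownership stays with `cfg`
    consume(cfg);  // moves `cfg` out
    // Using `cfg` here would be a compile error, not a use-after-free.
}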
Rust has never been about outright eliminating unsafe code, it's about encapsulating that unsafe code within a safe externally usable API.
When creating a dynamic sized array type, it's much simpler to reason about its invariants when you assume only its public methods have access to its size and length fields, rather than trust the user to remember to update those fields themselves.
The above is an analogy which is obviously fixed by using opaque accessor functions, but Rust takes it further by encapsulating raw pointer usage itself.
The whole ethos of unsafe Rust is that you encapsulate usages of things like raw pointers and mutable static variables in smaller, more easily verifiable modules rather than having everyone deal with them directly.
The specific use case the GNU maintainer listed followed this exact pattern.
Probably a dumb question, but is GNU Coreutils interested in / planning on doing its own Rust rewrite?
The rewrite in Rust is mostly vanity and marketing but not based on a real technical need...
So I don't see why they would want to do that.
Canonical's usage of uutils is likely for marketing. But the codebase itself was developed for fun, as an excuse for people to have a hands-on way to learn Rust back before Rust was even released, with a minor justification as being cross-platform. From the original README in 2013:
Why?
----
Many GNU, linux and other utils are pretty awesome, and obviously some effort has been spent in the past to port them to windows. However those projects are either old, abandonned, hosted on CVS, written in platform-specific C, etc.
Rust provides a good platform-agnostic way of writing systems utils that are easy to compile anywhere, and this is as good a way as any to try and learn it.
https://github.com/uutils/coreutils/blob/9653ed81a2fbf393f42...
>Canonical's usage of uutils is likely for marketing
Currently their usage is actively worsening the security of their distro
These things were caught and basically all of them weren't covered by any test suite (not even GNU coreutils'). It's a bit bold to claim that it's actively worsening it when it's not an LTS.
That's generally what you call introducing new semantic bugs.
> It's a bit bold to claim that it's actively worsening it when it's not an LTS.
It is LTS now. And non-LTS releases are still releases.
Welcome to building something new.
New things can be made optional and tested outside production, and should not be rolled out in an LTS edition.
Isn't this how Kernighan and the late Ritchie (K&R) ended up with Unix and C?
Honestly, brilliant guys.
When C got its own standards committee they even rejected Ritchie's proposal to add fat pointers to C before it was too late to add them. Instead, we got the C abstract machine.
I thought it was a learning exercise, and maybe some corporations also like it because it has more permissive licensing.
Thomas Jefferson famously said that "A coreutils rewrite every now and again is a good thing". Or something like that.
When I was a beta tester for System Vr2 Unix, I collected as many bug reports as possible from Usenet (I used the name "the shell answer man". Looking back I conclude that arrogance is generally inversely proportional to age) and sent a patch for each one I could verify. Something like 100 patches.
So if this rust rewrite cleans up some issues, it's a good thing.
At the current moment I would be against it. The language and library are changing too fast. Also, Rust has some other things that make it hard to use for coreutils. For example, Rust programs always call signal(SIGPIPE, SIG_IGN) or equivalent code before main(). There is no stable way to get the longstanding behavior of inheriting the signal action from the parent process [1]. This is quite annoying, but not unique to Rust [2].
[1] https://doc.rust-lang.org/beta/unstable-book/compiler-flags/... [2] https://www.pixelbeat.org/programming/sigpipe_handling.html
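For context, the common workaround today is to reset the disposition at the top of main() via the libc crate; note that this installs SIG_DFL unconditionally rather than inheriting the parent's action, which is exactly the limitation described above:

fn main() {
    // Undo the runtime's signal(SIGPIPE, SIG_IGN) before doing any IO.
    // This sets SIG_DFL unconditionally; it cannot recover whatever
    // disposition the parent process actually had.
    unsafe {
        libc::signal(libc::SIGPIPE, libc::SIG_DFL);
    }
    // ... rest of the program
}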
I think the concern is that the writing may be on the wall for (the current memory-unsafe version of) Coreutils. Despite the bugs and incompatibilities, Canonical seems to have decided that the memory safety of uutils is worth it. And those two downsides, the bugs and incompatibilities, will likely attenuate quickly, compelling the other distros to follow suit in adopting uutils before long.
So the continued popularity of Coreutils might, I think, depend on Coreutils' near-term publicly announced and actual memory safety strategy. As I suggested in my other comment, there are (somewhat nascent) options for memory safety that do not require a rewrite of the code base. (For linux x86_64 platforms, depending on your requirements, that might include the "fanatically compatible" Fil-C.) And given the high profile of Coreutils, there are likely people willing to work with the Coreutils team to help in the deployment of those memory safety options.
I see even the coreutils maintainers find themselves needing -n (no newlines) and -c (count) options to "yes".
GNU coreutils is known for adding command line options.
One of the big philosophical differences to the BSDs.
For a human being, it sucks both ways.
I don't know if you're aware, but there is a demonstration of wget (a fellow "gnu utility", right?) being auto-translated to a memory-safe subset of C++ [1]. Because the translation essentially does a one-for-one substitution of potentially unsafe C elements with safe C++ counterparts that mirror the behavior, the translation should be much less susceptible to the introduction of new bugs and behaviors in the way a rewrite would be.
With a little cleaning-up of the original code, the code translation ends up being fully automatic and so can be used as a build step to produce (slightly slower) memory-safe executables from the original C source.
[1] https://duneroadrunner.github.io/scpp_articles/PoC_autotrans...
Filesystem access is mostly treated by users as serialized ACID transactions on "files in directories."
"Managing this resource centrally" is where unix syscalls came from. An OS kernel can be used like a specialized library for ACID transactions on hardware singletons.
People then got fancy with virtual memory, interrupts, signals, time-slicing, re-entrancy, thread-safety, and injectivity.
It doesn't matter whether you call the "kernel library" from C, C++, Fortran, BASIC, Golang, bash, Rust, etc.
>the article says "The Rust rewrite has shipped zero of these [memory saftey bugs], over a comparable window of activity." However, this is not true
That bug got fixed before the Ubuntu release, and is from way before Canonical was even involved with the project.
The list of GNU CVEs in the original article included a buffer overrun in tail from 2021, so for a fair comparison 2021 is part of the "window of activity" (the year the uu_od CVE was published).
Indeed, std::fs suffers from being a lowest common denominator. Rust had to have something at 1.0, and unfortunately it stayed like that.
Rust uutils would be a good place to design a more foolproof replacement for Rust's std::fs API.
Unix embodies this, as well.
When K&R created Unix and C, there was still the better option of moving changes that belonged in the "kernel" into the kernel.
Now we have "standards" that even cause headaches between Linux and the BSDs.
Linux back-propagates stuff like mmap, io_uring, etc. to where it belongs. In this way it is like the original Unix. And it deservedly runs on most servers out there.
> What’s notable is that all of these bugs landed in a production Rust codebase, written by people who knew what they were doing
They knew how to write Rust, but clearly weren't sufficiently experienced with Unix APIs, semantics, and pitfalls. Most of those mistakes are exceedingly amateur from the perspective of long-time GNU coreutils (or BSD or Solaris base) developers, issues that were identified and largely hashed out decades ago, notwithstanding the continued long tail of fixes--mostly just a trickle these days--to the old codebases.
More than that: it seems that Rust stdlib nudges the developer towards using neat APIs at an incorrect level of abstraction, like path-based instead of handle-based file operations. I hope I'm wrong.
Nearly every available filesystem API in Rust's stdlib maps one-to-one with a Unix syscall (see Rust's std::fs module [0] for reference -- for example, the `File` struct is just a wrapper around a file descriptor, and its associated methods are essentially just the syscalls you can perform on file descriptors). The only exceptions are a few helper functions like `read_to_string` or `create_dir_all` that perform slightly higher-level operations.
And, yeah, the Unix syscalls are very prone to mistakes like this. For example, Unix's `rename` syscall takes two paths as arguments; you can't rename a file by handle; and so Rust has a `rename` function that takes two paths rather than an associated function on a `File`. Rust exposes path-based APIs where Unix exposes path-based APIs, and file-handle-based APIs where Unix exposes file-handle-based APIs.
So I agree that Rust's stdilb is somewhat mistake prone; not so much because it's being opinionated and "nudg[ing] the developer towards using neat APIs", but because it's so low-level that it's not offering much "safety" in filesystem access over raw syscalls beyond ensuring that you didn't write a buffer overflow.
[0]: https://doc.rust-lang.org/std/fs/index.html
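To make that one-to-one mapping concrete, a small sketch (file names made up; the syscall names in the comments assume Unix):

use std::fs::{self, File};

fn main() -> std::io::Result<()> {
    let f = File::open("data.txt")?;     // open(2)
    let _len = f.metadata()?.len();      // fstat(2) on the handle
    fs::rename("data.txt", "data.old")?; // rename(2): path-based only,
                                         // there is no handle-based variant
    Ok(())
}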
> For example, Unix's `rename` syscall takes two paths as arguments; you can't rename a file by handle
And then there’s renameat(2) which takes two dirfd… and two paths from there, which mostly has all the same issues rename(2) does (and does not even take flags so even O_NOFOLLOW is not available).
I’m not sure what you’d need to make a safe renameat(), maybe a triplet of (dirfd, filefd, name[1]) from the source, (dirfd, name) from the target, and some sort of flag to indicate whether it is allowed to create, overwrite, or both.
As the recent https://blog.sebastianwick.net/posts/how-hard-is-it-to-open-... talks about (just for file but it applies to everything) secure file system interaction is absolutely heinous.
[1]: not path
How about fd of the file you wanna rename, dirfd of the directory you want to open it in, and name of the new file? You could then represent a "rename within the same directory" as: dfd = opendir(...); fd = openat(dfd, "a"); rename2(fd, dfd, "b");
I can't think of a case this API doesn't cover, but maybe there is one.
The file may have been renamed or deleted since the fd was opened, and it might have been legitimate and on purpose, but there’s no way to tell what trying to resolve the fd back to a path will give you.
And you need to do that because nothing precludes having multiple entries to the same inode in the same directory, so you need to know specifically what the source direntry is, and a direntry is just a name in the directory file.
> So I agree that Rust's stdilb is somewhat mistake prone; not so much because it's being opinionated and "nudg[ing] the developer towards using neat APIs", but because it's so low-level that it's not offering much "safety" in filesystem access over raw syscalls beyond ensuring that you didn't write a buffer overflow.
`openat()` and the other `*at()` syscalls are also raw syscalls, which Rust's stdlib chose not to expose. While I can understand that this may not be straight forward for a cross-platform API, I have to disagree with your statement that Rust's stdlib is mistake prone because it's so low-level. It's more mistake prone than POSIX (in some aspects) because it is missing a whole family of low-level syscalls.
They're not missing, Rust just ships them (including openat) as part of the first-party libc crate rather than exposing them directly from libstd. You'll find all the other libc syscalls there as well: https://docs.rs/libc/0.2.186/libc/ . I agree that Rust's stdlib could use some higher-level helper functions to help head off TOCTOU, but it's not as simple as just exposing `openat`, which, in addition to being platform-specific as you say, is also error-prone in its own right.
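For example, a minimal (and deliberately unpolished) sketch of calling openat(2) through the libc crate, Unix-only, with a made-up helper name:

use std::ffi::CString;
use std::io;

// Open `name` relative to an already-open directory file descriptor.
fn open_relative(dirfd: libc::c_int, name: &str) -> io::Result<libc::c_int> {
    let c_name = CString::new(name)
        .map_err(|_| io::Error::from(io::ErrorKind::InvalidInput))?;
    // SAFETY: c_name is a valid NUL-terminated string, and we assume
    // dirfd is an open directory file descriptor.
    let fd = unsafe { libc::openat(dirfd, c_name.as_ptr(), libc::O_RDONLY) };
    if fd < 0 {
        Err(io::Error::last_os_error())
    } else {
        Ok(fd)
    }
}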
But those are all unsafe, taking raw strings.
Why can I easily use "*at" functions from Python's stdlib, but not Rust's?
They are much safer against path traversal and symlink attacks.
Working safely with files should not require *const c_char.
This should be fixed.
> But those are all unsafe, taking raw strings.
The parent was asking for access to the C syscall, and C syscalls are unsafe, including in C. You can wrap that syscall in a safe interface if you like, and many have. And to reiterate, I'm all for supporting this pattern in Rust's stdlib itself. But openat itself is a questionable API (I have not yet seen anyone mention that openat2 exists), and if Rust wanted to provide this, it would want to design something distinct.
> Why can I easily use "*at" functions from Python's stdlib, but not Rust's?
I'm not sure you can. The supported pattern appears to involve passing the optional `opener` parameter to `os.open`, but while the example of this shown in the official documentation works on Linux, I just tried it on Windows and it throws a PermissionError exception because AFAIK you can't open directories on Windows.
> AFAIK you can't open directories on Windows.
You can but you have to go through the lower level API: NtCreateFile can open a directory, and you can pass in a RootDirectory handle to following calls to make them handle-relative.
You can open directories using high level win32 APIs. What you need NtCreateFile for is opening files relative to an open directory.
I took parent's message to be asking why the standard library fs primitives don't use `at` functions under the hood, not that they wanted the `at` functions directly exposed.
> which Rust's stdlib chose not to expose
i.e. expose through things like `File::open()`.
> why the standard library fs primitives don't use `at` functions under the hood
In this case it wouldn't seem to make sense to use `at` functions to back the standard file opening interface that Rust presents, because it requires different parameters, so a different API would need to be designed. Someone above mentioned that such an API is being considered for inclusion in libstd in this issue: https://github.com/rust-lang/rust/issues/120426
The nix crate provides the safe wrappers. https://docs.rs/nix/latest/nix/fcntl/fn.openat2.html
The correct comparison is to rustix, not libc, and rustix is not first-party. And even then the rustix API does not encapsulate the operations into structs the same way std::fs and std::io do.
The correct comparison to someone asking for first-party access to a C syscall is to the first-party crate that provides direct bindings to C syscalls. If you're willing to go further afield to third-party crates, you might as well skip rustix's "POSIX-ish" APIs (to quote their documentation) and go directly to the openat crate, which provides a Rust-style API.
If I have to use unsafe just to open a file, I might as well use C. While rustix is a happy middle that is usually enough and more popular than the openat crate, libc is in the same family as the "*-sys" crates and, generally speaking, is not intended for direct use outside other FFI crates.
I agree it’d be nice if there were a safe stdlib openat API, but
> If I have to use unsafe just to open a file, I might as well use C.
is a ridiculous exaggeration.
I agree it is an exaggeration in that of course you could write a wrapper. The point was that if everyone had to write their own FFI wrappers, Rust wouldn't go far and openat is not an exception.
There is code available at the right level of abstraction (the rustix or openat crates), and while it's not managed by the Rust team, uutils already have many third party dependencies. Bringing up libc just because it's first party, instead, is comparing apple to oranges.
openat() is there, but it's unstable (because the dirfd-related syscalls are not all fully implemented and tested across all platforms Rust supports yet): https://doc.rust-lang.org/std/fs/struct.Dir.html#method.open...
There are lots of unstable things in Rust that have been unstable for many years, and the intentional segregating of unstable means that it's a nonstarter for most use cases, like libraries. It's unstable because there's significant enough issues that nobody wants to mark it as stable, no matter what those issues are.
As long as it's unstable it's totally fair to say Rust's stdlib does not expose them. You might as well say it's fixed because someone posted a patch on a mailing list somewhere.
There are lots of unstable things in Rust that have been unstable for many years, but this isn't one of them. openat() was added in September, and the next PR in the series implementing unlinkat() and removeat() received a code review three weeks ago and is currently waiting on the author for minor revisions.
> As long as it's unstable it's totally fair to say Rust's stdlib does not expose them. You might as well say it's fixed because someone posted a patch on a mailing list somewhere
Agreed. My comment was intended to be read as "it's planned and being worked on", not "it's available".
After reading this article, I'm inclined to think that the right thing for this project to do is write their own library that wraps the Rust stdlib with a file-handle-based API along with one method to get a file handle from a Path; rewrite the code to use that library rather than rust stdlib methods, and then add a lint check that guards against any use of the Rust standard library file methods anywhere outside of that wrapper.
If that's the right approach, then it would be useful to make that library public as a crate, because writing such hardened code is generally useful. Possibly as a step before inclusion in the rust stdlib itself.
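A minimal sketch of what such a wrapper might look like, with made-up names: a single path-to-handle entry point, and everything else operating on the handle.

use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

pub struct Handle(File);

impl Handle {
    // The single place in the codebase where a path is resolved.
    pub fn open(path: &Path) -> io::Result<Handle> {
        Ok(Handle(File::open(path)?))
    }

    // All further operations go through the already-resolved handle,
    // so the file they touch cannot be swapped out from under us.
    pub fn read_to_string(&mut self) -> io::Result<String> {
        let mut buf = String::new();
        self.0.read_to_string(&mut buf)?;
        Ok(buf)
    }
}

The lint check mentioned above would then flag any direct use of std::fs outside this module.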
Agreed. (This approach feels like a cousin of Parse, Don't Validate.)
Yeah. The idea is: if you're consistently making mistakes because of the most convenient API at your disposal (here, the Rust standard library file/directory APIs that are based around Paths), then after you fix the actual bugs you should write a better abstraction, and then deliberately add friction around bypassing it, to constrain future developers (including future-you) from using the more error-prone abstraction.
Parse, don't validate is also a principle that encourages people to use a less-error-prone abstraction (the parsed data structure or an error representing invalid input), rather than a more-error-prone one (the original untyped data with ad-hoc validations at various call sites).
If anything, I find the Rust standard library defaults to Unix too much for a generic programming language. You need to think very Unixy if you want to program Rust on Windows, unless you're directly importing the windows crate and forgoing the Rust standard library. If you're writing COBOL-style mainframe programs, things become even more forced, though I suspect the overlap between Rust programmers and mainframe programmers who don't use a Unix-like is vanishingly small.
This can also be a pain on microcontrollers sometimes, but there you're free to pretend you're on Unix if you want to.
That's the same for the C or Python standard libraries. The difference is that in C you tend to use the Win32 functions more because they're easily reached for; but Python and Rust are both just as Unixy.
Indeed, though for C it makes sense given its origins, and Python sort of grew from a fun project into a massive ecosystem by accident.
If you want to support file I/O in the standard library, you have to choose _some_ API, and it either is limited to the features common to all platforms, or it covers all features but calls that cannot be supported return errors, or you pick a preferred platform and require all other platforms to mimic it as best they can.
Almost all languages/standard libraries pick the latter, and many choose UNIX or Linux as the preferred platform, even though its file system API has flaws we’ve known about for decades (example: using file paths too often) or made decisions back in 1970 we probably wouldn’t make today (examples: making file names sequences of bytes; not having a way to encode file types and, because of that, using heuristics to figure out file types. See https://man7.org/linux/man-pages/man1/file.1.html)
You have to choose something, and I'm glad they didn't go with the idiotic Go approach ("every path is a valid UTF-8 string, or we just garble the path at the standard library level"). You can usually abstract away platform weirdness at the implementation level, but programming in non-Unix environments is more like programming against Cygwin.
A standard library for files and paths that lacks things like ACLs and locks is weirdly Unixy for a supposedly modern language. Most systems support ACLs now, though Windows uses them a lot more. On the other hand, the lack of file descriptors/handles is weird from all points of view.
Had Windows been an uncommon target, I would've understood this design, but Windows is still the most common PC operating system in the world by a great margin. Not even considering things like "multiple filesystem roots" (drive letters) "that happen to not exist on Linux", or "case-insensitive paths (Windows/macOS/some Linux systems)" is a mistake for a supposedly generic language, in my opinion.
As far as I can tell from Microsoft's documentation, WinAPI access for ACLs was added in Windows 10, which Rust 1.0 predates. And std::fs attempts to provide both minimalist and cross-platform APIs, which in practice means (for better or worse) it's the lowest common denominator between Windows and Unix, with the objective being that higher-level libraries can leverage it as a building block. From the documentation for std::fs:
"This module contains basic methods to manipulate the contents of the local filesystem. All methods in this module represent cross-platform filesystem operations. Extra platform-specific functionality can be found in the extension traits of std::os::$platform."
Following its recommendation, if we look at std::os::windows::fs we see an extension trait for setting Windows-specific flags for WinAPI-specific flags, like dwDesiredAccess, dwShareMode, dwFlagsAndAttributes. I'm not a Windows dev but AFAICT we want an API to set lpSecurityAttributes. I don't see an option for that in std::os::windows::fs, likely complicated by the fact that it's a pointer, so acquiring a valid value for that parameter is more involved than just constructing a bitfield like for the aforementioned parameters. But if you think this should be simple, then please propose adding it to std::os::windows::fs; the Rust stdlib adds new APIs all the time in response to demand. (In the meantime, comprehensive Windows support is generally provided by the de-facto standard winapi crate, which provides access to the raw syscall).
> WinAPI access for ACLs was added in Windows 10
I'm not sure which docs you mean but that's not true. The NT kernel has used ACLs long before rust was invented. But it's indeed true that rust adds platform-specific methods based on demand. The trouble with ACLs is it means either creating a large API surface in the standard library to handle them or else presenting a simple interface but having to manage raw pointers (likely using a wrapper type but even then it can't be made totally safe).
> the de-facto standard winapi crate, which provides access to the raw syscall
Since the official Microsoft `windows-sys` crate was released many years ago, the winapi crate has been effectively unmaintained (it accepts security patches but that's it).
As far as I can tell even NFS got ACL support before the first Rust release. NFSv4.1 in 2010 vs Rust in 2012.
> I'm not sure which docs you mean
I was looking at these: https://learn.microsoft.com/en-us/windows/security/identity-...
> the winapi crate has been effectively unmaintained
Shows how much of a Windows dev I am. :P
You'd want to be looking at these[1] instead, especially SetFileSecurity[2].
As noted, the "minimum supported" version means exactly that, and does not reflect when the API function was introduced.
[1]: https://learn.microsoft.com/en-us/windows/win32/secauthz/low...
[2]: https://learn.microsoft.com/en-us/windows/win32/api/winbase/...
You misunderstand the documentation. Microsoft doesn't provide online documentation for versions of Windows that are no longer supported. Functions like SetFileSecurity have existed since Windows NT 3.1 back in 1993.
But the documentation I'm using claims that it applies to Windows 10, which stopped being supported last year.
Windows 10 support is still available to people who pay for it.
SetFileSecurityA is listed as Windows XP+ (https://learn.microsoft.com/en-us/windows/win32/api/winbase/...) but Microsoft has deprecated all pre-XP documentation.
According to https://www.geoffchappell.com/studies/windows/win32/advapi32..., the function was available first in advapi32 version 3.10, which was included in Windows NT 3.10 (14th July 1993): https://www.geoffchappell.com/studies/windows/win32/advapi32...
lpSecurityAttributes just refers to a SecurityAttributes struct (Rust bindings here: https://microsoft.github.io/windows-docs-rs/doc/windows/Win3...) Annoying pointers for sure, but nothing a Rust API can't work around with standard language features.
And sure, Rust could add the entire windows crate to the standard library, but my point is that this isn't just Windows functionality: getfacl/setfacl has been with us for decades but I don't know any standard library that tries to include any kind of ACLs.
> I'm glad they didn't go with the idiotic Go approach ("every path is a valid UTF-8 string" or we just garble the path at the standard library level")
Can you expound a bit on this? I haven't been able to find any articles related to this kind of problem. It's also a bit surprising, given that Go specifically did not make the same choice as Rust to make strings be Unicode / UTF-8 (Go strings are just arrays of bytes, with one minor exception related to iteration using the range syntax).
Go's docs put it like this: Path names are UTF-8-encoded, unrooted, slash-separated sequences of path elements, like “x/y/z”. If you operate on a path that's a non-UTF-8 string, then Go will do... something to make the string work with UTF-8 when passed back to standard file methods, but it likely won't end up operating on the same file.
Rust has OsStr to represent strings like paths, with a lossy/fallible conversion step instead.
Go's approach is fine for 99% of cases, and you're pretty screwed if your application falls for the 1% issue. Go has a lot of those decisions, often to simplify the standard library for most use cases most people usually run into (like their awful, lossy, incomplete conversion between Unix and Windows when it comes to permissions/read-only flags/etc.).
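For comparison, a small sketch of how Rust's OsStr carries a non-UTF-8 path without garbling it (Unix-only, since it uses the raw-bytes extension trait):

use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt;

fn main() {
    let raw = OsStr::from_bytes(b"caf\xff"); // not valid UTF-8
    assert!(raw.to_str().is_none());         // strict conversion is fallible...
    let lossy = raw.to_string_lossy();       // ...lossy conversion substitutes U+FFFD
    assert_eq!(lossy, "caf\u{fffd}");
    // `raw` itself is untouched and can still be passed to std::fs verbatim.
}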
> Path names are UTF-8-encoded, unrooted, slash-separated sequences of path elements, like “x/y/z”
This is only for the "io/fs" package and its generic filesystem abstractions. The "os" package, which always operates on the real filesystem, doesn't actually specify how paths are encoded, nor does its associated helper package "path/filepath".
In practice, non-UTF-8 already wasn't an issue on Unix-like systems, where file paths are natively just byte sequences. You do need to be aware of this possibility to avoid mangling the paths yourself, though. The real problem was Windows, where paths are actually WTF-16, i.e. UTF-16 with unpaired surrogates. Go has addressed this issue by accepting WTF-8 paths since Go 1.21: https://github.com/golang/go/issues/32334#issuecomment-15500...
> Go strings are just arrays of bytes,
https://go.dev/ref/spec#String_types: “A string value is a (possibly empty) sequence of bytes”
https://pkg.go.dev/strings@go1.26.2: “Package strings implements simple functions to manipulate UTF-8 encoded strings.”
So, yes, Go strings are just arrays of bytes in the language, but in the standard library, they’re supposed to be UTF-8 (the documentation isn’t immediately clear on how it handles non-UTF-8 strings).
I think this may be why the OP thinks the Go approach is “every path is a valid UTF-8 string”
That's the norm in most languages; it's just a more convenient way to operate.
Unfortunately, it's not just the Rust stdlib, it's nearly every stdlib, if not every one. I remember being disappointed when Go came out that it didn't base the os module on openat and friends, and that was how many years ago now? I wasn't really surprised; the *at functions aren't what people expect, and probably people would have been screaming about "how weird" the file APIs were in this hypothetical Go continually up to this very day... but it's still the right thing to do. Almost every language makes it very hard to do the right thing, with the wrong thing so readily available.
I'm hedging on the "almost" only because there are so many languages made by so many developers and if you're building a language in the 2020s it is probably because you've got some sort of strong opinion, so maybe there's one out there that defaults to *at-style file handling in the standard library because some language developer has the strong opinions about this I do. But I don't know of one.
Openat appeared in Linux in 2006 but not in FreeBSD until 2009; Go started being developed in 2007. It probably missed the opportunity by a year. It would have been the right thing to change the os module at some point in the last 18 years, however.
https://pkg.go.dev/os#Root
Someone once coined a related term, "disassembler rage". It's the idea that every mistake looks amateur when examined closely enough. It comes from people sitting in a disassembler and raging at the high-level programmers who had the gall to e.g. use conditionals instead of a switch statement inside a function call a hundred frames deep.
We're looking solely at the few things they got wrong, and not the thousands of correct lines around them.
When I read the article I came away with the impression that shipping bugs this severe in a rewrite of utils used by hundreds of millions of people daily (hourly?) isn’t ok. I don’t think brushing the bad parts off with “most of the code was really good!” is a fair way to look at this.
Cloudflare crashed a chunk of the internet with a rust app a month or so ago, deploying a bad config file iirc.
Rust isn’t a panacea, it’s a programming language. It’s ok that it’s flawed, all languages are.
I think that legitimate real-world issues in Rust code should be talked about more often. Right now the language enjoys a reputation that is essentially misleading marketing. It isn't possible to create a programming language that doesn't allow bugs to happen (even with formal verification you can still prove correctness based on a wrong set of assumptions). This weird, kind of religious belief that Rust leads to magically, completely bug-free programs needs to be countered and brought in touch with reality IMO.
Is it possible you’ve misunderstood what Rust promises?
> It isn't possible to create a programing language that doesn't allow bugs to happen
Yes, that’s true. No one doubts this. Except you seem to think that Rust promises no bugs at all? I don’t know where you got this impression from, but it is incorrect.
Rust promises that certain kinds of bugs like use-after-free are much, much less likely. It eliminates some kinds of bugs, not all bugs altogether. It’s possible that you’ve read the claim on kinds of bugs, and misinterpreted it as all bugs.
I’ve had this conversation before, and it usually ends like https://www.smbc-comics.com/comic/aaaah
"Rust" obviously does not promise that.
On the other hand, there are too many less-experienced Rust fans who do claim that "Rust" promises this and that any project that does not use Rust is doomed and that any of the existing decades-old software projects should be rewritten in Rust to decrease the chances that they may have bugs.
What is described in TFA is not surprising at all, because it is exactly what has been predicted about this and other similar projects.
Anyone who desires to rewrite in Rust any old project, should certainly do it. It will be at least a good learning experience and whenever an ancient project is rewritten from scratch, the current knowledge should enable the creation of something better than the original.
Nonetheless, the rewriters should never claim that what they have just produced has currently less bugs than the original, because neither they nor Rust can guarantee this, but only a long experience with using the rewritten application.
Such rewritten software packages should remain for years as optional alternatives to the originals. Any aggressive push to substitute the originals immediately is just stupid (and yes, I have seen people trying to promote this).
Moreover, someone who proposes the substitution of something as basic as coreutils, must first present to the world the results of a huge set of correctness tests and performance benchmarks comparing the old package with the new package, before the substitution idea is even put forward.
Where are these rust fans? Are they in the room with us right now?
You’ve constructed a strawman with no basis in reality.
You know what actual Rust fans sound like? They sound like Matthias Endler, who wrote the article we're discussing. Matthias hosts a popular podcast, Rust in Production, where he talks with people about sharp edges and difficulties they experienced using Rust.
A true Rust advocate like him writes articles titled “Bugs Rust Won’t Catch”.
> Such rewritten software packages should remain for years as optional alternatives to the originals.
This project was started a decade ago. (https://news.ycombinator.com/item?id=7882211)
> must first present to the world the results of a huge set of correctness tests and performance benchmarks
Yeah, you can see those in https://github.com/uutils/coreutils. This project has also worked with GNU coreutils maintainers to add more tests over time. Check out the graph where the total number of tests increases over time.
> before the substitution idea is even put forward
I partly agree. But notice that these CVEs come from a thorough security audit paid for by Canonical. Canonical is paying for it because they have a plan to substitute in the immediate future.
Without a plan to substitute it’s hard to advocate for funding. Without funding it’s hard to find and fix these issues. With these issues unfixed it’s hard to plan to substitute.
Chicken and egg problem.
> less bugs
Fewer.
Those Rust fans exist on almost all Internet forums that I have seen, including on HN.
I do not care about what they say, so I have not made a list of links to what they have posted. But even on HN alone, I have certainly seen well over a hundred such postings, more likely several hundred, even on threads that had no close relationship with Rust, so there was no reason to discuss Rust at all.
Since Sun's shameless promotion of Java with false claims, during the last years of the previous century, there has not been any other programming language subjected to such a hype campaign.
I think that this is sad. Rust has introduced a few valid innovations and it is a decent programming language. Despite this, whenever someone starts mentioning Rust, my first reaction is to distrust whatever is said, until proven otherwise, because I have seen far too many ridiculous claims about Rust.
Could you find one such person on this thread? Someone making ridiculous claims about what Rust offers.
I’ll tell you what I think you’ve seen - there are hundreds of threads where you’ve seen people claim they’ve seen this everywhere. That gives you the impression that it is universal.
This one probably covers it:
https://news.ycombinator.com/item?id=45921143
Perfect. Because that’s exactly what I’m saying.
The comment you linked says something specific about a specific kind of bug being eliminated - memory safety bugs. And they’re not making a claim, they’re repeating the evidence gathered from the Android codebase. So that’s a fact, memory safety bugs truly did not appear in the Rust parts of Android.
The comment you linked is not claiming Rust code is bug-free. That’s a strawman I’ve seen many, many times. Haters will claim that this happens all the time, but all I see are examples of the haters claiming this. You had to go back 5 months and still couldn’t find anything similar to the strawman.
> This one probably covers it
No, probably not.
The only language I've ever seen users make that claim for is Haskell. Rust users have never made the claim, but I've seen it a lot from advocates who appear to find "hello world" a complex, hard-to-write program.
> On the other hand, there are too many less-experienced Rust fans who do claim that "Rust" promises this
Link some comments like this? Because I've been reading Rust discussions for years and never seen them.
I understand the (narrow) hard guarantees that Rust gives. But there are people in the wider community who think that the guarantees are much, much broader. This is a pretty widespread misconception that should be rectified.
Who are these people? Care to share examples?
Because all I see are examples of people claiming it happens all the time. Not the examples of it actually happening.
Nobody believes Rust programs are bug free, though. Rust never promised that. It doesn't even promise memory safety, it only promises memory safety if you restrict yourself to safe APIs which simply isn't always possible.
> it only promises memory safety if you restrict yourself to safe APIs which simply isn't always possible.
Less than that actually, considering Rust has its own definition of what "safe" means.
Ah, the Dwarf Fortress approach :)
https://dwarffortresswiki.org/DF2014:Fun&redirect=no
The NSA believe it's a memory safe language.
Or... the NSA wants you to think the NSA believes that rust is a memory safe language.
Or... the NSA wants you to think that the NSA wants you to think that the NSA believes that Rust is a memory-safe language, so that everyone who distrusts the NSA keeps using C.
I have never seen a comment claiming that Rust leads to magically completely bug free programs.
Could you please link one? Because I doubt it exists, or if it does, it is probably on some obscure website or downvoted to oblivion.
On the other hand, I see comments in every Rust thread that are basically restatements of yours attacking a strawman.
The reality: Rust does not prevent all bugs. In fact, it doesn't even prevent any bugs. What it actually does is make a certain particularly common and dangerous class of bugs much more difficult to write.
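Concretely, the "dangerous class" means things like out-of-bounds access, which safe Rust turns into a handled case or a deterministic panic instead of silent memory corruption. A minimal sketch:

    fn main() {
        let v = vec![1, 2, 3];
        let i = 7;
        // v[i] would panic deterministically; v.get(i) turns the
        // out-of-bounds case into a value you must handle.
        match v.get(i) {
            Some(x) => println!("v[{i}] = {x}"),
            None => println!("index {i} is out of bounds; no overflow, no UB"),
        }
    }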
I find it hilarious that this comment is being downvoted.
Exactly what is the controversial take here?
> I don’t think brushing the bad parts off with “most of the code was really good!” is a fair way to look at this.
Nope, this is fine.
> Cloudflare crashed a chunk of the internet with a rust app a month or so ago, deploying a bad config file iirc.
Maybe this?
> Rust isn’t a panacea, it’s a programming language. It’s ok that it’s flawed, all languages are.
Nope, this is fine too.
I didn't downvote, but I feel the last two points show a lack of nuance. It's saying "Rust doesn't prevent 100% of the bugs, like all other programming languages", while failing to acknowledge that if a programming language prevents entire classes of bugs, it's a very significant improvement.
Nobody disputes that Rust is one of the programming languages that prevent several classes of frequent bugs, which is a valuable feature when compared with C/C++, even if that is a very low bar.
What many do not accept among the claims of the Rust fans is that rewriting a mature and very big codebase from another language into Rust is likely to reduce the number of bugs of that codebase.
For some buggier codebases, a rewrite in Rust or any other safer language may indeed help, but I agree with the opinion expressed by many other people that in most cases a rewrite from scratch is much more likely to have bugs, regardless of the programming language it is written in.
If someone has the time to do it, a rewrite is useful in most cases, but it should be expected that it will take a lot of time after the completion of the project until it will have as few bugs as mature projects.
As other people have mentioned, the goal of uutils was not "let's reduce bugs in coreutils by rewriting it in Rust", it was "it's 2013 and here's a pre-1.0 language that looks neat and claims to be a credible replacement for C, let's test that hypothesis by porting coreutils, giving us an excuse to learn and play with a new language in the process". It seems worth emphasizing that its creation was neither ideologically motivated nor part of some nefarious GPL-erasure scheme, it was just some people hacking on a codebase for fun.
Whether or not it was wise for Canonical to attempt to then take that codebase and uplift it into Ubuntu is a different story altogether, but one that has no bearing on the motivations of the people behind the original port itself.
You can see an alternative approach with the authors of sudo-rs. Rather than porting all of userspace to Rust for fun, they identified a single component of a particularly security-critical nature (sudo), and then further justified their rewrite by removing legacy features, thereby producing an overall simpler tool with less surface area to attack in the first place. It was not "we're going to rewrite sudo in Rust so it has fewer bugs", it was "we're going to rewrite sudo with the goal of having fewer bugs, and as one subcomponent of that, we're going to use Rust". And of course sudo-rs has had fresh bugs of its own, as any rewrite will. But the mere existence of bugs does not invalidate their hypothesis, which is that a conscientious rewrite of a tool can result in fewer bugs overall.
But are the current uutils developers the same as the 2013 developers? At least based on GitHub's graphs, that's not the case (it looks fairly bimodal to me), so it wouldn't be unreasonable to treat the 2013-era project differently from the 2020-era project. Judging the 2020-era project for its current and ongoing failures does not seem unreasonable.
Similarly, sudo-rs dropping "legacy" features leaves a bad taste in my mouth. There are multiple privilege escalation tools out there (doas being the first that comes to mind), and doing something better without claiming "sudo" (rather providing a compat mode, à la podman for docker) would seem to me a better long-term path than causing more breakage (and as uutils shows, breakage in "core" utils can very easily lead to security issues).
I personally find uutils' lack of care concerning, because I've been writing (as a very low-priority side project) a network utility in Rust, and while it isn't aiming to be a drop-in rewrite of anything, I would much rather not attract the same drama.
doas and sudo-rs occupy different niches, specifically doas aims for extreme minimalism and deliberately sacrifices even more compatibility than sudo-rs, which represents a middle ground.
> its creation was neither ideologically motivated nor part of some nefarious GPL-erasure scheme
No, they openly refuse to accept any GPL code. And even have a strict policy of not even reading GPL code.
No, once you have an MIT-licensed codebase without a copyright assignment scheme, you no longer have the freedom to relicense it at will. You could attempt to have a mixed-license codebase, which is supported by the GPL, and specify that all new contributions must accept the GPL, but this is tantamount to an incompatible fork of the project from the perspective of any downstream users, and anyone who insists on contributing code under the GPL has the freedom to perform this fork themselves.
This is simply false. You can accept GPL contributions and clearly indicate the names of the contributors as required by MIT. There is no "incompatible fork".
No, GPL and MIT have significantly different compliance requirements. You cannot suddenly begin shipping code with stricter compliance requirements to downstream users without potentially exposing them to legal liability.
> It seems worth emphasizing that its creation was neither ideologically motivated nor part of some nefarious GPL-erasure scheme, it was just some people hacking on a codebase for fun.
What the motivation and intent was in 2013 is not necessarily relevant to what the motivation and intent is now.
It's even less relevant to what the effect is: the goal may be to replace $FOO software with $BAR software, but as things stand right now $FOO is "GPL" and $BAR is "MIT".
So, yeah, I don't want them to succeed at their primary goal, because that replaces pro-user software with pro-business software.
It's not a low bar when C/C++/D are basically the only languages in which you can write certain kinds of programs.
Because the bugs were caused by programmer error, not anything inherent to rust. It was more notable due to cloudflare being a critical dependency for half the internet, but that particular issue could've happened in any language.
This kind of melodramatic reaction to rust code is fatiguing, honestly. Rust does not bill itself as some programming panacea or as a bug free language, and neither do any of the people I know using it. That's a strawman that just won't go away.
Rust applies constraints regarding memory use and that nearly eliminates a class of bugs, provided safe usage. And that's compelling to enough people that it warrants migration from other languages that don't focus on memory safety. Bugs introduced during a rewrite aren't notable. It happens, they get fixed, life moves on.
> caused by programmer error, not anything inherent to Rust
Your argument does not work as praise for Rust, because the bugs in any program are caused by programmer errors, except in the very rare cases where there are bugs in the compiler toolchain, which are caused by other programmers' errors.
The bugs in a C or C++ program are also caused by programmer errors; they are not inherent to C/C++. It is rather trivial to write C/C++ carefully, so as to make any out-of-bounds access, numeric overflow, use-after-free, etc. impossible.
The problem is that many programmers are careless, especially when they might be pressed by tight time schedules, so they make some of these mistakes. For the mass production of software, it is good to use more strict programming languages, including Rust, where the compiler catches as many errors as possible, instead of relying on better programmers.
I'm neither praising nor admonishing Rust. Did you read the parent comment, or its parent's comment that I was responding to, at all?
(grandparent comment): "Cloudflare crashed a chunk of the internet with a rust app a month or so ago"
The actual bug had nothing to do with rust, yet rust is specifically brought up here.
(grandparent comment): "Rust isn’t a panacea, it’s a programming language. It’s ok that it’s flawed, all languages are."
No Rust programmer thinks it's a panacea! Rust has never advertised itself this way.
The Cloudflare bug was the equivalent of an uncaught exception caused by a malformed config file. There's no recovery from a malformed config file; the software couldn't possibly have done its job. What's salient is that they were using an alternative to exceptions, because people were told exceptions were error-prone and that using this thing instead would make it easier to write bug-free code. But don't do the equivalent of not catching them!
And then, it turned out to not really be any better than exceptions.
Most Rust evangelism is like this. "In Rust you do X and this makes your code have fewer bugs!" Well no it doesn't. Manually propagating exceptions still makes the program crash and requires more typing, and doesn't emit a stack trace.
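A minimal sketch of that failure mode (the config format is invented for illustration): a Result-returning parser buys you nothing if the caller unwraps it.

    use std::num::ParseIntError;

    // Parse a comma-separated "config" into numbers.
    fn load_config(raw: &str) -> Result<Vec<u32>, ParseIntError> {
        raw.split(',').map(|s| s.trim().parse()).collect()
    }

    fn main() {
        // The moral equivalent of an uncaught exception: one malformed
        // field becomes a process-wide panic at the unwrap.
        let cfg = load_config("1, 2, oops").unwrap();
        println!("{} entries", cfg.len());
    }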
That was why I brought it up. I wasn't trying to be snarky or haughty. Thank you for filling in the gaps, I should have done that instead of the 1-liner.
If I'm not mistaken, in the Cloudflare case both the Rust rewrite and the original C++ version crashed; the primary cause was the bad config file.
Yes, but the point was that rewriting something in Rust is not sufficient per se to prevent such bugs.
The goal claimed by all these rewrites is the elimination of bugs.
The "elimination of bugs" is not synonymous with "the elimination of all bugs". The way you're presenting it, any single bug in a rewrite would be grounds to consider the entire endeavor a failure, which is a ridiculous standard.
There are plenty of strong arguments to be made against rewriting something in Rust, but this is a pretty weak one.
Thing is, these tools are so critical that even one error may cause systems to be compromised; rewriting them should never be taken lightly.
(Actually, ideally there would be formal verification tools that could accurately test for all of the issues found in this review / audit, like the very timing-specific path changes, but that's a codebase of its own.)
Is formal verification able to find most of these issues? I'm no expert on formal analysis, but I suspect most systems are not able to handle many of these errors. It seems more likely that the system will assume the file doesn't change between two syscalls, which is where the majority of the issues come from. Modeling that possibility at least makes the formal model much harder to build.
Seems pretty impressive they rewrote the coreutils in a new language, with so little Unix experience, and managed to do such a good job with very little bugs or vulns. I would have expected an order of magnitude more at least.
Shows how good Rust is, that even inexperienced Unix devs can write stuff like this and make almost no mistakes.
Yes, it's the lack of Unix experience that's terrifying. So many of the mistakes listed are rookie mistakes, like not propagating the most severe errors, or the `kill -1` thing. Why were people who apparently did not have much experience using coreutils assigned to rewrite coreutils?
> Why were people who apparently did not have much experience using coreutils assigned to rewrite coreutils?
From what I understand, "assigned" probably isn't the best way to put it. uutils started off back in 2013 as a way to learn Rust [0] way before the present kerfuffle.
[0]: https://github.com/uutils/coreutils/tree/9653ed81a2fbf393f42...
Yeah, perhaps learning UNIX APIs and Rust at the same time doesn't lead to a drop-in replacement ready to be shipped in major distributions. Who would have thunk it.
Strictly speaking it doesn't preclude eventually producing a production-ready drop-in replacement either, though evidently that needs a fresh set of eyes.
exactly this. I wrote one of them back then as a learning experience. some of the code I wrote is still intact, incredibly.
Why is it even possible to represent a negative PID, let alone treat the integer -1 as a PID meaning "all effective processes"? This seems like a mistake (if not a rookie mistake) in the Linux kernel API itself.
Pretty much all the rough edges being discussed here are design mistakes in Linux or Unix, and/or a consequence of using an unsafe language with limited abstractions and a weak type system. But because of ubiquity, this is everyone’s problem now.
You are right, but those who set for themselves the goal to substitute a Linux/UNIX package must implement programs that handle correctly all the quirks of the existing Linux/POSIX specifications.
If they do not like the design mistakes, great, they should set for themselves the goal to write a new operating system together with all base applications, where all these mistakes are corrected.
As long as they have not chosen the second goal, but the first, they are constrained by the existing interfaces and they must use them correctly, no matter how inconvenient that may be.
Anyone who learns English may be frustrated by many design mistakes of English, but they must still use English as it is spoken by the natives, otherwise they will not be understood.
-1 is a special case, a way to represent a PID with all bits set in a platform-independent way. It's not very clean, and it comes from ancient times when writing some extra code and storing an extra few bytes was way more expensive.
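For reference, kill(2) overloads its pid argument four ways. A small sketch, assuming the libc crate; signal 0 only performs the permission check, so the -1 case is safe to poke at:

    // pid >  0   signal that single process
    // pid == 0   signal the caller's process group
    // pid == -1  signal every process the caller has permission to signal
    // pid < -1   signal the process group with pgid == -pid
    fn main() {
        let rc = unsafe { libc::kill(-1, 0) }; // check-only: nothing delivered
        println!("kill(-1, 0) -> {rc}");
    }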
No, -1 is simply the process group with pgid 1:
https://stackoverflow.com/questions/392022/whats-the-best-wa...
The problem is that -DIGIT doubles as both "signal number" and process group. The right way to invoke kill for a process group however would be "kill [OPTS]... -- -PGID".
It feels a bit like a "better is better" language hitting all of the quirks of a "worse is better" environment.
Rewriting perfectly good code was a colossal mistake.
Not necessarily, but was the reasoning sound, and were the tradeoffs weighed? The website (https://uutils.github.io/) shows some reasonable "why"s, although I disagree with making "Rust is more appealing" a compelling reason, but that's just me (disclaimer: I don't like C and don't know Rust, so take this comment as you will). What I think is missing is how they will ensure both compatibility and security / edge-case handling, which requires deep knowledge and experience of the original code and "tribal knowledge" of deep *nix internals.
I do wonder whether people got far enough down the article to see the list of bugs patched in GNU coreutils.
That "perfectly good code" that it sounds like no one should question included "split --line-bytes has a user controlled heap buffer overflow".
Yes, perfectly good code can have bugs. It is ridiculous thinking to scrap a codebase because it's not bug-free, only to replace it with one riddled with differences in behavior that break everything that uses it.
The irony here being that GNU's coreutils themselves originated as rewrites, from back when BSD's copyright status was still legally unclear.
Understandable as GNU was founded on software freedom. I guess one could argue that the Rust rewrite is to establish some kind of higher standard for correctness.
Memory safety catches buffer overflows. CI catches logic bugs. Neither catches the Unix API gotchas nobody documented.
LLM account
CI catches all kinds of bugs.
How does CI catch logic bugs?
That depends on what tests you are running. In any significant project you need a test suite so large that you wouldn't run all the tests before pushing to CI; instead you run the targeted tests that test the area of code you changed, but there are more "integration tests" that go through your code and thus could break, which you don't actually run locally.
You can also run static analysis that takes too long to run locally every time; once in a while it will point out "this code pattern is legal but is almost always a bug".
It is also possible to do some formal analysis of code on CI that you wouldn't always run locally - I'm not an expert on these.
That's true in general. In this case where the logic bugs are from not understanding the API being implemented (and in any similar case), tests wouldn't catch the bugs either (even integration tests) because good tests require understanding the contract of the unit being tested.
They're not API gotchas in most cases.
And writing comprehensive tests for this behaviour is very difficult regardless of which language you are using.
I am all for rust rewrites of things. But in this case, these are mistakes which were encouraged by the lazy design of `std::fs` and the developers' lack of relevant experience.
And to clarify, I don't blame the developers for lacking the relevant experience. Working on such a project is precisely the right place to learn stuff like this.
I think it's an absurdly dumb move by Canonical to take this project and beta-test it on normal users' machines though…
Reading that Canonical thread was jaw-dropping. Paraphrased: "Rust is more secure, security is our priority, therefore deploying this full-rewrite of core utils is an emergency. If things break that's fine, we'll fix it :)".
I would not want to run any code on my machines made by people who think like this. And I'm pro-Rust. Rust is only "more secure" all else being equal. But all else is not equal.
A rewrite necessarily has orders of magnitude more bugs and vulnerabilities than a decades-old well-maintained codebase, so the security argument was only valid for a long-term transition, not a rushed one. And the people downplaying user impact post-rollout, arguing that "this is how we'll surface bugs", and "the old coreutils didn't have proper test cases anyway" are so irresponsible. Users are not lab rats. Maintainers have a moral responsibility to not harm users' systems' reliability (I know that's a minority opinion these days). Their reasoning was flawed, and their values were wrong.
Agree with the point. Asking sincerely: how do I avoid installing any Rust-rewrite packages on my machines? Does anyone know a way?
If you don't want Canonical's packages, you should probably just be using Debian rather than Ubuntu. It's not 2008 anymore, stock Debian is quite user-friendly.
Worth noting is that in Debian experimental coreutils defaults to coreutils-from-uutils [0]. This came as a big surprise and as far as I can tell there's been no discussion. A Canonical developer seems to have unilaterally overwritten the coreutils package without discussing with the maintainer. All the package renames that are in Ubuntu aren't in Debian so you can't switch to GNU utils either without deep trickery in a separate recovery environment.
I'm used to running experimental software but I wasn't ready for my computer to not boot one day because of uutils. The `-Z` flag for `cp` wasn't implemented in the 9 month old version shipped in Debian at that time so initramfs creation failed...
[0] https://packages.debian.org/experimental/coreutils
that... seems newsworthy on its own merit.
It's in experimental only, not unstable or testing. That said, I'm surprised it hasn't even prompted discussion on debian-devel (sans [0]). I would've thought that at least enough Debian developers run experimental to have noticed and raised the issue, but no. I thought about starting a thread myself but couldn't be bothered.
[0] https://lists.debian.org/debian-devel/2026/04/msg00004.html
Considering how Ubuntu seems to influence Debian development, this is only slightly surprising.
See: https://lists.debian.org/deity/2025/10/msg00071.html - Hard Rust requirements from May onward - by a Core Ubuntu Developer
Or use a sane distribution like Arch or Gentoo instead of Ubuntu based systems.
Alpine Linux has a better shot at acceptable compile times.
Some FOSS software seemed to max out kernel IO, last time I ran Gentoo.
Or Fedora.
I feel like Fedora has the same pragmatic approach (allows non-free drivers, packages, etc.) and is just as easy to use.
I'm unaware of any Rust rewrites outside of coreutils, so:
https://computingforgeeks.com/ubuntu-2604-rust-coreutils-gui...
There aren't true 1:1 clones, but there's ripgrep (inspired by GNU grep) and fd (inspired by GNU find). Those two I like, though. I think they're thoughtfully designed and in ripgrep's case at least (I just haven't read posts/comments by fd's author), it was developed with some close study of other grep implementations. I still use GNU grep and GNU find as well, but rg and fd are often nice for me.
The other nice thing about rg and fd is that they work natively on Windows.
This leaves such a bad taste in my mouth. If you fucking found 44 CVEs with some relatively amateurish ones (I'm no security engineer but even I've done that exact TOCTOU mitigation before) in such a core component of your system a month before 26.04 LTS release (or a couple months if you count from their round 1), surely the response should be "we need to delay this to 28.04 LTS to give it time to mature", not "we'll ship this thing in LTS anyway but leave out the most obviously problematic parts"?
The snap BS wasn't enough to move me since I was largely unaffected once stripping it out, but this might finally convince me to ditch.
Ubuntu has been doing careless shit like that their entire existence, it's nothing new
It's insane that this is going into an LTS. It's the kind of experiment I'd expect them to play with in a non-LTS and revert in LTSes until it's fully usable, like they did with Wayland being the default, which started in 2017
This is a people problem and Canonical just isn't good at hiring people
I’ve gotta agree. Some horror stories were going around about their interview process. It seemed highly optimized to select people willing to put up with insane top-down BS.
> They knew how to write Rust, but clearly weren't sufficiently experienced with Unix APIs, semantics, and pitfalls.
The point of Rust is that you shouldn't have to worry about the biggest, easiest to fall in pitfalls.
I think the author's point in this article is that a proper file system API should do the same.
Having panics in these is pretty amateur hour, even just at the Rust level. I could see it if they were alloc errors, which you can't handle, but expects and unwraps are inexcusable unless you are very carefully guarding them with invariants that prevent that code path from ever running.
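For comparison, the propagating version is rarely more code. A sketch, with the error type simplified to io::Result:

    use std::{fs, io};

    // Panicking: fs::metadata(path).unwrap().len()
    // Propagating: the caller decides what a missing file means.
    fn file_len(path: &str) -> io::Result<u64> {
        Ok(fs::metadata(path)?.len())
    }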
One thing that's hard about rewriting code is that the original code was transformed incrementally over time in response to real world issues only found in production.
The code gets silently encumbered with those lessons, and unless they are documented, there's a lot of hidden work that needs to be done before you actually reach parity.
TFA is a good list of this exact sort of thing.
Before you call people amateur for it, also consider it's one of the most softwarey things about writing software. It was bound to happen unless coreutils had really good technical docs and included tests for these cases that they ignored.
What's even harder is doing that while trying to avoid the GPL, so doing that without reading the original source code.
uutils would be so much better imo if it was GPL and took direct inspiration from the coreutils source code.
The GPL prevents you from reading the licensed code before writing related non-GPL code? Which section of the GPL says that?
It's based on an interpretation of "derived from".
It does not matter if it's in the GPL explicitly or not since we're talking about uutils and their stance on it, and they've written that:
https://github.com/uutils/coreutils/blob/6b8a5a15b4f077f8609...
> we cannot accept any changes based on the GNU source code [..]. It is however possible to look at other implementations under a BSD or MIT license like Apple's implementation or OpenBSD.
The wording of that clearly implies that you should not look at GNU source code in order to contribute to uutils.
"clearly implies"
Hmmmm....
"we cannot accept any changes based on the GNU source code" is false. They are choosing not to accept it.
"We cannot accept it without issuing a breaking change to the project by significantly changing the license terms."
This is clean room implementation 101, and why LLMs are so controversial in terms of licensing.
good example from the article: the chroot+nss CVE. the rule that nss is dynamic and dlopens libraries from inside the chroot isn't anywhere obvious. it's encoded in 25+ years of sysadmins finding it out. clean-room rewrites end up re-learning that, usually as new CVEs. and LLM ports of the same code inherit the problem: the function signature is what they read, but the scars are what they need.
> the function signature is what they read, but the scars are what they need.
This feels like a golden quote. Don't know if you intended for it to rhyme, but well done :D
thanks. honestly didn't catch the rhyme, accidental aphorism :D
> The code gets silently encumbered with those lessons, and unless they are documented, there's a lot of hidden work that needs to be done before you actually reach parity.
It should be stressed that failure to document such lessons, or at least the bugs/vulnerabilities avoided, is poor practice. Of course one can't document the bugs/vulnerabilities one has avoided implicitly by writing decent code to begin with, but it is important to share these lessons with the future reader, even if that means "wasting" time and space on a bunch of documentation such as "In here we do foo instead of bar because when we did bar in conditions ABC then baz happens which is bad because XYZ."
I struggle to find anything in this post that wouldn't be caught by some kind of unit test or manual review, especially when comparing against the GNU source for the coreutils. The whole coreutils rewrite is a terrible idea[1], and it is clearly being done the wrong way (without the knowledge gained from the previous software).
If you do a rewrite, you should fully understand and learn from the predecessor, otherwise you're bound to repeat all its mistakes. Embarrassing.
To be clear: I love Rust, I use it for various projects, and it's great. It doesn't save you from bad engineering.
[1]: https://www.joelonsoftware.com/2000/04/06/things-you-should-...
I expect nothing less from the creators of unity, upstart, and snap.
> I struggle to find anything in this post that wouldn't be caught by some kind of unit test or manual review, especially when comparing against the GNU source for the coreutils.
> If you do a rewrite, you should fully understand and learn from the predecessor, otherwise you're bound to repeat all its mistakes. Embarrassing.
Interestingly, the uutils project uses the GNU coreutils test suite.
EDITED to add: they also have a stated position of not allowing contributions based on reading the GPL'd source.
welcome new systems programmers: unix is broken and you must write ugly non-pedagogical workarounds and do empirical testing. this is what reliable software and good software engineering actually is... surprise!@#%
> The pattern is always the same. You do one syscall to check something about a path, then another syscall to act on the same path. Between those two calls, an attacker with write access to a parent directory can swap the path component for a symbolic link. The kernel re-resolves the path from scratch on the second call, and the privileged action lands on the attacker’s chosen target.
It's actually somewhat worse than that, because an attacker with write access to a parent directory can mess with hard links as well... sure, that only affects the regular files themselves, but there are basically no mitigations. See e.g. [0] and other posts on that site.
[0] https://michael.orlitzky.com/articles/posix_hardlink_heartac...
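To make the quoted pattern concrete, here is a minimal sketch of the check-then-use shape in std::fs terms (illustrative only, not actual uutils code):

    use std::{fs, io};

    fn remove_if_regular(path: &str) -> io::Result<()> {
        let meta = fs::symlink_metadata(path)?; // check (doesn't follow symlinks)
        if meta.file_type().is_file() {
            // window: an attacker with write access to a parent directory
            // can swap a component for a symlink right here
            fs::remove_file(path)?; // use: the kernel re-resolves the whole path
        }
        Ok(())
    }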
hmm... maybe a 'write lock' on the directory? though this will become more hairy without timeouts/etc...
To the extent that locking exists in POSIX, it is various degrees of useless and broken. And as far as I know, while the BSDs have extensions which make some use cases workable, Linux is completely hopeless.
The root cause of some of the bugs seems to be the opaque nature of some of the Unix API. E.g.
> The trap is that get_user_by_name ends up loading shared libraries from the new root filesystem to resolve the username. An attacker who can plant a file in the chroot gets to run code as uid 0.
To me such a get_user_by_name function is like a booby trap, an accident waiting to happen. You need user data, you have this get_user_by_name function, and then it goes and starts loading shared libraries. This smells like mixing of concerns to me. I'd say, either split getting the user data and loading any shared libraries in two separate functions, or somehow make it clear in the function name what it is doing.
> The root cause of some of the bugs seems to be the opaque nature of some of the Unix API.
"Seems" and "smells" are weasel words. The root cause is not thinking: Why is root chrooting into a directory they do not control?
Whatever you chroot into is under control of whoever made that chroot, and if you cannot understand this you have no business using chroot()
> To me such a get_user_by_name function is like a booby trap
> I'd say, either split getting the user data and loading any shared libraries in two separate functions, or somehow make it clear in the function name what it is doing.
You'd probably still be in the trap: there's usually very little difference between writing to newroot/etc/passwd and newroot/usr/lib/x86_64-linux-gnu/libnss_compat.so or newroot/bin/sh or anything else.
So I think there's no reason for /usr/sbin/chroot to look up the user id in the first place (toybox chroot doesn't!), so I think the bug was doing anything at all.
> The root cause is not thinking: Why is root chrooting into a directory they do not control?
Because you can't call chroot(2) unless you're root. And "control a directory" is weasel words; root technically controls everything in one sense of the word. It can also gain full control (in a slightly different sense of the word) over a directory: kill every single process that's owned by the owner of that directory, then don't setuid into that user in this process and in any other process that the root currently executes, or will execute, until you're done with this directory. But that's just not useful for actual use, isn't it?
Secure things should be simple to do, and potentially unsafe things should be possible.
> And "control a directory" is weasel words;
I did not choose the term to confuse you, that's from the definition document linked to the CVE:
https://cwe.mitre.org/data/definitions/426.html
The CVE itself uses the language "If the NEWROOT is writable by an attacker" which could refer to a shared library (as indicated in the report), or even a passwd file as would have been true since the origin of chroot()
> root technically controls everything in one sense of the word.
But not the sense we're talking about.
> Because you can't call chroot(2) unless you're root
Well you can[1], but this is /usr/sbin/chroot aka chroot(8) when used with a non-numeric --userspec, and the point is to drop root to a user that root controls with setuid(2). Something needs to map user names to the numeric userids that setuid(2) uses, and that something is typically the NSS database.
Now: Which database should be used to map a username to a userid?
- The one from before the chroot(2)?
- Or the one that you're chroot(2)ing into
If you're the author of the code in question, you chose the latter, and that is totally obvious to anyone who can read, because that's the order the code appears in; but it's also obvious that only the *first one* is under control of root, and so only the first one could be correct.
[1]: if you're curious: unshare(CLONE_USERNS|CLONE_FS) can be used. this is part of how rootless containers work.
> Well you can[1],
No, you can't, it's an entirely different syscall that does something vaguely similar. IMHO there are a bit too many root-restricted operations that should not have been; but they are, so we're stuck with setuid-enabled "confused deputies" — arguably, it's the root that should be prohibited from calling chroot(2).
> Now: Which database should be used to map a username to a userid? If you're the author of the code in-question, you chose the latter
That's the problem: the choice is implicit. If the author moved setuid/setgid calls way up in the call order, the implicit choice would've also been the safe one but it was literally impossible.
> unshare(CLONE_USERNS|CLONE_FS) can be used
Wait, CLONE_USERNS? That's not a real flag. Did you mean CLONE_NEWUSER?
> Did you mean CLONE_NEWUSER? [~] it's an entirely different syscall that does something vaguely similar
Yes. And I agree, but it also enables chroot(2) to work without being root, which was the syscall we are talking about, and which I still maintain is not as important as reading.
> arguably, it's the root that should be prohibited from calling chroot(2).
> IMHO there are a bit too many root-restricted operations that should not have been
It's a popular opinion. It's also cheap. So what?
> so we're stuck with setuid-enabled "confused deputies"
chroot(8) is not setuid-enabled. This has nothing to do with anything.
> That's the problem: the choice is implicit. If the author moved setuid/setgid calls way up in the call order, the implicit choice would've also been the safe one but it was literally impossible.
False. The setuid/setgid calls are in the right place. The lookup of the database mapping usernames to userids is in the wrong place.
If the rust programmer just read what they wrote they would see this.
If you just read what they wrote you would see this.
Yes, that's one thing musl libc removes.
If the attacker can control newroot/etc/passwd they _still_ get getpwnam to return whatever userid they want. The solution is to not lookup --userspec=username:group inside the chrooted-space, but from outside.
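In code, the fix is pure ordering. A sketch on the raw libc crate, with error handling collapsed into asserts:

    use std::ffi::CString;

    // Resolve the username with the host's NSS configuration *before*
    // entering the new root, then chroot, then drop privileges.
    unsafe fn chroot_and_drop(newroot: &str, user: &str) {
        let cuser = CString::new(user).unwrap();
        let pw = libc::getpwnam(cuser.as_ptr()); // lookup happens OUTSIDE the chroot
        assert!(!pw.is_null(), "unknown user");
        let (uid, gid) = ((*pw).pw_uid, (*pw).pw_gid);

        let croot = CString::new(newroot).unwrap();
        assert_eq!(libc::chroot(croot.as_ptr()), 0);
        let slash = CString::new("/").unwrap();
        assert_eq!(libc::chdir(slash.as_ptr()), 0);

        assert_eq!(libc::setgid(gid), 0); // gid first,
        assert_eq!(libc::setuid(uid), 0); // or you can no longer drop it
    }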
Also, hi how's things? :)
hi! good, how are you doing?
great. still enjoying the algarve working on my secret projects in the sun.
you able to find a reason to come visit? or am i going to have to come to blighty so we can hang out?
> The root cause of some of the bugs seems to be the opaque nature of some of the Unix API.
Some, maybe, but if you've decided to rewrite coreutils from scratch, understanding the POSIX APIs is literally your entire job.
And in any case, their test for whether a path was pointing to the fs root was `file == Path::new("/")`. That's not an API problem, the problem is that whoever wrote that is uniquely unqualified to be working on this project.
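A more robust check compares filesystem identity rather than path spelling. A sketch (Unix-only, identity via st_dev/st_ino):

    use std::os::unix::fs::MetadataExt;
    use std::path::Path;
    use std::{fs, io};

    // "/", "//" and any other alias of the root all share (dev, ino).
    fn is_fs_root(path: &Path) -> io::Result<bool> {
        let (a, b) = (fs::metadata(path)?, fs::metadata("/")?);
        Ok(a.dev() == b.dev() && a.ino() == b.ino())
    }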
Interestingly, it looks like the `file == Path::new("/")` bit was basically unchanged from when it was introduced... 12 (!) years ago [0] (though back then it was `filename == "/"`). The change from comparing a filename to a path was part of a change made 8 months ago to handle non-UTF-8 filenames.
> That's not an API problem, the problem is that whoever wrote that is uniquely unqualified to be working on this project.
To be fair, uutils started out with far smaller ambitions. It was originally intended to be a way to learn Rust.
[0]: https://github.com/uutils/coreutils/commit/7abc6c007af75504f...
> Some, maybe, but if you've decided to rewrite coreutils from scratch, understanding the POSIX APIs is literally your entire job.
Yes, it is. But still, such traps in an API are just unacceptable. If you design an API that requires obscure knowledge to use correctly, and getting it wrong hands out privilege escalation, it is just... just... I have no words for it. It is beyond stupidity. You are just making sure that your system will get these privilege escalations, and not just once, but multiple times.
No one is under any impression (or should be) that the POSIX API isn't old and legacy. That's not why we still use it.
Rather, I think that using a safe, functional-style language tricks people into thinking that the data it deals with is stateless. Whereas many, many things in operating systems change all the time.
Until we have a filesystem that can present a snapshot, everything has to checked all the time.
i.e. we need an API which gives input -> good result or failure. Not input -> good result or failure or error.
Unix and POSIX are fractally a booby trap.
OK, so there were some Rust guys rewriting coreutils with no experience in Linux; but how come Ubuntu accepted it into its mainline?
Because it's Ubuntu policy to replace some foundational part of the system with some janky unfinished experiment in every release.
I agree with you that that's more the story here than "OMG, somebody wrote Rust code with bugs in it".
Right? Canonical wanted (still wants?) to use a coreutils implementation where "rm ./" would print "invalid input" while silently deleting the directory anyway.
I don't really care that some very amateur enthusiasts wrote some bad code for fun, but how in the world did anyone who knows anything about linux take this seriously as a coreutils replacement?
The original is GPL licensed, while the rewrite is MIT.
Was it actually so important to rush the switch?
I'm totally fine with people experimenting and making amateur attempts at what adult people do. After all, that's how we grow. What I'm actually curious about is how the decision-making chain at Ubuntu got so messed up that this made it into production.
Sometimes growing is only your height increasing
> What’s notable is that all of these bugs landed in a production Rust codebase, written by people who knew what they were doing
So does this mean that the original utils didn't have any test harness, and that the process of rewriting them didn't start by creating one either?
Sure there are many edge cases, but surely the OS and FS can just be abstracted away and you can verify that "rm .//" actually ends up doing what is expected (Such as not deleting the current directory)?
This doesn't seem like sloppy coding, nor a critique of the language, it's just the same old "Oh, this is systems programming, we don't do tests"?
Alternatively: if the original utils _did_ have tests, and there were this many holes in the tests, then maybe there is a massive lack in the original utils test suite?
My understanding is the uutils development process involved extensive testing against the behaviour of the original utilities, including preserving bugs.
But we still have CVEs for trivial things? I mean, just a medium-sized test suite for "rm" alone should probably contain many thousands of test cases. And you'd think that deleting "." and "./" respectively would be among them? Hindsight is always 20/20, and for inputs involving text you can never be entirely covered, but still...
If something as basic as "rm ./" is broken, the word "extensive" does not apply to whatever testing there was.
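And the kind of black-box case being asked for is cheap to write. A sketch, relying on rm's documented refusal to operate on "." and "..":

    use std::process::Command;

    #[test]
    fn rm_refuses_dot_slash() {
        let dir = std::env::temp_dir().join("rm-dot-slash-test");
        std::fs::create_dir_all(&dir).unwrap();

        let out = Command::new("rm")
            .args(["-rf", "./"])
            .current_dir(&dir)
            .output()
            .expect("failed to spawn rm");

        assert!(!out.status.success(), "rm ./ must fail loudly");
        assert!(dir.exists(), "and the current directory must survive");
    }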
> So does this mean that the original utils didn't have any test harness, and that the process of rewriting them didn't start by creating one either?
Yes.
> Sure there are many edge cases, but surely the OS and FS can just be abstracted away and you can verify that "rm .//" actually ends up doing what is expected (Such as not deleting the current directory)?
I think people have been trying that since before I was born and haven't yet been successful, so I am much less sure than you are.
For example: How do you decide how many `/` characters to try?
For a better one: Can you imagine if "rm" could simply decide to refuse to delete files containing "important" as first 9 bytes? How would you think of a test for something like that without knowing the letters in that order? What if the magic word wasn't in a dictionary?
> This doesn't seem like sloppy coding, nor a critique of the language, it's just the same old "Oh, this is systems programming, we don't do tests"?
I've never heard anyone say that except as a straw man.
I've heard people say tests don't do what people think they do.
> Sure there are many edge cases, but surely the OS and FS can just be abstracted away and you can verify that "rm .//" actually ends up doing what is expected?
This is one reason why Windows disables symlinks by default, and it's not an abstraction but wholesale removal of a feature. Unixes can't do that without breaking decades of software that relies on their existence.
macOS does something similar; for example, the chroot() bug isn't an issue in practice because macOS forbids chroot() by default (you need to disable System Integrity Protection).
The fundamental problem is caused by the POSIX APIs. They have sharp edges by their very nature. The "fix" is to remove them.
To be fair these are mostly gotchas with Linux and not Rust itself, but I guess the std in Rust could handle some of these issues, in that a std should not allow you to shoot yourself in the foot by default.
That’s a great article, and indeed a very good blog. Just spent ages reading lots of their other articles.
Of the bugs mentioned I think the most unforgivable one is the lossy UTF conversion. The mind boggles at that one!
> These are noisy in test code where panicking on bad data is exactly what you want. The cleanest way to scope them to non-test code is to put #![cfg_attr(test, allow(clippy::unwrap_used, clippy::expect_used, clippy::panic, clippy::indexing_slicing, clippy::arithmetic_side_effects))] at the top of each crate root, or to gate #[allow(...)] on the individual #[cfg(test)] modules.
Surely there's a better way.
Clippy doesn't even run on unit tests by default. Honestly, it doesn't seem very useful to have it do so for ordinary development, but maybe you'd want to run Clippy on your unit tests in CI just to be extra safe, in which case you could encode those allowed lints in the line of your CI config where you run `cargo clippy`, e.g. `cargo clippy --all-targets -- -A clippy::unwrap_used -A clippy::expect_used -A clippy::panic -A clippy::indexing_slicing -A clippy::arithmetic_side_effects`, if you really didn't want to have them in the source for whatever reason.
Delaying the run of Clippy until CI would be annoying, because then you'd get a build failure for something that was preventable and could have been quickly addressed during development, before pushing. Just feels like a pebble in your shoe.
I have to partially disagree with applying Hyrum's law here. In the case of coreutils, there's not just the common GNU version; there's also what POSIX says the tools should do and what the various BSDs do, plus some other vendor implementations that we mostly forget about. If what this version does differs from GNU in a case where the others also differ, breaking the behavior is a good thing: anyone's script relying on it is already wrong in ways that are going to matter in the real world, and it may matter in the future anyway, so breaking it now is good. If your script depends on GNU's behavior, then you shouldn't be calling the standard version; you should be explicitly specifying the GNU version. That is, don't use cp, use GNU cp or whatever name it is commonly installed under. Or check which version of cp you have.
But if you seek to replace coreutils (as at least is the case with Canonical it seems), rather than just be another POSIX userland implementation (e.g. busybox), then I would suggest you do need to be bug-compatible? I can apt/dnf/apk install busybox and use that for my user rather than coreutils, but given a significant amount of Linux infrastructure (including likely many personal scripts) are tied to coreutils, the bar is much higher. Given the numerous issues with quality Canonical has had, not just with Ubuntu but their other "commercial" tooling, I'm not sure any rewrite/port, written in rust or otherwise, with Canonical developing, managing, or even being associated with the project can meet the requisite bar.
As someone who prefers BSD, I would make it my goal to become something reasonably popular on Linux that isn't gratuitously different, precisely to force less reliance on the GNUisms in their core utils. Nothing wrong with the GNUisms on the command line, but there are a lot of GNU assumptions in scripts that ought to be portable.
Thanks for the list. I like these lists, so I can put them into a .md file, then launch "one agent per file" on my codebase and see if they can find anything similar to the mentioned CVEs.
Rust won't catch it, but now the agents will.
Edit: https://gist.github.com/fschutt/cc585703d52a9e1da8a06f9ef93c... for anyone who wants to copy this
Most (if not all) of these issues do not matter at all outside the scope GNU utils run in.
For example, using file paths instead of FDs does not matter in most cases in controlled server environments, or in processes that will never run with elevated privileges (most apps).
> Most (if not all) of these issues do not matter at all outside the scope GNU utils run in.
I suspect that attitude is how we got ourselves into this mess.
You have to assume you ultimately don't control what scope your software runs in. Obviously you do, 99.999% of the time. The other 0.001% is when someone has found another vulnerability that lets them run your program with elevated privileges in an environment you didn't expect, and then they can use it to exploit one of these bugs. Almost all exploits use a chain of vulnerabilities, each one seemingly mostly harmless; your "no one can ever exploit this weakness in my program because I control the environment" will be just one step in the chain.
That sounds far-fetched. And it is far-fetched, in the sense that it almost never happens; but nonetheless, systems were and are exploited because of it. Once the solution was added in 2006 (openat() and friends), it should never have happened again. And indeed, in the GNU utils it can't.
The people who built Rust's std::fs should have been aware of the problem and its solution, because std::fs was written in 2015. std::path was written at the same time, and that is where the change has to be made. It's not a big change either: std::path has to translate the path into an OS descriptor and use that instead of the path, but only where one is available. I suspect the real issue is that they had the same attitude as you: they thought it affects such a small percentage of programs that it didn't really matter. That, and it's a little bit of extra work.
It was a pity they had that attitude, because the extra work would have avoided this mess.
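For the curious, the descriptor-relative shape looks like this. A sketch on the raw libc crate, since as far as I know std::fs still exposes no *at API:

    use std::ffi::CString;
    use std::io;

    // Resolve `name` relative to a directory descriptor we already hold.
    // Only `name` is resolved by the kernel, so swapping a parent path
    // component between two calls no longer redirects the operation.
    fn open_under(dirfd: libc::c_int, name: &str) -> io::Result<libc::c_int> {
        let cname = CString::new(name)
            .map_err(|_| io::Error::from(io::ErrorKind::InvalidInput))?;
        let fd = unsafe {
            libc::openat(dirfd, cname.as_ptr(), libc::O_RDONLY | libc::O_NOFOLLOW)
        };
        if fd < 0 { Err(io::Error::last_os_error()) } else { Ok(fd) }
    }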
> The trap is that get_user_by_name ends up loading shared libraries from the new root filesystem to resolve the username.
That's kind of horrifying. Is there a reliable list somewhere of all the functions that do that? Is that list considered stable?
Nope! But basically, expect anything that resolves usernames, or host names, to be done in the userspace by NSS.
It's by design, you see.
This is precisely why I don't link with glibc anymore.
musl has its own approach to this, it's called nscd
It would have avoided the "running code as root" part, but it would still allow an attacker to control the result of the function call.
I mean, the problem being solved here isn't exactly a bad problem to try to solve. You either permanently hard-code `/etc/passwd` as the user database, and `/etc/resolv.conf` as the source of DNS server information, or you allow these to be handled in a more complex way (thus allowing YellowPages, LDAP, or whatever you can imagine).
Obviously, if you tie the ability to handle those things to your filesystem layout, either by loading dynamic libraries from whatever is /usr/lib, or by reading /etc/whatever.conf, or even providing a whole virtual mount à la /proc, chroot'ing gives you both with the ability to override the system-wide policy for yourself (pretty reasonable for DNS lookups, kinda dubious for username lookups) and the opportunity to accidentally pwn yourself.
Frankly, sometimes I feel that on Linux, root should be restricted to executing/loading only a whitelist of executables/shared objects, identified by hash of the contents, not by file paths. But then again, you'd need an allow_for_root(1) utility to maintain this whitelist, and people absolutely would call it in their setup scripts in all kinds of dubious manners.
So it's basically failing on:
- necessary atomicity for filesystem operations
- annoying path & string encoding
- inertia for historical behaviors
I'm comfortable saying that "annoying path & string encoding" is encompassed by "inertia for historical behaviors". :P
The "kill -1" one is hilarious. I wouldn't use Ubuntu in production for quite a while, while things shake out, or, probably, ever (since I don't use Ubuntu).
Why did differential fuzzing not catch these bugs?
https://github.com/uutils/coreutils/tree/main/fuzz/uufuzz
Looks like it doesn't really fuzz much.
https://github.com/uutils/coreutils/tree/main/fuzz/fuzz_targ...
Maybe these tests aren't even fuzz tests?
https://github.com/uutils/coreutils/blob/main/fuzz/fuzz_targ...
Even the tests that look OK are not that good, in my opinion, because there is no structure to them:
https://github.com/uutils/coreutils/blob/main/fuzz/fuzz_targ...
It should also try to generate mostly correct but slightly wrong inputs instead of just dumping random data into it.
It also seems that some fuzz tests aren't even expected to pass in CI:
https://github.com/uutils/coreutils/blob/a07879b8ab2bb8fe5e0...
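A structure-aware target would not be much more work than the raw-bytes kind. A sketch with libfuzzer-sys and the arbitrary crate; run_cp is a hypothetical stand-in for the real entry point:

    use arbitrary::Arbitrary;
    use libfuzzer_sys::fuzz_target;

    // Mostly-correct inputs: real flags, arbitrary-but-plausible operands.
    #[derive(Arbitrary, Debug)]
    struct CpCase {
        recursive: bool,
        no_clobber: bool,
        src: String,
        dst: String,
    }

    // Hypothetical stand-in for the utility's real library entry point.
    fn run_cp(_argv: &[String]) -> i32 { 0 }

    fuzz_target!(|case: CpCase| {
        let mut argv = vec!["cp".to_string()];
        if case.recursive { argv.push("-r".into()); }
        if case.no_clobber { argv.push("-n".into()); }
        argv.push(case.src);
        argv.push(case.dst);
        // Differential idea: run the same argv through GNU cp and compare
        // exit codes and resulting tree state, not just "didn't panic".
        let _ = run_cp(&argv);
    });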
Unrelated, but also in the category of bugs Rust won't catch (natively): there are crates that allow C++-style contracts or, more generally, dependent typing, and these can be used to catch issues at compile time rather than runtime. I use this one, anodized.
https://docs.rs/anodized/latest/anodized/
What do you think about the mental load and ergonomics this brings into the code? And the compile-time increase?
> The Python one-liner is there because most modern shells refuse to create a non-UTF-8 filename for you.
Both `echo -ne 'weird\xffname\0' > list0` and `printf 'weird\xffname\0' > list0` seem to work fine for me on Linux. Is this macOS-specific?
> Both `echo -ne 'weird\xffname\0' > list0` and `printf 'weird\xffname\0' > list0` seem to work fine for me on Linux. Is this macOS-specific?
Neither of those create a non-UTF-8 filename. (Both files are named "list0", which is valid UTF-8.) They have non-UTF-8 content, but that's not weird.
But it's not too hard to get a non-UTF-8 filename, e.g.:
  touch $'weird\xffname'
Both zsh & bash support that syntax.
(You could also use command substitution with printf, as in `touch "$(printf 'weird\xffname')"`, but that's more steps than necessary. So, something closer to your example would be:
  printf x > $'weird\xffname'
You can't put a \0 in the filename, as there's no way to pass that string in C.)
Note:
TOCTOU means "Time-of-check to time-of-use"
See also: https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use
I wonder if Rust becomes more popular with AI, since Rust can help catch what AI misses. But if that's the case, then what about Haskell, or Lean, or...?
The way Haskell handles memory is weird and can be unpredictable.
For core system functionality, maybe. But for most applications, Rust's slow compiler iteration speed becomes a bottleneck when the likes of TypeScript (with Bun) and Go have sub-second iteration times.
Plus, AI is also good at catching, in other languages, the errors that Rust tooling enforces: race conditions, use-after-free, buffer overflows, lifetimes, etc.
So maybe AI will become the ultimate "Rust checker" for any language.
In my experience developing different types of applications in Rust, the claims of a "slow compiler" are overstated. Sub second iteration times are definitely a thing in Rust as well, unless you're adding a new dependency for the first time or building fresh.
Our experiences clearly differ, then. And for others as well, since it's a common complaint.
Countless times I have seen other people complain as well; there are even articles about it. I can't find the YouTube link now, but a gamedev recently abandoned Rust due to compilation speed alone, because iteration speed was paramount to their creative process.
Handwaving isn't going to make it any better. And thinking Go/TS compilation speed is comparable to Rust's is a handwave and a half, to say the least.
Cargo check and friends are subpar for AI, because efficient agentic loops actually need to run the thing and its unit tests.
A single loop might recompile and rerun the application/unit tests enough times that slow-compiling languages like Rust and Scala become detrimental.
I think you could have left it at differing experiences and not gone further saying I'm handwaving anything. That doesn't seem productive.
I'm not saying that Rust compilation time is comparable to Go/TS, I'm saying the blanket claim that Rust iteration speed will be a bottleneck requires context.
I definitely agree with you that it is a complaint that is often repeated online, but that doesn't make it universally true. In my experience it's a claim that is often echoed without proper context.
Particularly in the case of AI, Rust recompilation times in my experience have not been the dominant cost; they are overshadowed by inference time, the agent working through different approaches, etc.
The productivity increase I get by not having to worry so much about whether my Rust code will work once it compiles tends to net faster iteration speeds for me overall. Compile times have never bothered me.
I think a lower amount of training data for Haskell might be a reason.
> This is the largest cluster of bugs in the audit. It’s also the reason cp, mv, and rm are still GNU in Ubuntu 26.04 LTS. :(
This is what grinds my gears. Why all the hate against GNU?
Honestly, this is why I don't learn Rust, and why I didn't bother to read the rest of the article.
Rust does not hate GNU, and I'm not sure why anyone would have that misconception. It would be like saying that C hates GNU because the BSDs aren't GNU. The fact that there is less GNU-licensed Rust software than MIT-licensed Rust software is attributable to the simple fact that, in general, GNU has been ceding ground to MIT for more than 20 years.
Nor does the parent comment say that "Rust" hates GNU. A language can't hate anything for that matter.
The correct phrasing of the title: The bugs Rust won't catch.
Reversing max and min. That's one I've done a lot, and I don't think any compiler could save me from it.
Just use Fedora :)
All the cool kids are using Gentoo or Nix ;)
I enjoyed reading this.
I LOL'd when I read "eternal ball of sadness".
> uutils now runs the upstream GNU coreutils test suite against itself in CI. That’s the right scale of defense for this class of bug.
That's the minimum; it is absurd that they did not start from that!
Looks like they've been doing some kind of automated comparison against the GNU test suite since 2021 or so [0]?
[0]: https://github.com/uutils/coreutils-tracking/commits/main/?a...
I recall that the last time there was a massive bug in the uutils project, it was because the coreutils tests didn't cover some crucial aspect people relied on. Running these tests is useful for compatibility and all, but it won't necessarily catch security issues.
I believe they did it all the time. Maybe it was not automated? But they boasted multiple times in news posts about how many coreutils tests they were passing. I suspect those tests are useless for security; they are more about compatibility or something like that.
The title of this article should be "Rust can't stop you from not giving a fuck" or "Rust can't give a fuck for you."
---
> What’s notable is that all of these bugs landed in a production Rust codebase, written by people who knew what they were doing
...
[List of bugs a diligent person would be mindful of, unix expert or not]
---
The only conclusion I can draw is, unfortunately, that the people writing these tools are not good software developers, and certainly not sufficiently good for this line of work.
For comparison, I am neither a Unix neckbeard nor a Rust expert, but with the magic of LLMs I am using Rust to write a music player. The number of tokens I've sunk into watching for undesirable panics or dropped errors is pretty substantial. Why? Because I don't want my music player to suck! Simple as that. If you don't think about panics or errors, your software is going to be erratic, unpredictable, and confusing.
Now, coreutils isn't my hobby music player, it's fundamental Internet infrastructure! I hate sounding like a Breitbart commenter but it is quite shocking to see the lack of basic thought going into writing what is meant to be critical infrastructure. Wow, honestly pathetic. Sorry to be so negative and for this word choice, but "shock" and "disappointment" are mild terms here for me.
Anyway, thanks to the author of this post! This is a red flag that should be distributed far and wide.
> Pretty shocking to see the lack of basic thought going into writing what is meant to be critical infrastructure
uutils did not start off as "let's make critical infrastructure in Rust"; it started off as "coreutils are small and have tests, so we're rewriting them in Rust for fun". As a result, a bunch of cleanup work has been needed.
Okay, thanks for the context, but aren't distributions eager to adopt these? Are current GNU coreutils a common vulnerability vector?
> For fun
My idea of fun is reviewing my code and making sure I'm handling errors correctly so that my software doesn't suck. Maybe the people who are doing this for fun should be more aligned with that mentality?
No, this is only Ubuntu as far as I know because Canonical are idiots.
I love Rust, but I wonder if this is an example of the idea that its excellent type system can lull some people into a false sense of security. Particularly when interfacing to low-level code like kernel APIs, which are basically minefields inadvertently designed to trick the unwary, the Rust guarantees are undermined. The extent of this may not be immediately obvious to everyone.
This seems to be the case, yes. Before reading this post I was a lot more open minded about the "rewrite it in Rust" scene but now I'm just kind of in a horrorpit wondering whether I'll be stuck on macOS forever :(.
Creative but implausible excuse. macOS is a better OS for consumers than Windows, but if you're a developer or other technical person, nothing stops you from using Linux today.
Right but coming from macOS, how do I know that the Linux distro I pick doesn't have this god-forsaken stuff in it? Before this thread I didn't know Canonical was so... busted. What else do I not know? With macOS, I think I can be sure that this kind of stuff won't be in the core shell commands :).
When I do `man builtin` on macOS now, I get:
```
HISTORY
     The builtin manual page first appeared in FreeBSD 3.4.
```
which is what I expected, and I don't expect those to be pulled out from under me and replaced with the sort of nonsense we have here today.
I don't think that is the case. I think the people that wrote this are simply bad programmers. Some of these issues are so obvious that if you've been doing any amount of programming, you should be able to anticipate them, whether you're writing C, Rust, or Java.
So yeah, their implementation of chmod checked whether a path pointed to the root of the filesystem with `if file == Path::new("/")`.
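A literal path comparison like that misses spellings such as `/../` or `/etc/..`, as well as symlinks to `/`. A minimal sketch (mine, not uutils' actual fix) of the usual device/inode comparison instead:

```rust
use std::fs;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

// Compare st_dev/st_ino rather than the path itself, so "/", "/../",
// "/etc/.." and a symlink to the root all count as the root.
fn is_fs_root(path: &Path) -> std::io::Result<bool> {
    let target = fs::metadata(path)?; // follows symlinks, as chmod does
    let root = fs::metadata("/")?;
    Ok(target.dev() == root.dev() && target.ino() == root.ino())
}
```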
How the f** did this sub-amateur slop end up in a big-name Linux distribution? We've de-professionalized software engineering to such a degree that people no longer know what baseline-competent software looks like.
Rust promised you memory safety and delivered - but it turns out the filesystem doesn't care about your borrow checker, and these 44 CVEs are the receipt.
> Rust’s standard library makes this easy to get wrong. The ergonomic APIs you reach for first (fs::metadata, File::create, fs::remove_file, fs::set_permissions) all take a path and re-resolve it every time, rather than taking a file descriptor and operating relative to that. That’s fine for a normal program, but if you’re writing a privileged tool that needs to be secure against local attackers, you have to be careful.
It's not fine even for a normal program, because operations on a large number of files end up an order of magnitude slower, no matter what language you write your utility in.
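For illustration, a minimal sketch of the fd-relative pattern (using the raw `libc` crate on a Unix target; the function name and shape are mine, not a std or uutils API):

```rust
use std::ffi::CString;
use std::io;

// Open the directory once, then resolve each name relative to that fd
// with fstatat(2), instead of making the kernel re-walk the full path
// for every file (which is what fs::metadata(path) does on each call).
fn stat_many(dir: &str, names: &[&str]) -> io::Result<Vec<libc::stat>> {
    let cnames: Vec<CString> = names
        .iter()
        .map(|n| CString::new(*n))
        .collect::<Result<_, _>>()?;
    let cdir = CString::new(dir)?;
    let dirfd = unsafe { libc::open(cdir.as_ptr(), libc::O_RDONLY | libc::O_DIRECTORY) };
    if dirfd < 0 {
        return Err(io::Error::last_os_error());
    }
    let mut out = Vec::with_capacity(cnames.len());
    for cname in &cnames {
        let mut st: libc::stat = unsafe { std::mem::zeroed() };
        if unsafe { libc::fstatat(dirfd, cname.as_ptr(), &mut st, 0) } < 0 {
            let err = io::Error::last_os_error();
            unsafe { libc::close(dirfd) };
            return Err(err);
        }
        out.push(st);
    }
    unsafe { libc::close(dirfd) };
    Ok(out)
}
```

Each fstatat(2) here resolves a single name relative to the open directory handle, avoiding re-resolving the leading directories on every call, unlike the path-taking std APIs.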
... reads the article to the end, marvels at all the problems resulting from not understanding how the OS works and missing 40 years of refinement ...
Is this in an Ubuntu LTS?!?
Seems like the typical pattern of:
* Let's rewrite thing in X, it is better
* Let's not look at existing code, X is better so writing it from scratch will look nicer
* Whoops, existing code was written like this for a reason
* Whoops, we re-introduced decade-plus-old problems that the original had already fixed at some point
I call it FOTM (flavor-of-the-month) engineering. Let's throw everything out the window so we can use novel thing X!
> That means, even if the tools were (and probably still are) buggy, they never had a bug that could be exploited to read arbitrary memory.
Well, that raises the question: is it worse to read arbitrary memory (which would probably be prevented in most cases by various runtime protections [0] anyway), or to fail to prevent `rm -rf /./`, to kill every process in the system, etc.?
This is still a good case study of the value of the much-touted Rust rewrites. Usually they are performed by people who are domain experts in Rust but (as seen here) lack basic domain knowledge of the tool's environment.
[0] https://en.wikipedia.org/wiki/Buffer_overflow_protection
TIL that
> uutils read it as “send the default signal to PID -1”, which on Linux means every process you can see.
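To make the ambiguity concrete, here is an illustrative sketch (mine, not the actual uutils or GNU parsing code) of the rule GNU follows:

```rust
// In `kill -1 123`, the leading "-1" is a signal spec (1 = SIGHUP),
// never the PID -1. A negative PID (a process group, or the -1
// "everything I'm allowed to signal" broadcast from kill(2)) should
// only be accepted as an operand, e.g. after `--`.
fn parse_kill_args(args: &[String]) -> (libc::c_int, Vec<libc::pid_t>) {
    let mut sig: libc::c_int = libc::SIGTERM; // default signal
    let mut pids: Vec<libc::pid_t> = Vec::new();
    let mut operands_only = false;
    for (i, arg) in args.iter().enumerate() {
        if !operands_only && arg == "--" {
            operands_only = true;
        } else if !operands_only && i == 0 && arg.starts_with('-') {
            // Numeric signal spec; the real tools also accept names like -HUP.
            sig = arg[1..].parse().expect("bad signal spec");
        } else {
            pids.push(arg.parse().expect("bad pid"));
        }
    }
    (sig, pids)
}
```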
What's the use case for killing all processes you can see?
kill -SIGWINCH -1 will redraw all your windows.
Many cases, including as a last resort as part of shutdown, to try to trigger remaining services into a graceful exit (although these days cgroups help avoid ever being in such a situation).
I know nobody's perfect and I'm not asking for perfection, but these bugs are pretty alarming? It seems like these supposed coreutils replacements are being written by people who don't know anything about Unix, and also didn't even bother looking at the GNU tools they are trying to replace. Or at least didn't have any curiosity about why the GNU tools work the way they do. Otherwise they might've wondered about why things operate on bytes and file descriptors instead of strings and paths.
I hate to armchair general, but I clicked on this article expecting subtle race conditions or tricky ambiguous corners of the POSIX standard, and instead found that it seems to be amateur hour in uutils.
> It seems like these supposed coreutils replacements are being written by people who don't know anything about Unix, and also didn't even bother looking at the GNU tools they were supposed to be replacing.
They're a group of people who want to replace pro-user software (GPL) with pro-business software (MIT).
I don't really want them to achieve their goal.
They are deliberately not looking at coreutils code because the Rust versions are released as MIT and they don't want the project contaminated by GPL. I am not fond of this, personally.
A few things to note:
1. uutils as a project started back in 2013 as a way to learn Rust; it was by no means written by knowledgeable developers, or in a mature language, at that point
2. uutils wasn't even considered as a replacement for GNU Coreutils until... roughly 2021, I think? 2021 is when they started running compliance/compatibility tests, anyway
3. The choice of licensing (made in 2013) effectively forbids them from looking at the original source
I find it interesting how people will criticise Rust for not preventing all bugs, when the alternative languages prevent neither those same bugs nor the bugs Rust does catch. If you're comparing Rust to a perfect language that doesn't exist, you should probably also compare your alternative to that perfect language, right?
I'd be interested in a comparison of the number of bugs and CVEs in GNU coreutils at the start of its lifetime with this rewrite. Same with the number of memory bugs that are impossible in (safe) Rust.
Don't just downvote me, tell me how I'm wrong.
I don't think CVEs were a thing at the start of the GNU rewrite.
What's the point of a "rewrite in Rust" when it introduces bugs that either never existed in the original or were fixed already?
> I'd be interested in a comparison of the number of bugs and CVEs in GNU coreutils at the start of its lifetime
The point is, those bugs were discovered and fixed decades ago. Do you want to wait decades for coreutils_rs to reach the same robustness? Why do a rewrite when the alternative is to help improve the original, which starts from a much more solid base?
And even when a complete rewrite would make sense, why not do a careful line-by-line port of the original code, instead of a clean-room implementation, to at least carry over the bugfixes from the original? And why use the Rust stdlib at all when it contains footguns that are not acceptable for security-critical code?
Idk, you should ask the maintainers these questions, or the Ubuntu maintainers. I'm not particularly arguing in favour of this rewrite, but the title and contents of the post are talking about Rust in general and the type of bugs it can/can't prevent.
Perhaps one good reason is that, once the initial bugs are fixed, over time the number of security issues will be lower than in the original? If it could reach the same level of stability and robustness in months or a small number of years, the downsides aren't totally obvious. We will have to wait to judge, I suppose. Maybe it's not worth it, and that's fine, but it doesn't speak to Rust as a language.
The Rust developers have not read the original coreutils, because they want to replace the GPL license, so they want to be able to say that their code is not derived from the original coreutils.
For a project of this kind this seems a rather stupid choice, and it is enough to make the rewritten tools hard to trust.
Even supposing that replacing the GPL license were an acceptable goal, that would make sense only for a library, not for executable applications. For executable applications, it makes sense to avoid the GPL only when you want to extract parts of them and insert them into other programs.
> For executable applications it makes sense to not want GPL only when you want to extract parts of them and insert them into other programs.
It is very common for applications written in Rust to be split in multiple reusable crates. Looking at the main crate, that is the case here too: https://crates.io/crates/coreutils/0.8.0/dependencies
This allows the learnings of uutils (and, by extension, GNU coreutils) to be leveraged by any other project that needs the same functionality. On a quick scan of the dependents of uucore, I noticed that other projects (like nushell) do so.
> What's the point of a "rewrite in Rust" when it introduces bugs that either never existed in the original or were fixed already?
Because you are trying to remove memory safety as a source of bugs in the future. No code is bug free, but removing entire categories of bugs from a code base is a good thing.
You’re right, but it’s gonna be hard to stop them from raging. In many ways people want to be justified in a "see, I told you so, Rust is useless" belief, and they’re willing to take one or two questionable logical steps to get there.
"The alternative languages" - in this case you're talking about C, 99% of the time.
So let's talk about that. Well-written C code, especially for the purpose of writing and continuing to maintain the mature GNU coreutils, is not a big CVE risk. Between an inexperienced Rust developer and an extremely experienced C developer (who's been through all the motions), I'd say the latter is likely the safer option.
100% it's the safer option.
The software with the best security track record of all time is written in C.
I would maybe not go that far; look at Ada, SPARK, etc.
I'm curious which software you have in mind. Ex: seL4 is technically C, but I'd say the theorem prover is doing most of the real work there.
Specifically? I'm thinking of qmail.
qmail was at one point the second most widely deployed email server, handling a majority of online mail. It wasn't a research project, and it's not obscure; Yahoo used to use it.
And what I mean by track record: more than a decade after the last published version, a theoretical attack was found, one requiring a special setup uncommon for a sysadmin, and impossible ten years prior.
When anyone thinks about how to build reliable secure software, I think they should be thinking of qmail because it really has no public source-available equal, except maybe djbdns.
seL4, on the other hand, makes some specious claims about a ten-year-old version of itself, and so few people have even heard of it that you thought it important to note it is "technically" C -- qmail isn't like that at all: there is no prover, no test suite, and almost no metaprogramming of any kind. It's just C.
I'd recognize sarcasm when I see it. But statistically, that could be true, considering the amount of C code running (probably far less than COBOL or FORTRAN) compared to the relatively small amount of Rust code, versus the number of faults observed with each.
The software with the worst security track record of all time is also written in C.
What an incredibly dishonest argument. Obviously "well-written C code" won't be riddled with CVEs, by definition. The problem is that programs written in C are littered with CVEs, so it turns out it's really, really difficult to write well-written C, even for the best developers. With Rust, that entire class of problems is eliminated.
> "The alternative languages" - in this case you're talking about C, 99% of the time.
And that's part of the problem. There's no excuse, beyond maybe platform support, for starting a brand-new project in C when C++ exists.
TL;DR: Rust can't catch logic bugs
This is what happens when many people hype a technology that solves a specific class of vulnerabilities but is not designed to prevent others, such as logic errors caused by human (or AI) mistakes.
Granted, the uutils authors are experienced in Rust, but that is not enough for a large-scale rewrite like this, and you can't assume the result is "secure" just because of memory safety.
In this case, the post tells us that Unix itself has thousands of gotchas and that re-implementing the coreutils in Rust is not a silver bullet. Even the bugs that Unix (and the POSIX standard) has are part of the specification, and can later be revealed as real-world vulnerabilities.
> the uutils authors are well experienced in Rust
I'm not sure that they were all that experienced in Rust when most of this code was written. uutils has been a bit of a "good first Rust issue" playground for a lot of its existence.
Which makes it pretty unsurprising that the authors also weren't all that well-versed in the details of the low-level POSIX APIs.
It's not designed to completely eliminate other bug classes but it is designed to reduce the chance that they happen.
In this case the filesystem API was perhaps not as well designed as it could have been. That can potentially be fixed though.
Some of the other bugs would be hard to statically prevent though. But nobody ever claimed otherwise.
I feel like one of the takeaways here is that Rust protects your code as long as what your code is doing stays predictably in-process. Touching the filesystem is always rife with runtime failures that your programming language just can't protect you from. (Or maybe it also suggests the `std::fs` API needs to be reworked to make some of these occurrences, if not impossible, at least harder.)
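As a sketch of how even the current std API can close the simplest check-then-use gap (my own example, assuming a Unix target and a file the caller may open for reading):

```rust
use std::fs::{File, Permissions};
use std::io;
use std::os::unix::fs::{OpenOptionsExt, PermissionsExt};
use std::path::Path;

// Open once (refusing a symlink at the final component), then operate
// on the fd: File::set_permissions uses fchmod(2), so it affects the
// file we actually opened, not whatever the path resolves to later.
fn chmod_no_follow(path: &Path, mode: u32) -> io::Result<()> {
    let f = File::options()
        .read(true) // assumes read permission on the target
        .custom_flags(libc::O_NOFOLLOW)
        .open(path)?;
    f.set_permissions(Permissions::from_mode(mode))
}
```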
On a separate note: I have a private "coretools" reimplementation in Zig (not aiming to replace anything, just for fun), and I'm striving to keep it 100% Zig with no libc calls anywhere. Which may or may not turn out to be possible, we'll see. However, cross-checking uutils I noticed it does have a bunch of unsafe blocks that call into libc, e.g. https://github.com/uutils/coreutils/blob/77302dbc87bcc7caf87.... Thankfully they're pretty minimal, but every such block can reduce the safety provided by a Rust rewrite.
> and I'm striving to keep it 100% Zig with no libc calls anywhere. Which may or may not turn out to be possible, we'll see.
Probably will depend on what platform(s) you're targeting and/or your appetite for dealing with breakage. You can avoid libc on Linux due to its stable syscall interface, but that's not necessarily an option on other platforms. macOS, for instance, can and does break syscall compatibility and requires you to go through libSystem instead. Go got bit by this [0]. I want to say something similar applies to Windows as well.
This Unix StackExchange answer [1] says that quite a few other kernels don't promise syscall compatibility either, though you might be able to somewhat get away with it in practice for some of them.
[0]: https://github.com/golang/go/issues/17490
[1]: https://unix.stackexchange.com/a/760657
Since it's a personal project, Linux compatibility is the only thing I care about right now. I'm testing it under WINE as well, just because I can, but I don't have access to macOS, so I'm skipping that problem entirely for now.