asveikau 1 day ago

The things you can do between fork and exec are sometimes underestimated. Off the top of my head, you can call dup2(), you can set a process group id, probably a few other things.
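A minimal sketch of that window, assuming a hypothetical helper named run_redirected (not from any library): the child sets its own process group, points stdout at a file with dup2(), and then execs. The helper name, the output path parameter, and the exit codes 126/127 are illustrative choices only.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with stdout redirected to out_path,
 * in its own process group. Returns the child's exit status, or -1. */
int run_redirected(const char *out_path, char *const argv[])
{
    assert(out_path != NULL && argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the window between fork and exec. */
        if (setpgid(0, 0) < 0)              /* new process group */
            _exit(126);
        int fd = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            _exit(126);
        if (dup2(fd, STDOUT_FILENO) < 0)    /* redirect stdout */
            _exit(126);
        if (fd != STDOUT_FILENO)
            close(fd);
        execv(argv[0], argv);
        _exit(127);                         /* only reached if exec failed */
    }

    /* Parent: wait for the child and report how it exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with an argv of { "/bin/sh", "-c", "echo hello", NULL } should leave the command's output in the file rather than on the terminal.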

Contrast that with win32, where you optionally pack a bunch of initial values into a struct: it is a much narrower, less pleasant, less freeform interface, and one where it is harder to introduce more features.

But I think there is already posix_spawn to imitate that philosophy on Unix-like OSs.

dcrazy 1 day ago

posix_spawn is emulated in userspace by libc on Linux, but it is a native syscall on macOS (and possibly other OSes?). As discussed in the linked article, there is interest in changing Linux to adopt this model, where posix_spawn is its own fundamental primitive.
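For comparison, here is a rough sketch of the same kind of setup (redirect stdout, new process group) expressed through posix_spawn's file actions and attributes instead of code running in a forked child. The helper name spawn_redirected is hypothetical; the posix_spawn* calls are the standard POSIX ones.

```c
#include <assert.h>
#include <fcntl.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: spawn argv[0] with stdout opened on out_path,
 * in its own process group. Returns the child's exit status, or -1. */
int spawn_redirected(const char *out_path, char *const argv[])
{
    assert(out_path != NULL && argv != NULL && argv[0] != NULL);

    posix_spawn_file_actions_t fa;
    posix_spawnattr_t attr;
    pid_t pid;
    int rc = -1;

    if (posix_spawn_file_actions_init(&fa) != 0)
        return -1;
    if (posix_spawnattr_init(&attr) != 0) {
        posix_spawn_file_actions_destroy(&fa);
        return -1;
    }

    /* Everything the child should do is declared up front as data. */
    if (posix_spawn_file_actions_addopen(&fa, STDOUT_FILENO, out_path,
                                         O_WRONLY | O_CREAT | O_TRUNC,
                                         0644) == 0 &&
        posix_spawnattr_setpgroup(&attr, 0) == 0 &&
        posix_spawnattr_setflags(&attr, POSIX_SPAWN_SETPGROUP) == 0 &&
        posix_spawn(&pid, argv[0], &fa, &attr, argv, environ) == 0) {
        int status;
        if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
            rc = WEXITSTATUS(status);
    }

    posix_spawnattr_destroy(&attr);
    posix_spawn_file_actions_destroy(&fa);
    return rc;
}
```

The trade-off is visible in the shape of the code: anything not covered by a file action or an attribute flag has no place to go, whereas the fork/exec version can run arbitrary code in the child.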

  • asveikau 1 day ago

    Yeah, I think it is a reasonable transition path or implementation detail for some systems to implement it in userland atop fork(2), and others to natively spawn a new process without copying the old address space.

loeg 1 day ago

> The things you can do between fork and exec are sometimes underestimated. Off the top of my head, you can call dup2(), you can set a process group id, probably a few other things.

What do you mean underestimated? You can do anything between fork and exec; there are no limitations.

  • dcrazy 1 day ago

    That’s not true. man 7 signal-safety

    • loeg 22 hours ago

      You're talking about libc design choices, not constraints imposed by the kernel. To the kernel, a post-fork pre-exec process is just any old process. GP was suggesting post-fork processes were constrained in the syscalls they could invoke; they are not.

      • asveikau 21 hours ago

        I did not say they are constrained in what syscalls they can make, as if some nanny at the syscall entry point will punish you for doing wrong. I said that it interacts poorly with threads due to inherent race conditions. See the other comment.

        • loeg 21 hours ago

          > I said that it interacts poorly with threads due to inherent race conditions.

          No, you absolutely did not: https://news.ycombinator.com/item?id=48427396

          Literally nothing in that comment mentions or discusses threads.

          > I did not say they are constrained in what syscalls they can make

          You wrote: "The things you can do between fork and exec are sometimes underestimated. Off the top of my head, you can call dup2(), you can set a process group id, probably a few other things."

          Those are all syscalls. You can also invoke any of the other ~hundreds of syscalls Linux exposes, not only dup2, setpgid, and a "few" others.

  • asveikau 1 day ago

    That's not true. Just one example: if you do anything with threads, you are pretty screwed, for instance if another thread holds a mutex at the time of fork(2) and you also want that mutex.

    • loeg 22 hours ago

      You can create threads in forked children before exec. Nothing in the kernel prevents you from invoking clone().

      You're talking about libc (glibc) implementation details now; userspace programs running on the Linux kernel do not have to be implemented in C or use glibc's primitives. Your earlier comment I initially replied to was talking about kernel syscalls. Forked processes are free to invoke any syscall they want, not just dup2 or a handful of others.

      • asveikau 22 hours ago

        I'm not talking about glibc implementation details. I'm talking about how mixing fork(2) with threads creates harmful race conditions.

        The forked child has only 1 thread in its process. If the parent's threads are holding a lock or are in the middle of mutating a shared data structure, you're fucked, because those threads are no longer running in your child's copy of the address space and will not finish their work. This issue is fundamental to how threads work and what fork(2) does.

        • loeg 21 hours ago

          Again, you're talking about userspace now. Not kernel-imposed constraints. A userspace program is always free to deadlock itself; fork doesn't change that.

          • asveikau 21 hours ago

            I never said it was a kernel-imposed constraint. It remains unsafe behavior, and frankly you'd be stupid to ignore it if you want to write a stable multithreaded program. In colloquial shorthand, you can't do it.

            Signal safety is not the same as this, but it is similar. I believe POSIX defines what is signal-unsafe too broadly. But the unsafety isn't an illusion: it's an emergent property of the primitives at work, and there are broad categories of bugs that are easy to introduce because of the way they work. So for signals, POSIX declares a bunch of ill-advised things to be undefined, and with good reason. This is an analogous scenario.

          • asveikau 20 hours ago

            Just want to come back with a simple example.

            If the program is multithreaded, you cannot rely on calling malloc in the child, because at the time of the fork another thread could have been inside malloc, manipulating the global heap.

            Which means, practically speaking, "don't allocate memory between fork and exec".

            If you want to be overly literal as you have been, you can call mmap and it will give you new pages, but who is really doing that? Not the random shared library code you might want to call into. Hell, even a lot of libc calls malloc.

            Which means it's not safe to do a random library call between fork and exec.

            See where I'm going with this? That's if your program is multithreaded. If it isn't, these things are most likely fine.
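The lock hazard discussed in this subthread can be made visible with a small sketch. It uses an ordinary pthread mutex as a stand-in for any lock a library might hold internally (whether a given libc's malloc is actually affected depends on that libc, so that part is an assumption). A second thread holds the mutex while the main thread forks; the child then finds the mutex locked with no thread left to unlock it. The function name child_sees_lock_held and the pipe-based handshake are illustrative choices.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int ready_pipe[2];    /* holder -> main: "I have the lock" */
static int release_pipe[2];  /* main -> holder: "you may unlock"  */

static void *holder(void *arg)
{
    char c = 'x';
    (void)arg;
    pthread_mutex_lock(&lock);
    if (write(ready_pipe[1], &c, 1) == 1)
        (void)!read(release_pipe[0], &c, 1);  /* hold until released */
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Returns 1 if the forked child found the mutex locked, 0 if it did
 * not, and -1 on setup failure. */
int child_sees_lock_held(void)
{
    pthread_t t;
    char c = 'x';
    int status, result = -1;

    if (pipe(ready_pipe) < 0 || pipe(release_pipe) < 0)
        return -1;
    if (pthread_create(&t, NULL, holder, NULL) != 0)
        return -1;
    if (read(ready_pipe[0], &c, 1) != 1)  /* wait until lock is held */
        return -1;

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: only this thread exists here. The holder thread was
         * not copied, so a blocking lock would never return. trylock
         * is used so the sketch terminates. */
        _exit(pthread_mutex_trylock(&lock) == EBUSY ? 1 : 0);
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = WEXITSTATUS(status);

    (void)!write(release_pipe[1], &c, 1);  /* let the holder finish */
    pthread_join(t, NULL);
    close(ready_pipe[0]);
    close(ready_pipe[1]);
    close(release_pipe[0]);
    close(release_pipe[1]);
    return result;
}
```

Build with -pthread where the toolchain requires it. Replacing the trylock with a plain pthread_mutex_lock in the child would hang that child indefinitely, which is the failure mode being described.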