Way too many ways to wait on a child process with a timeout

(gaultier.github.io)

111 points | by broken_broken_ 4 days ago

15 comments

  • adrianmonk 1 day ago
    Tenth Approach: fork() two processes.

    Child 1 exec()s the command.

    Child 2 does this:

        signal(SIGALRM, alarm_handler);
        alarm(timeout_length);
        pause();
        exit(0);
    
    Start both children, then call wait(), which blocks until any child exits and returns the pid of the child that exited. If it's the command child, then your command finished. If it's the other child, then the timeout expired.

    Now that one child has exited, kill() the other child with SIGTERM and reap it by calling wait() again.

    All of this assumes you'll only have these two children going, but if you're writing a small exponential backoff command retry utility, that should be OK.
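
    A minimal sketch of the two-children approach described above, assuming the caller has no other children. The names `run_with_timeout` and `alarm_handler` are illustrative, and the handler body is an assumption (the comment above does not show it); error handling is kept deliberately thin.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumed handler: its only job is to make pause() return. */
static void alarm_handler(int sig) { (void)sig; }

/* Runs argv with a timeout. Returns 0 if the command child exited first,
 * 1 if the timer child exited first, -1 on error. */
static int run_with_timeout(char *const argv[], unsigned timeout_length)
{
    assert(argv != NULL && argv[0] != NULL);
    assert(timeout_length > 0);         /* alarm(0) would never fire */

    pid_t cmd = fork();                 /* child 1: the command */
    if (cmd == -1)
        return -1;
    if (cmd == 0) {
        execvp(argv[0], argv);
        _exit(127);                     /* exec failed */
    }

    pid_t timer = fork();               /* child 2: the timer */
    if (timer == -1) {
        kill(cmd, SIGTERM);
        waitpid(cmd, NULL, 0);
        return -1;
    }
    if (timer == 0) {
        signal(SIGALRM, alarm_handler);
        alarm(timeout_length);
        pause();
        _exit(0);
    }

    pid_t first;
    do {
        first = wait(NULL);             /* blocks until either child exits */
    } while (first == -1 && errno == EINTR);
    if (first == -1)
        return -1;

    pid_t other = (first == cmd) ? timer : cmd;
    kill(other, SIGTERM);               /* stop the loser... */
    waitpid(other, NULL, 0);            /* ...and reap it */
    return first == cmd ? 0 : 1;
}
```

    Calling `run_with_timeout(argv + 1, 5)` from `main` would run the rest of the command line with a 5 second limit. Note that a command that ignores SIGTERM would make the final `waitpid()` block.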

    • GoblinSlayer 2 hours ago
      Shouldn't wait() support SIGALRM directly with EINTR?
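
      For reference, a sketch of what that could look like, assuming the handler is installed with sigaction() and without SA_RESTART so that a blocked waitpid() fails with EINTR when the alarm fires. `wait_with_alarm` is an illustrative name. The sketch has a known weakness: if SIGALRM is delivered before waitpid() is entered, the wait is not interrupted.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static void on_alarm(int sig) { (void)sig; }

/* Returns 0 if the child exited in time, 1 if the alarm fired and the child
 * was killed, -1 on error. The raw wait status is stored in *status. */
static int wait_with_alarm(pid_t child, unsigned seconds, int *status)
{
    assert(seconds > 0 && status != NULL);

    struct sigaction sa = {0};
    sa.sa_handler = on_alarm;           /* sa_flags stays 0: no SA_RESTART */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) == -1)
        return -1;

    alarm(seconds);
    pid_t r = waitpid(child, status, 0);
    int saved_errno = errno;
    alarm(0);                           /* cancel a still-pending alarm */

    if (r == child)
        return 0;
    if (r == -1 && saved_errno == EINTR) {
        /* Assumes SIGALRM was the interrupting signal. */
        kill(child, SIGKILL);
        waitpid(child, status, 0);
        return 1;
    }
    return -1;
}
```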
  • machine_coffee 4 days ago
    Lol, the author's thought process mirrored mine as I read the article. As I was reading I was thinking, 'doesn't kqueue support that?'... and then there was a section on kqueue. Then I was thinking to myself, 'so how does the Linux implementation do it, then?'... and I was just about to start trawling the source code when: 'A parenthesis..'

    Great article. Sorry to say though, Windows does manage all this in a more consistent way - but I guess they had the benefit of a clean slate.

    • silon42 3 days ago
      signalfd / process descriptors are the Windows-style mechanism... what is missing are a few things like a 'spawn' that returns an fd directly (eliminating races...)
      • blibble 3 days ago
        there is no race from the parent

        the pid will not be reused until you either handle sigchld or wait

  • AnotherGoodName 1 day ago
    In the early days of android i had an app that had to do video transcoding yet often hit oom on startup (reported via telemetry) even when the phone should have enough memory. This was before android had any video transcoding built in (2.3 days).

    The solution was to spawn a child process that uses memory in a loop, detect the child's SIGKILL in the parent, and sleep to yield to the OS while it killed other processes to free memory on the device as a whole. On return from sleep in the parent, after the child had been killed, the app started the video transcoding.

    Hopefully this hack is not needed but if you want android to proactively run its process killing job so your app starts with maximum free memory the above worked!

    • franga2000 1 day ago
      From what I've heard, this is also how most of the "memory cleaner" apps work on most platforms - use memory in a loop so the system starts dropping various caches and housekeeping tasks and swapping background processes, then exit so the memory is reclaimed.
  • greggyb 1 day ago
    Not so much about timeouts, but related in that it is based around managing child processes:

    The lineage of tools descending from daemontools for service management is worth exploring:

    daemontools: http://cr.yp.to/daemontools.html

    runit: https://smarden.org/runit/

    s6: https://skarnet.org/software/s6/

    dinit: https://davmac.org/projects/dinit/

  • nf3 4 days ago
    FWIW io_uring does have support for waitid.

    https://www.man7.org/linux/man-pages/man3/io_uring_prep_wait...

    • EdSchouten 1 day ago
      An interesting aspect of waitid is that it allows you to access the full exit code of the process (i.e., the entire int instead of just the bottom 8 bits).

      Unfortunately, many operating systems implement waitid() on top of one of the older APIs, meaning the top bits get lost regardless…
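
      A small sketch of reading the exit code through waitid(), assuming a POSIX system. `exit_status_via_waitid` is an illustrative helper that forks a child exiting with a given code and reports what `si_status` contains. Whether more than the bottom 8 bits survive depends on the operating system, so the sketch makes no claim about the upper bits.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that calls _exit(code), collects it with waitid() and stores
 * the reported si_status in *reported. Returns 0 on success, -1 on error. */
static int exit_status_via_waitid(int code, int *reported)
{
    assert(reported != NULL);

    pid_t child = fork();
    if (child == -1)
        return -1;
    if (child == 0)
        _exit(code);

    siginfo_t info = {0};
    if (waitid(P_PID, (id_t)child, &info, WEXITED) == -1)
        return -1;
    if (info.si_code != CLD_EXITED)     /* killed by a signal instead */
        return -1;

    *reported = info.si_status;         /* an int, not a packed wait status */
    return 0;
}
```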

    • broken_broken_ 4 days ago
      Many thanks! I have added it to the article in due form now.
  • cdaringe 16 hours ago
    I wrote a crate https://crates.io/crates/swaperooni for similar use cases some time ago. I only gave the article a cursory scan, and can clearly see much deeper thought given here. Can't wait to dig in after work and learn a little bit.

    Dunking on my crate is welcomed :)

  • nasretdinov 4 days ago
    So many ways and no-one mentioned threads..?

    Edit: by threads I mean creating a new thread to wait for the process, and then kill the process after a certain timeout if the process hasn't terminated. I guess I'm spoiled by Go...

    • zbentley 3 days ago
      The threading approach is roughly:

      1. Start a thread

      2. That thread starts a child process and signals "started" by storing its PID somewhere globally-visible (and hopefully atomic/lock-protected).

      3. The thread then blocks in wait(2), taking advantage of its non-main-thread-ness to avoid some signals and optionally masking/ignoring some more.

      4. When the process exits, the thread can write exitstatus/"completed" to the globally-visible state next to PID. The thread then exits.

      5. External observers wait for the process with a timeout by attempting to join the thread with a timeout. If the timeout occurs, they can access the globally-visible PID and send a signal to it.

      This is missing from the article (EDIT: it has since been added, thanks!). That doesn't mean it's a good solution on many platforms. It's more costly in resources (thread stack), more code than most of the listed options, vulnerable to PID-reuse problems that can cause a kill signal to go to the wrong process, likely plays poorly with spawning methods that request a SIGCHLD be sent to the parent on exit (and plays poorly with signals in general if any customization is needed there), and is probably often slower than most of TFA's alternatives as well, both due to syscall count and pessimal thread/scheduler switching conditions. Additionally, it multiplexes/composes to large numbers of processes poorly and with a high resource cost.
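
      The numbered steps above can be sketched with pthreads roughly as follows. This is an illustration under assumptions, not the article's code: POSIX has no timed join, so a condition variable with pthread_cond_timedwait() stands in for "join with a timeout", and all names (`child_state`, `waiter_thread`, `run_in_thread_with_timeout`) are made up for the example. The PID-reuse window mentioned above is still present and is marked in a comment.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Globally visible state shared by the waiter thread and its observers. */
struct child_state {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    char *const    *argv;
    pid_t           pid;     /* 0 until published, -1 if fork() failed */
    int             done;    /* set once the thread has finished waiting */
    int             status;  /* raw wait status, valid when done != 0 */
};

static void *waiter_thread(void *arg)
{
    struct child_state *st = arg;
    int status = 0;

    pid_t pid = fork();                       /* step 2: start the child */
    if (pid == 0) {
        execvp(st->argv[0], st->argv);
        _exit(127);                           /* exec failed */
    }
    pthread_mutex_lock(&st->lock);
    st->pid = pid;                            /* step 2: publish the PID */
    pthread_cond_broadcast(&st->changed);
    pthread_mutex_unlock(&st->lock);

    if (pid > 0)                              /* step 3: block in waitpid */
        while (waitpid(pid, &status, 0) == -1 && errno == EINTR)
            ;

    pthread_mutex_lock(&st->lock);
    st->status = status;                      /* step 4: publish the result */
    st->done = 1;
    pthread_cond_broadcast(&st->changed);
    pthread_mutex_unlock(&st->lock);
    return NULL;
}

/* Step 5. Returns 0 if the child finished in time, 1 if it was killed after
 * the timeout, -1 on error. */
static int run_in_thread_with_timeout(char *const argv[], int timeout_sec,
                                      int *status)
{
    assert(argv != NULL && argv[0] != NULL && status != NULL);

    struct child_state st = { .argv = argv };
    pthread_mutex_init(&st.lock, NULL);
    pthread_cond_init(&st.changed, NULL);

    pthread_t tid;
    if (pthread_create(&tid, NULL, waiter_thread, &st) != 0)
        return -1;

    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_sec;

    int rc = 0, killed = 0;
    pthread_mutex_lock(&st.lock);
    while (!st.done && rc == 0)
        rc = pthread_cond_timedwait(&st.changed, &st.lock, &deadline);
    if (!st.done) {
        while (st.pid == 0)                   /* PID not published yet */
            pthread_cond_wait(&st.changed, &st.lock);
        /* PID-reuse window: the thread may already have reaped the child
         * but not yet set `done`. */
        if (st.pid > 0 && !st.done) {
            kill(st.pid, SIGKILL);
            killed = 1;
        }
    }
    pthread_mutex_unlock(&st.lock);

    pthread_join(tid, NULL);
    *status = st.status;
    int failed = st.pid < 0;
    pthread_mutex_destroy(&st.lock);
    pthread_cond_destroy(&st.changed);
    return failed ? -1 : killed;
}
```

      On older toolchains this needs to be built with `-pthread`.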

      EDIT: Golang's version of this is less bad than described above, but not perfect. Go's spawning infrastructure mitigates resource cost (goroutines/segmented stacks are not as heavy as threads), is vulnerable to PID-reuse (as are most platforms' operations in this area), addresses the SIGCHLD risk through the runtime and signal channels, and mitigates slowness with a very good scheduler. For multiplexing, I would assume (but I have not verified) that the Go runtime is internally using pidfds/kqueue where supported. Where not supported, I would assume Go is internally tracking spawn requests through its stdlib, handling SIGCHLD, and has a single global routine calling wait(2) without a specific PID, waking goroutines waiting on a watched PID when it comes out of the call to wait(2).

      • nasretdinov 3 days ago
        Thanks. I believe that Go indeed _could_ use those APIs to wait for the child more efficiently if they chose to, but the current implementation suggests that they're just calling wait4() in a separate thread: https://cs.opensource.google/go/go/+/refs/tags/go1.23.3:src/...

        To be fair, in Go process spawning is very inefficient to begin with, since it requires lots of runtime coordination to not mess with the threads/goroutines state during fork, so running wait4() in a separate thread (although the thread can be re-used afterwards) is not the biggest concern here.

      • broken_broken_ 3 days ago
        Thanks for the suggestion, I have added a short section about threads.
  • xchip 4 days ago
    Thanks for this great article, it is going to be very useful for my project. I am currently developing an open source Android native app that invokes rsync when a file gets closed (ie: you take a picture)

    https://github.com/aguaviva/Syncy

  • akira2501 1 day ago
    > I would prefer extending poll to support things other than file descriptors, instead of converting everything a file descriptor to be able to use poll.

    Why? The ability to block on these descriptors as a one off rather than wrapping into a poll makes them extremely useful and avoids the race issues that exist with signal handlers and other non-blocking mechanisms.

    signalfd, timerfd, eventfd, userfaultfd, pidfd are all great applications of this strategy.

    • gpderetta 1 day ago
      One issue is that fds are a relatively limited and relatively heavyweight resource.
      • akira2501 1 day ago
        They used to be.

            $ cat /proc/sys/fs/file-max
            3235242
        
        And `nr_open` has been 1048576 for a long time. The default quota limit is 1024 but that is easy to change.
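
        As a sketch of "easy to change" from inside a program, assuming Linux semantics for RLIMIT_NOFILE: a process may raise its own soft limit up to the hard limit without privileges. `raise_fd_limit` is an illustrative name, not an existing API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/resource.h>

/* Raises the soft RLIMIT_NOFILE to the current hard limit.
 * Returns the new soft limit, or 0 on error. */
static rlim_t raise_fd_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) == -1)
        return 0;
    assert(rl.rlim_cur <= rl.rlim_max);

    rl.rlim_cur = rl.rlim_max;          /* soft limit := hard limit */
    if (setrlimit(RLIMIT_NOFILE, &rl) == -1)
        return 0;
    return rl.rlim_cur;
}
```

        Raising the hard limit itself is a separate matter and usually requires privileges or a configuration change outside the process.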
  • moron123 2 days ago
    Parenting 101
  • JackSlateur 3 days ago
    What is the meaning of this code?

      void on_sigchld(int sig) { (void)sig; }
    • naruhodo 3 days ago
      If it's C code, that is the way to suppress a compiler warning about sig being unused. In C++ you can omit (or comment-out) the parameter name, e.g.:

          // C++
          void on_sigchld(int /*sig*/) {}
  • tlsalmin 9 hours ago
    First nitpick:

       static int pipe_fd[2] = {0};
    
    0 is a valid fd, so I recommend initializing fds to -1.

    signalfd was only mentioned offhand, but for writing anything larger, like let's say a daemon process, it keeps things close to all the other events being reacted to. E.g.

      #include <signal.h>
      #include <unistd.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <sys/timerfd.h>
      #include <sys/signalfd.h>
      #include <sys/epoll.h>
    
      static int signalfd_init(void)
        {
          sigset_t sigs, oldsigs;
          int sfd = -1;
    
          sigemptyset(&sigs);
          sigemptyset(&oldsigs);
          sigaddset(&sigs, SIGCHLD);
          if (!sigprocmask(SIG_BLOCK, &sigs, &oldsigs))
            {
              sfd = signalfd(-1, &sigs, SFD_CLOEXEC | SFD_NONBLOCK);
              if (sfd != -1)
                {
                  // Success
                  return sfd;
                }
              else
                {
                  perror("signalfd");
                }
              sigprocmask(SIG_SETMASK, &oldsigs, NULL);
            }
          else
            {
              perror("sigprocmask");
            }
          return -1;
        }
    
      static int timerfd_init(void)
        {
          int tfd = timerfd_create(CLOCK_MONOTONIC, TFD_NONBLOCK | TFD_CLOEXEC);
    
          if (tfd != -1)
            {
              struct itimerspec tv =
                {
                  .it_value = 
                    {
                      .tv_sec = 5
                    }
                };
              if (!timerfd_settime(tfd, 0, &tv, NULL))
                {
                  return tfd;
                }
              else
                {
                  perror("timerfd_settime");
                }
              close(tfd);
            }
          else
            {
              perror("timerfd_create");
            }
          return -1;
        }
    
      static int epoll_init(int sfd, int tfd)
        {
          int efd;
    
          if (sfd == -1 || tfd == -1)
            {
              return -1;
            }
    
          efd = epoll_create1(EPOLL_CLOEXEC);
          if (efd != -1)
            {
              struct epoll_event ev[2] =
                {
                    {
                      .events = EPOLLIN,
                      .data =
                        {
                          .fd = sfd,
                        }
                    },
                    {
                      .events = EPOLLIN,
                      .data = 
                        {
                          .fd = tfd
                        }
                    }
                };
              if (!epoll_ctl(efd, EPOLL_CTL_ADD, sfd, &ev[0]) &&
                  !epoll_ctl(efd, EPOLL_CTL_ADD, tfd, &ev[1]))
                {
                  return efd;
                }
              else
                {
                  perror("epoll_ctl");
                }
              close(efd);
            }
          else
            {
              perror("epoll_create1");
            }
          return -1;
        }
    
      int main(int argc, char *argv[])
        {
          int exit_value = EXIT_FAILURE;
          int sfd = signalfd_init(),
              tfd = timerfd_init(),
              efd = epoll_init(sfd, tfd);
    
          if (sfd != -1 && tfd != -1 && efd != -1)
            {
              int child_pid = fork();
    
              if (child_pid != -1)
                {
                  if (!child_pid)
                    {
                      argv += 1;
                      if (-1 == execvp(argv[0], argv)) {
                          exit(EXIT_FAILURE);
                      }
                      __builtin_unreachable();
                    }
                  else
                    {
                      int err;
                      struct epoll_event ev;
    
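                      while ((err = epoll_wait(efd, &ev, 1, -1)) > 0)
                        {
                          if (ev.data.fd == sfd)
                            {
                              // Read the signalfd for the possible SIGCHLD and
                              // stop waiting: the child has changed state.
                              struct signalfd_siginfo si;

                              if (read(sfd, &si, sizeof(si)) == sizeof(si))
                                {
                                  exit_value = EXIT_SUCCESS;
                                }
                              break;
                            }
                          else if (ev.data.fd == tfd)
                            {
                              // Timer triggered, kill the child process.
                              kill(child_pid, SIGKILL);
                              break;
                            }
                        }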
                      if (err == -1)
                        {
                          perror("epoll_wait");
                        }
                    }
                }
              else
                {
                  perror("fork");
                }
            }
          close(sfd);
          close(tfd);
          close(efd);
          exit(exit_value);
        }
  • o11c 1 day ago
    > Because the Linux kernel coalesces SIGCHLD (and other signals), the only way to reliably determine if a monitored process has exited, is to loop through all PIDs registered by any kqueue when we receive a SIGCHLD. This involves many calls to waitid(2) and may have a negative performance impact.

    This is somewhat wrong. To speed things up in the happy case (where we are the only part of the program that is spawning children), you can just do a `WNOHANG` wait for any child first, and check if it's one of the children we care about. Only if it's an unknown child do you have to do the full loop (of course, if you only have a couple of children the loop may be better).
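
    A rough sketch of that fast path, under assumptions that go beyond the comment above: it uses waitid() with WNOWAIT to peek at the next exited child, so that a child belonging to some other part of the program is left unreaped, and the `watched` table with `watch_child()` is a hypothetical registry invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_WATCHED 16

/* Hypothetical registry of the children this component cares about. */
struct watched { pid_t pid; int exited; int status; };
static struct watched watched[MAX_WATCHED];
static size_t n_watched;

static int watch_child(pid_t pid)
{
    assert(pid > 0);
    if (n_watched == MAX_WATCHED)
        return -1;
    watched[n_watched++] = (struct watched){ .pid = pid };
    return 0;
}

static struct watched *find_watched(pid_t pid)
{
    for (size_t i = 0; i < n_watched; i++)
        if (!watched[i].exited && watched[i].pid == pid)
            return &watched[i];
    return NULL;
}

static void mark_exited(struct watched *w, const siginfo_t *info)
{
    w->exited = 1;
    w->status = info->si_status;
}

/* Call after a (possibly coalesced) SIGCHLD. Returns how many watched
 * children were reaped. */
static int reap_watched(void)
{
    int reaped = 0;

    /* Fast path: peek at any exited child without reaping it. */
    for (;;) {
        siginfo_t info = {0};
        if (waitid(P_ALL, 0, &info, WEXITED | WNOHANG | WNOWAIT) == -1 ||
            info.si_pid == 0)
            return reaped;              /* no children, or none has exited */
        struct watched *w = find_watched(info.si_pid);
        if (w == NULL)
            break;                      /* unknown child: leave it alone */
        if (waitid(P_PID, (id_t)w->pid, &info, WEXITED | WNOHANG) == 0) {
            mark_exited(w, &info);      /* one of ours: really reap it */
            reaped++;
        }
    }

    /* Slow path: an unknown child is pending, so poll each watched PID. */
    for (size_t i = 0; i < n_watched; i++) {
        siginfo_t info = {0};
        if (watched[i].exited)
            continue;
        if (waitid(P_PID, (id_t)watched[i].pid, &info, WEXITED | WNOHANG) == 0 &&
            info.si_pid != 0) {
            mark_exited(&watched[i], &info);
            reaped++;
        }
    }
    return reaped;
}
```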

  • eduction 1 day ago
    He mentions Bryan Cantrill in there and I can’t resist posting his famous epoll/kqueue rant:

    https://youtu.be/l6XQUciI-Sc?t=3643

    I know this is related but maybe someone smarter than me can explain how closely it relates (or doesn’t) to this issue which seems more general (iirc Cantrill was talking about fs events not child processes generally)
