
I am working on a project in Go (which I am not very familiar with) that runs as a systemd service and spawns child processes. Whenever the service is restarted, I see warnings (sometimes hundreds) like this one:

Sep 23 11:04:06 systemd[1648]: deckmaster.service: Found left-over process 14274 (kioworker) in control group while starting unit. Ignoring.
Sep 23 11:04:06 systemd[1648]: deckmaster.service: This usually indicates unclean termination of a previous run, or service implementation deficiencies.

Ok, so my service has "implementation deficiencies" as per systemd's opinion and I would like to fix it.

I am starting child processes with this method:

func executeCommand(cmd string) error {
    args := SPACES.Split(cmd, -1)
    exe := expandExecutable(args[0])

    command := exec.Command(exe, args[1:]...)
    if err := command.Start(); err != nil {
        errorLogF("failed to execute '%s %s'", exe, args[1:])
        return err
    }
    return command.Process.Release()
}

ChatGPT suggested that I listen for SIGCHLD signals and use syscall.Wait4 to fully detach from the child process, so I am running this when the service starts:

func reapChildProcesses() {
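    // signal.Notify does not block when sending, so the channel must be
    // buffered or signals can be dropped.
    sigs := make(chan os.Signal, 1)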
    signal.Notify(sigs, syscall.SIGCHLD)

    for range sigs {
        for {
            var status syscall.WaitStatus
            pid, err := syscall.Wait4(-1, &status, syscall.WNOHANG, nil)
            if pid <= 0 || err != nil {
                break
            }
            verboseLog("reaped pid=%d status=%d", pid, status.ExitStatus())
        }
    }
}
...
go reapChildProcesses()

The reaped... message only appears when I manually close a child process. If I restart my service while a child is still running, no SIGCHLD is ever delivered for it, so nothing is reaped.

The systemd unit file, deckmaster.service, in case it is relevant:

[Unit]
Description=Deckmaster Service
ConditionPathExists=/dev/streamdeck-xl
After=plasma-plasmashell.service
Wants=plasma-plasmashell.service

[Service]
Restart=on-failure
RestartSec=3s
KillMode=process
Environment=PATH=%h/.bin:%h/.scripts/bin:/usr/local/bin:/usr/bin
WorkingDirectory=%h/.local/share/deckmaster
ExecReload=kill -HUP $MAINPID
ExecStartPre=go build -C %h/Development/go/deckmaster -v
ExecStartPre=sleep 2s
ExecStart=%h/.bin/deckmaster -deck 'main.deck' -sleep 30m -brightness 2

The whole project can be found on github.com/tvidal-net/deckmaster

What would be the "non-deficient" way of implementing a child process release to satisfy systemd's demands?

  • A child process is collected by using Wait. What are you trying to do that you can’t just call the Wait method? Commented Sep 23 at 11:49
  • This is exactly the opposite of what I need. I want the child processes to be detached from the parent so that the parent can be closed and reopened without systemd complaining, not to block the parent with Wait until the child process finishes. From the go.dev documentation for Release: "Release releases any resources associated with the Process p, rendering it unusable in the future. Release only needs to be called if Process.Wait is not." Since I'm calling Process.Release, calling Process.Wait shouldn't be necessary. Commented Sep 23 at 14:04
  • Oh! I see now, I was confused by your attempt to Release since it specifically says that it renders the process unusable. That's not going to disown the process, it's just cleaning up resources in Go. Commented Sep 23 at 14:59

2 Answers

-1

Remove the KillMode=process.

The default mode of operation is that systemd will first kill your main process and then everything else in the service's cgroup. It's not a problem if there are leftover child processes; they will be cleaned up as part of "stopping" the .service unit.

But your KillMode= specifically opts out of that cleanup, so it is up to you to keep track of everything that you've spawned... and everything that those programs spawn, and so on, and so on. If that's not what you want, then you should return to the default KillMode.

(In the default mode, you don't need to explicitly reap everything, since those leftover processes get reparented to pid1 after your main process exits. You should still reap the processes that you normally expect to exit before you do, e.g. short-lived commands that you spawn must eventually be collected with Wait. Note that .Process.Release() does not do that; it only frees the resources Go holds for the process.)

…Whereas if you do want certain processes to stay around, then you can either a) ignore the log message, or b) use the systemd API to move a spawned process into its own cgroup.

For example, GNOME Shell and, I believe, KDE Plasma nowadays use the systemd API to put every "app" in its own .scope unit, away from the main gnome-shell.service. (This could be done by spawning those programs through systemd-run --scope, but the corresponding 'StartTransientUnit' D-Bus API is more flexible.)
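As a rough illustration of the systemd-run route, the sketch below wraps a command line in systemd-run so that systemd places the program in its own transient scope. It assumes the service runs under the user manager (hence --user) and that systemd-run is on PATH; scopedCommand is a hypothetical helper name, and "spotify" is only a placeholder program.

```go
package main

import (
	"fmt"
	"os/exec"
)

// scopedCommand is a hypothetical helper that wraps exe and args in
// "systemd-run --user --scope", so systemd starts the program in its own
// transient .scope unit instead of leaving it in the service's cgroup.
func scopedCommand(exe string, args ...string) *exec.Cmd {
	wrapped := []string{"--user", "--scope", "--quiet", "--", exe}
	wrapped = append(wrapped, args...)
	return exec.Command("systemd-run", wrapped...)
}

func main() {
	cmd := scopedCommand("spotify") // placeholder program
	fmt.Println(cmd.Args)           // the argument vector that would be executed
	// Calling cmd.Start() here would actually launch the program in a new scope.
}
```

With --scope, systemd-run executes the program itself after joining the new scope, so the existing Start and Release (or Wait) handling in executeCommand would presumably stay the same; only the argument list changes.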


1 Comment

You are right, I am using KillMode=process to prevent child processes from being stopped. For example, if I open Spotify, I don't want the music to stop when I restart the service. I think the systemd-run approach, maybe with a different scope or slice, answers my question. I'll give it a shot. Thank you!
-2

Well, I noticed that the processes I start under zsh with &! or under tmux are not closed when the shell is closed. After a bit of investigation, more ChatGPT talk, and paying closer attention to the systemd message, I saw that it says "Found left-over process ... in control group", so maybe I just need to spawn child processes in a control group different from the parent's.

This is what I came up with in the end. It seems to be working as I wanted, though I am not sure whether it is an "elegant" solution.

var (
    childProcessCGroup = createNewCGroup("deckmaster.scope")
)

func runningCGroup() string {
    s, err := os.ReadFile("/proc/self/cgroup")
    if err != nil {
        errorLogF("Unable to read the cgroup for the current process")
        panic(err)
    }
    split := strings.Split(string(s), ":")
    return strings.TrimSpace(split[len(split)-1])
}

func createNewCGroup(name string) string {
    current := runningCGroup()
    cgroupPath := path.Join("/sys/fs/cgroup", filepath.Dir(current), name)
    if err := os.MkdirAll(cgroupPath, 0755); err != nil {
        errorLogF("Unable to create new cgroup for child processes\n\t%s", cgroupPath)
        panic(err)
    }
    return cgroupPath
}

func moveProcessToCGroup(pid int, cgroup string) error {
    cgroupFile := path.Join(cgroup, "cgroup.procs")
    return os.WriteFile(cgroupFile, []byte(strconv.Itoa(pid)), 0644)
}

func executeCommand(cmd string) error {
    args := SPACES.Split(cmd, -1)
    exe := expandExecutable(args[0])

    command := exec.Command(exe, args[1:]...)
    if err := command.Start(); err != nil {
        errorLogF("failed to execute '%s %s'", exe, args[1:])
        return err
    }
    pid := command.Process.Pid
    if err := moveProcessToCGroup(pid, childProcessCGroup); err != nil {
        errorLog(err, "Unable to move child process %d to cgroup", pid)
    }
    return command.Process.Release()
}
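One fragile spot in the code above is runningCGroup, which splits the whole file on ":" and takes the last piece. Each line of /proc/self/cgroup has the form hierarchy-ID:controllers:path, and the cgroup v2 entry is the one that starts with "0::". A hedged sketch of a stricter parser follows; unifiedCGroupPath is a hypothetical name, and the sample path in main is made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// unifiedCGroupPath returns the cgroup v2 path from the contents of
// /proc/<pid>/cgroup. Each line has the form "hierarchy-ID:controllers:path";
// the v2 (unified) entry has hierarchy ID 0 and an empty controller list.
func unifiedCGroupPath(contents string) (string, error) {
	sc := bufio.NewScanner(strings.NewReader(contents))
	for sc.Scan() {
		// SplitN with a limit of 3 keeps any ':' inside the path intact.
		fields := strings.SplitN(sc.Text(), ":", 3)
		if len(fields) == 3 && fields[0] == "0" && fields[1] == "" {
			return fields[2], nil
		}
	}
	if err := sc.Err(); err != nil {
		return "", err
	}
	return "", fmt.Errorf("no cgroup v2 entry found")
}

func main() {
	// Made-up sample; a real caller would pass the result of
	// os.ReadFile("/proc/self/cgroup") converted to a string.
	sample := "1:name=systemd:/legacy\n0::/user.slice/app.slice/deckmaster.service\n"
	p, err := unifiedCGroupPath(sample)
	fmt.Println(p, err)
}
```

Returning an error instead of panicking lets the caller fall back to running the child without moving it, which may be preferable in a long-running service.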

1 Comment

You might try just using Setpgid/Setsid rather than hardcoding the cgroup handling like this.
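For reference, the Setsid variant of this suggestion could look like the sketch below, where newSessionCommand is a hypothetical helper and "sleep 60" is a placeholder command. One caveat: sessions and process groups are separate from cgroup membership, so a child started this way still lives in the service's control group and systemd may keep reporting it as left over.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// newSessionCommand is a hypothetical helper that prepares a command which
// will run in a new session (and therefore a new process group), detached
// from the parent's controlling terminal and job-control signals.
// It does not change the child's cgroup.
func newSessionCommand(exe string, args ...string) *exec.Cmd {
	cmd := exec.Command(exe, args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{Setsid: true}
	return cmd
}

func main() {
	cmd := newSessionCommand("sleep", "60") // placeholder command
	fmt.Println("new session requested:", cmd.SysProcAttr.Setsid)
	// cmd.Start() would launch the child in its own session.
}
```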
