I would like to call a command-line utility from R and parallelise it by running several copies of it in different folders. The calls can have quite different runtimes, so I'd like to do this asynchronously, with the next call triggered as soon as one of the previous ones has resolved.
Async frameworks like the mirai package seem ideal for this, but I do not know how I can "fix" the workers to the individual, predefined folders where the CLI tool waits for the next call.
For example, if tool1 in folder1 is still running and tool2 in folder2 has just finished, the dispatcher should assign the next call to tool2 in folder2. Maybe it finishes again very quickly and tool1 is still running, so the third call should also go to tool2 in folder2 and so on.
Would anyone have an idea that could get me on track to develop a solution for this?
Ideally, I could just use the purrr::map functions, maybe with the new mirai parallelization in the purrr development version. mirai::mirai_map might also be a way, but again, how do I dynamically distribute the tasks to the correct tool in the correct folder?
Is each tool# a different executable and/or option set? You mention "tool1 in folder1" and "tool2 in folder2": do tool1 and tool2 differ in anything other than the working directory? Is there an order to the tasks that need to be run, or is it a random-access set of things that need to be done?
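If the copies differ only in their working directory and the tasks are independent, one possible starting point is the rough sketch below. It uses base R's parallel package rather than mirai: one persistent worker is pinned to each folder, and clusterApplyLB() hands the next task to whichever worker becomes free first. The folder paths, the task list, and the run_tool() helper are all hypothetical placeholders, and Sys.sleep() stands in for the real CLI call.

```r
library(parallel)

# Hypothetical setup: one working directory per tool copy.
folders <- file.path(tempdir(), c("folder1", "folder2"))
for (f in folders) dir.create(f, showWarnings = FALSE, recursive = TRUE)

# One persistent worker process per folder.
cl <- makeCluster(length(folders))

# clusterApply() sends folders[[i]] to worker i, so each worker
# changes into its own directory once and stays there.
invisible(clusterApply(cl, folders, function(f) setwd(f)))

# Hypothetical stand-in for the real CLI call. In practice this would be
# something like system2("mytool", args = task$args), run in getwd().
run_tool <- function(task) {
  Sys.sleep(task$seconds)  # simulate uneven runtimes
  list(id = task$id, folder = basename(getwd()))
}

# Hypothetical task list: the first task is slow, the rest are quick.
tasks <- lapply(1:6, function(i) {
  list(id = i, seconds = if (i == 1) 1 else 0.1)
})

# Load-balanced dispatch: a worker receives the next task as soon as it
# has finished its previous one. Results come back in task order.
results <- clusterApplyLB(cl, tasks, run_tool)
stopCluster(cl)

# Shows which folder handled each task.
vapply(results, function(r) r$folder, character(1))
```

With runtimes like these, most of the quick tasks should end up in whichever folder is not busy with the slow one, though the exact assignment depends on timing. If the tools do differ per folder, the same pinning step could also store a per-worker executable name or option set.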