BioSimSpace.Process.ProcessRunner
class BioSimSpace.Process.ProcessRunner(processes, name='runner', work_dir=None)
A class for managing and running multiple simulation processes, e.g. a free energy simulation at multiple lambda values.
Because BioSimSpace manages its own background processes, it is unsuitable for use with Python modules such as concurrent.futures: using an object like a ProcessPoolExecutor would create redundant processes, i.e. each pool process would be created only for BioSimSpace to fork its own background process from it. Instead, we recommend using a ProcessRunner, which can run the processes for you, both in serial and in parallel.
At present there is no way to allocate specific compute resources to individual processes. As such, unless you have access to a large amount of compute, when executing the runner in parallel we recommend that the individual processes are serial in nature. BioSimSpace is not intended to be a workflow manager and the ProcessRunner is only meant to help facilitate running of more complex, multi-leg simulation processes. If you desire more fine-grained resource control we recommend breaking your workflow into separate nodes, which can be run independently and allocated their own specific resources.
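The typical life cycle described above (start everything, wait, then deal with failures) might be sketched as follows. This is an illustrative helper, not part of BioSimSpace: the function name run_with_one_restart is hypothetical, and it only calls methods documented on this page. It assumes that restartFailed() relaunches processes in the background, so that a second wait() is needed.

```python
def run_with_one_restart(runner, serial=False):
    """Start every process, wait for completion, and retry failures once.

    `runner` is assumed to be a BioSimSpace.Process.ProcessRunner, or any
    object exposing the same documented methods.
    """
    # Launch all processes; serial=False runs them in parallel.
    runner.startAll(serial=serial)
    # Block until no process is running any more.
    runner.wait()

    # Assumption: restartFailed() relaunches errored processes in the
    # background, so we wait again before inspecting the outcome.
    if runner.nError() > 0:
        runner.restartFailed()
        runner.wait()

    # Indices of any processes that are still in an error state.
    return runner.errored()
```

An empty list returned by the helper would indicate that no process remained in an error state after the single restart.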
__init__(processes, name='runner', work_dir=None)
Constructor.
- Parameters
processes ([Process]) – A list of process objects.
name (str) – The name of the process runner.
work_dir (str) – The working directory for the processes.
Methods
__init__(processes[, name, work_dir]) – Constructor.
addProcess(process) – Add a process to the runner.
errored() – Return the indices of the errored processes.
getName() – Return the process runner name.
isError() – Return whether each process is in an error state.
isQueued() – Return whether each process is queued.
isRunning() – Return whether each process is running.
kill(index) – Kill a specific process.
killAll() – Kill all of the processes.
nError() – Return the number of errored processes.
nProcesses() – Return the number of processes.
nQueued() – Return the number of queued processes.
nRunning() – Return the number of running processes.
processes() – Return the list of processes.
queued() – Return the indices of the queued processes.
removeProcess(index) – Remove a process from the runner.
restartFailed() – Restart any jobs that are in an error state.
runTime() – Return the run time for each process.
running() – Return the indices of the running processes.
setName(name) – Set the process runner name.
start(index) – Start a specific process.
startAll([serial, batch_size, max_retries]) – Start all of the processes.
wait() – Wait for any running processes to finish.
workDir() – Return the working directory.
-
addProcess(process)
Add a process to the runner.
-
errored()
Return the indices of the errored processes.
- Returns
idx_errored – A list containing the indices of the errored processes.
- Return type
[int]
-
getName()
Return the process runner name.
- Returns
name – The name of the process runner.
- Return type
str
-
isError()
Return whether each process is in an error state.
- Returns
is_error – A list indicating whether each process is in an error state.
- Return type
[bool]
-
isQueued()
Return whether each process is queued.
- Returns
is_queued – A list indicating whether each process is queued.
- Return type
[bool]
-
isRunning()
Return whether each process is running.
- Returns
is_running – A list indicating whether each process is running.
- Return type
[bool]
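The boolean lists returned by isError(), isQueued() and isRunning() hold one entry per process. A minimal sketch of turning such a list into indices is shown below. The helper indices_where and the sample flag lists are hypothetical, and it is an assumption (not stated on this page) that errored(), queued() and running() report the positions of the True entries in the same order as processes().

```python
def indices_where(flags):
    """Return the positions at which a list of booleans is True."""
    return [i for i, flag in enumerate(flags) if flag]


# Hypothetical flag lists of the kind returned by isRunning() / isError().
is_running = [True, False, True, False]
is_error = [False, False, False, True]

running_idx = indices_where(is_running)  # [0, 2]
errored_idx = indices_where(is_error)    # [3]
```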
-
kill(index)
Kill a specific process. The same can be achieved using:
runner.processes()[index].kill()
- Parameters
index (int) – The index of the process.
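As a sketch of index-based control, the helper below kills a chosen subset of processes. The name kill_selected and the range check are illustrative additions rather than BioSimSpace behaviour. How kill() itself treats an out-of-range index is not documented here, so the sketch guards against it explicitly.

```python
def kill_selected(runner, indices):
    """Kill the processes at the given indices; return how many were killed.

    `runner` is assumed to expose the documented nProcesses() and kill().
    """
    n_processes = runner.nProcesses()
    killed = 0
    for index in indices:
        # Skip indices that do not refer to a managed process.
        if 0 <= index < n_processes:
            # Documented as equivalent to runner.processes()[index].kill().
            runner.kill(index)
            killed += 1
    return killed
```

For example, kill_selected(runner, runner.running()) would stop only the processes that the runner currently reports as running.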
-
killAll()
Kill all of the processes.
-
nError()
Return the number of errored processes.
- Returns
n_error – The number of processes that are in an error state.
- Return type
int
-
nProcesses()
Return the number of processes.
- Returns
n_processes – The number of processes managed by the runner.
- Return type
int
-
nQueued()
Return the number of queued processes.
- Returns
n_queued – The number of processes that are queued.
- Return type
int
-
nRunning()
Return the number of running processes.
- Returns
n_running – The number of processes that are running.
- Return type
int
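The counting methods can be combined into a single status snapshot, for instance when logging progress. The following is a small sketch using only methods documented on this page. The function name status_snapshot and the dictionary keys are hypothetical.

```python
def status_snapshot(runner):
    """Collect the runner's name and counters into a dictionary.

    `runner` is assumed to be a ProcessRunner-like object providing
    getName(), nProcesses(), nQueued(), nRunning() and nError().
    """
    return {
        "name": runner.getName(),
        "total": runner.nProcesses(),
        "queued": runner.nQueued(),
        "running": runner.nRunning(),
        "errored": runner.nError(),
    }
```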
-
processes()
Return the list of processes.
- Returns
processes – The list of processes.
- Return type
[Process]
-
queued()
Return the indices of the queued processes.
- Returns
idx_queued – A list containing the indices of the queued processes.
- Return type
[int]
-
removeProcess(index)
Remove a process from the runner.
- Parameters
index (int) – The index of the process.
-
restartFailed()
Restart any jobs that are in an error state.
-
runTime()
Return the run time for each process.
- Returns
run_time – A list containing the run time of each process.
- Return type
-
running()
Return the indices of the running processes.
- Returns
idx_running – A list containing the indices of the running processes.
- Return type
[int]
-
setName(name)
Set the process runner name.
- Parameters
name (str) – The process runner name.
-
start(index)
Start a specific process. The same can be achieved using:
runner.processes()[index].start()
- Parameters
index (int) – The index of the process.
-
startAll(serial=False, batch_size=None, max_retries=5)
Start all of the processes.
- Parameters
serial (bool) – Whether to start the processes in serial, i.e. wait for a process to finish before starting the next. When running in parallel (serial=False) care should be taken to ensure that each process doesn’t consume too many resources. We normally intend for the ProcessRunner to be used to manage single core processes.
batch_size (int) – When running in parallel, how many processes to run at any one time. If set to None, then the batch size will be set to the output of multiprocess.cpu_count().
max_retries (int) – How many times to retry a process if it fails.
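Since the default batch size follows the machine's CPU count, it can be useful to cap it explicitly on shared hardware. The sketch below picks a bounded batch_size and passes it to startAll. The helper name start_in_parallel and its max_cores argument are hypothetical, and using os.cpu_count() to detect the available cores is a choice made for this example, not a statement about BioSimSpace internals.

```python
import os


def start_in_parallel(runner, max_cores=None, max_retries=5):
    """Start all processes in parallel with a bounded batch size.

    Returns the batch size that was passed to runner.startAll().
    """
    # os.cpu_count() may return None, so fall back to a single core.
    available = os.cpu_count() or 1
    cores = available if max_cores is None else min(max_cores, available)
    # Never request a batch larger than the number of processes,
    # and never request fewer than one.
    batch_size = max(1, min(cores, runner.nProcesses()))
    runner.startAll(serial=False, batch_size=batch_size, max_retries=max_retries)
    return batch_size
```

Capping the batch at the number of managed processes has no effect on scheduling but keeps the reported value meaningful when only a few processes are present.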
-
wait()
Wait for any running processes to finish.
-
workDir()
Return the working directory.
- Returns
work_dir – The working directory.
- Return type
str
-