Starting Workers
Note
Select a provider if you have not already.
The next step in defining a Provider is configuring how to start Parsl on newly-acquired resources. Starting Parsl occurs in two steps, each with their own Provider option:
Defining how to connect to each new compute node (
launcher)Defining how to initialize the computational information (
worker_init)
Selecting a Launcher
Parsl uses Launchers to start workers on every node acquired by a Provider. Launchers are required when requesting more than one node per batch job from a cluster scheduler.
Parsl will use the same mechanism employed for launching MPI tasks across compute nodes. Consult the documentation or example scripts for a compute cluster to find how to launch MPI tasks, and then find the launcher script which matches.
No options are required in most cases.
Simply import the Launcher class and provide an instance of the launcher to the launcher option of the Provider.
config = Config(
executors=[
...
provider=SlurmProvider(
...
launcher=SrunLauncher(),
...
),
)
],
)
For example, the Polaris supercomputer at ALCF
suggests to use mpiexec to start jobs
and Parsl will use the MpiExecLauncher.
Advanced: Configuring a Launcher
Common cases for configuring a launcher include:
Defining binding between the Parsl processes and available cores of the computer, as when using Parsl on the Aurora supercomputer.
Launching Parsl inside of Shifter containers by prepending the command to start Parsl with the
WrappedLauncher.
Worker Initialization
The worker_init argument of all Providers accepts a script for readying a newly-acquired node to run Parsl.
Worker initialization commands typically perform tasks such as:
Activating the Python virtual environment which contains Parsl
Loading modules and setting environment variables of codes invoked by Parsl Apps
Launching services required for proper execution (e.g., port forwarding, NVIDIA MPS Daemons)
Most worker_init scripts activate a virtual environment then check that the proper
version of Python is available.
config = Config(
executors=[
...
provider=SlurmProvider(
...
worker_init="conda activate /path/to/conda/env; which python"
...
),
)
],
)
Note
Consider including commands which print information about a node (e.g., hostname)
or the computational environment (e.g., echo $PATH) as part of the worker init
to aid determining where Parsl is running or whether the init function created the
desired environment.