ParslPoolExecutor Interface
The parsl.concurrent.ParslPoolExecutor implements the “Executor” interface
from the concurrent.futures module in standard Python.
This reduced interface makes it easy to adopt Parsl in applications
that already achieve single-node parallelism with a ProcessPoolExecutor.
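Because the interface matches concurrent.futures.Executor, code written against the standard Executor abstraction can run on either backend. The sketch below is illustrative only: the run_tasks helper is a hypothetical name, and it uses the standard-library ThreadPoolExecutor so it runs without Parsl installed.

```python
from concurrent.futures import Executor, ThreadPoolExecutor


def run_tasks(pool: Executor, values):
    # Works with any Executor implementation: ThreadPoolExecutor,
    # ProcessPoolExecutor, or a ParslPoolExecutor
    futures = [pool.submit(lambda x: x * x, v) for v in values]
    return [f.result() for f in futures]


with ThreadPoolExecutor(max_workers=2) as pool:
    print(run_tasks(pool, [1, 2, 3]))  # [1, 4, 9]
```

Swapping the ThreadPoolExecutor for a ParslPoolExecutor would require no changes to run_tasks itself.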
Creating a ParslPoolExecutor
Note
See the Configuration section for how to define computational resources.
Create a ParslPoolExecutor using one of two methods:
Supplying a Parsl Config that defines how to create new workers.
The executor will start a new Parsl Data Flow Kernel (DFK)
when it is entered as a context manager.

from parsl.concurrent import ParslPoolExecutor
from parsl.configs.htex_local import config  # Mimics the Python process pool

with ParslPoolExecutor(config=config) as pool:
    ...

All resources will be closed when the block exits.
Supplying an already-started Parsl DataFlowKernel (DFK).
The Parsl DFK must be started and stopped separately from the executor.

from parsl.concurrent import ParslPoolExecutor
from parsl.configs.htex_local import config
import parsl

with parsl.load(config) as dfk:
    with ParslPoolExecutor(dfk=dfk) as pool:
        ...
    ...

Parsl will only shut down when the outer block exits.
Use multiple types of resources within the same program
by creating multiple ParslPoolExecutors,
each mapped to different types of Parsl workers (also called “executors”).
Start by loading a Parsl Config that includes multiple executors.
Then create ParslPoolExecutors, each with a different list of
“executors” that its tasks will use.
with parsl.load(hybrid_config) as dfk, \
        ParslPoolExecutor(dfk=dfk, executors=['gpu']) as pool_gpu, \
        ParslPoolExecutor(dfk=dfk, executors=['cpu']) as pool_cpu:
    ...
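The hybrid_config above might be defined along these lines. This is a minimal sketch, not taken from the Parsl documentation: a real 'cpu' and 'gpu' pair of executors would also carry provider, worker-count, and resource settings appropriate to the machine.

```python
from parsl.config import Config
from parsl.executors import HighThroughputExecutor

# Two labeled executors; ParslPoolExecutor selects among them by label
hybrid_config = Config(
    executors=[
        HighThroughputExecutor(label='cpu'),
        HighThroughputExecutor(label='gpu'),
    ]
)
```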
Using a ParslPoolExecutor
The ParslPoolExecutor supports all functions from
the Executor interface except task cancellation.
The submit and map functions behave just as in ProcessPoolExecutor,
and also include the task chaining supported in App-based Parsl workflows.
from parsl.concurrent import ParslPoolExecutor
from parsl.configs.htex_local import config  # Mimics the Python process pool

def f(x):
    return x + 1

with ParslPoolExecutor(config=config) as pool:
    # Submit a task, then a task which depends on its result
    future_1 = pool.submit(f, 1)
    future_2 = pool.submit(f, future_1)
    assert future_1.result() == 2
    assert future_2.result() == 3
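The map function follows the standard Executor.map semantics: results come back in input order, one per input. The sketch below uses the standard-library ThreadPoolExecutor so it runs without Parsl; through the shared Executor interface, a ParslPoolExecutor is expected to behave the same way.

```python
from concurrent.futures import ThreadPoolExecutor


def f(x):
    return x + 1


with ThreadPoolExecutor(max_workers=2) as pool:
    # map yields results in input order, regardless of completion order
    results = list(pool.map(f, [1, 2, 3]))
    print(results)  # [2, 3, 4]
```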
Tasks from the Executor and App-based interface may also be used together.
from parsl import python_app
from parsl.concurrent import ParslPoolExecutor

def f(x):
    return x + 1

@python_app
def parity(x):
    return 'odd' if x % 2 == 1 else 'even'

with ParslPoolExecutor(config=my_parsl_config) as executor:
    future_1 = executor.submit(f, 1)
    assert parity(future_1).result() == 'even'  # Function chaining, as expected
Differences
The differences between the Parsl-based ParslPoolExecutor
and the Python ProcessPoolExecutor are:
Task Cancellation: Parsl does not support canceling tasks once submitted.
Defining Functions: Functions defined in modules work the same in both Parsl and the
ProcessPoolExecutor. However, those defined during execution (e.g., in the “main” file)
behave differently. Parsl will serialize functions defined at runtime, but they will not
be able to access global variables (as is the case when using the spawn start method in
the ProcessPoolExecutor), which means modules must be imported inside the function.
Follow the rules in the App guide.
No Multiprocessing Objects: Tools such as the multiprocessing.Queue and synchronization
primitives are not compatible with ParslPoolExecutor.
Worker Initialization: The worker initialization functions from ProcessPoolExecutor
are not supported. Configure workers using the Config object instead.
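The import rule above can be illustrated with a small sketch (a hypothetical function, not from the Parsl docs): because a runtime-defined function is serialized without its globals, any module it needs must be imported in its body.

```python
def vector_norm(values):
    # Import inside the function: when Parsl serializes a function
    # defined at runtime, module-level imports are not carried along
    import math
    return math.sqrt(sum(v * v for v in values))


print(vector_norm([3.0, 4.0]))  # 5.0
```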