Parallel IT - Parallel Iterator, Parallel Task and Pyjama

Parallel Task - Quick Reference Guide

This guide is a quick reference for the features supported by Parallel Task: the new keywords, their syntax, and the library functions you are most likely to use.

Keywords

Keywords for a Task declaration:

TASK
TASK(*)
IO_TASK
IO_TASK(*)

Keywords for a Task invocation:

dependsOn
notify ¹
notifyInterim
asyncCatch

Important library functions

For a complete list of the runtime classes and methods, browse the API docs. To get started, the classes you will use most often are ParaTask, CurrentTask, TaskID and TaskIDGroup:

Setting the ParaTask runtime environment:

ParaTask.init() is the only compulsory method to use, and should be called once at the beginning of the main() method in your application to initialise the runtime.
To change the default scheduling policy or thread pool size, see ParaTask.setScheduling(ScheduleType) and ParaTask.setThreadPoolSize(int).

Working from inside a task:

If you are "inside" a task and want to query information about it, have a look at the CurrentTask class. Some of the useful functions include:

CurrentTask.globalID() to determine the current task's unique identifier
CurrentTask.insideTask() to determine whether the current code segment is actually being executed as a task or as a standard sequential method, e.g.

TASK public int parallel() {
    return sequential();
}

public int sequential() {
    ...
    if (CurrentTask.insideTask()) {
       // this is being executed as a task
    } else {
       // this is being executed as a standard sequential method
    }
    ...
}

Many more methods are available; they are grouped by topic below.

Working from outside a task:

While CurrentTask contains methods to be invoked from inside a task, there is usually an equivalent method that may be invoked on a TaskID instance (i.e. from code "outside" the task). Some of these are categorised in the topics below, while others include:

getReturnResult() to access the return value from executing the task
waitTillFinished() to cause the current thread to wait until the task completes
getTaskArguments() if you want to access the original parameter values that were passed to the task
isInteractive() or isMultiTask() to check whether the task is an I/O task or multi-task respectively

Task progress:

A task may update its progress using CurrentTask.setProgress(int)
Listeners interested in a task's progress use getProgress() on the task's respective TaskID.
TIP: Updating the task's progress does not automatically notify the listeners, so you will most likely want to publish an intermediate result (or a "dummy" result) to notify the listeners of the task's updated progress.

Canceling tasks:

Tasks cannot be canceled abruptly. Instead, a cancel request must be sent to the task (via its TaskID instance). If the task has not started executing yet, it will not be executed. If it has already started executing, the task must actively check whether a request has been made (and gracefully cancel on its own terms):

To attempt to cancel a task, invoke cancelAttempt() on the task's respective TaskID.

For the cancel attempt to have any effect, the task should periodically call CurrentTask.cancelRequested() to check whether a request has been made. If so, the task should take the necessary steps to clean up and end the computation.
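
The cooperative pattern described above can be sketched in plain Java. This is only an analogy, not ParaTask's implementation: the class CooperativeCancelSketch, its AtomicBoolean flag and the cancelAfter parameter are hypothetical stand-ins for the request that cancelAttempt() sends and that CurrentTask.cancelRequested() reads.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Plain-Java analogy of cooperative cancellation; NOT ParaTask's implementation.
class CooperativeCancelSketch {
    // Hypothetical stand-in for the request flag that cancelAttempt() would set.
    private final AtomicBoolean cancelRequested = new AtomicBoolean(false);

    void cancelAttempt() {
        cancelRequested.set(true);
    }

    // Processes up to `total` items and checks the flag before each one,
    // which is the role CurrentTask.cancelRequested() plays inside a task.
    // `cancelAfter` simulates a cancel request arriving after that many items.
    int process(int total, int cancelAfter) {
        int done = 0;
        for (int i = 0; i < total; i++) {
            if (cancelRequested.get()) {
                break; // clean up here, then end the computation early
            }
            done++;
            if (done == cancelAfter) {
                cancelAttempt();
            }
        }
        return done;
    }

    public static void main(String[] args) {
        System.out.println(new CooperativeCancelSketch().process(10, 3));
    }
}
```

The point of the sketch is that the request alone changes nothing; the work stops only because the loop checks the flag at a point where stopping is safe.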

Working with multi-tasks:

To define a multi-task, annotate a method declaration with either TASK(*) or IO_TASK(*). By default, this creates a sub-task for each worker thread.

Alternatively, instead of using '*', you may use an integer literal or integer variable to create a multi-task of a different size, e.g. TASK(myInt) or IO_TASK(2).

An invocation of a multi-task is like a standard task invocation, except that a TaskIDGroup is returned.
From inside the multi-task, inquire to see how many siblings (i.e. total number of sub-tasks) there are for the multi-task by calling CurrentTask.multiTaskSize().
Similarly, a sub-task may inquire on its respective position in the multi-task group relative to its siblings using CurrentTask.relativeID().
Each multi-task has a barrier allowing it to synchronise with its siblings, achieved by calling CurrentTask.barrier().
If the multi-task returns a result, you may access individual results using getInnerTaskResult(int) or perform a reduction to return a single result using getReturnResult(Reduction).
TIP: remember that multi-tasks are also tasks, so you may use all the other features discussed!
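
To make the roles of the relative ID, the group size, the barrier and the reduction concrete, here is a plain-Java analogy built on java.util.concurrent. It is a sketch only and does not use the ParaTask runtime: the class MultiTaskSketch and the method parallelSum are hypothetical, relativeId and size play the roles of CurrentTask.relativeID() and CurrentTask.multiTaskSize(), the CyclicBarrier plays the role of CurrentTask.barrier(), and the final summation plays the role of a reduction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain-Java analogy of a multi-task; NOT the ParaTask runtime.
class MultiTaskSketch {
    // Sums 1..n using `size` sub-tasks. The sub-task with position
    // `relativeId` handles the values relativeId + 1, relativeId + 1 + size, ...
    static long parallelSum(int n, int size) {
        ExecutorService pool = Executors.newFixedThreadPool(size);
        CyclicBarrier barrier = new CyclicBarrier(size);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            for (int id = 0; id < size; id++) {
                final int relativeId = id;
                parts.add(pool.submit(() -> {
                    long partial = 0;
                    for (int i = relativeId + 1; i <= n; i += size) {
                        partial += i;
                    }
                    barrier.await(); // wait until every sibling has finished its slice
                    return partial;
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // reduce the individual results to one value
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(parallelSum(100, 4));
    }
}
```

The pool has exactly as many threads as there are sub-tasks, so every sibling can reach the barrier; with fewer threads than barrier parties the sketch would deadlock.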

Task dependences

Sometimes you want to invoke a task, but don't want it to start executing until a previous task has completed. You may specify this dependency with the dependsOn keyword. The variables inside the dependsOn clause must be TaskIDs or TaskIDGroups (and you may specify as many of each as you want) e.g.

TaskIDGroup grp = new TaskIDGroup(2);

TaskID id1 = firstTask();
TaskID id2 = secondTask();
grp.add(id1);
grp.add(id2);
TaskID id3 = thirdTask();

TaskID id = finalTask() dependsOn(grp, id3);

A new way of specifying dependences is to pass a TaskID<T> directly to a method that expects a parameter of type T. This signals to the compiler that the method is to be called once the first task completes with the return value of this TaskID as the parameter. In other words, the following two blocks of code are equivalent:

TASK public int firstTask() {
    return something();
}
TASK public void secondTask(TaskID<Integer> id) {
    System.out.println(id.getReturnResult());
}

TaskID id1 = firstTask();
TaskID id2 = secondTask(id1) dependsOn(id1);

// can be shortened to

TASK public int firstTask() {
    return something();
}
TASK public void secondTask(int value) {
    System.out.println(value);
}

TaskID id1 = firstTask();
TaskID id2 = secondTask(id1);
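
For readers more familiar with the standard library, the two kinds of dependence can be compared with CompletableFuture chaining. This is a plain-Java analogy, not ParaTask syntax or its implementation. The class DependenceSketch and its methods are hypothetical, and secondTask returns a value here (rather than printing) only so the result can be checked.

```java
import java.util.concurrent.CompletableFuture;

// Plain-Java analogy of task dependences; NOT ParaTask syntax or its implementation.
class DependenceSketch {
    static int firstTask() {
        return 21;
    }

    static int secondTask(int value) {
        return value * 2;
    }

    // Comparable to passing a task's result straight into the next task:
    // secondTask starts only after firstTask completes, and receives its result.
    static int chained() {
        CompletableFuture<Integer> id1 = CompletableFuture.supplyAsync(DependenceSketch::firstTask);
        CompletableFuture<Integer> id2 = id1.thenApplyAsync(DependenceSketch::secondTask);
        return id2.join();
    }

    // Comparable in spirit to an explicit dependsOn list:
    // the final step runs only after all three earlier steps have completed.
    static int afterAll() {
        CompletableFuture<Integer> a = CompletableFuture.supplyAsync(() -> 1);
        CompletableFuture<Integer> b = CompletableFuture.supplyAsync(() -> 2);
        CompletableFuture<Integer> c = CompletableFuture.supplyAsync(() -> 3);
        return CompletableFuture.allOf(a, b, c)
                .thenApply(ignored -> a.join() + b.join() + c.join())
                .join();
    }

    public static void main(String[] args) {
        System.out.println(chained());
        System.out.println(afterAll());
    }
}
```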

Non-blocking task completion:

If you don't want to block on the TaskID until the task completes (especially important for event processing threads), then you may use one of the non-blocking clauses.

notify: the methods inside the clause are executed by the GUI thread (Event Dispatch Thread), regardless of the enqueuing thread, e.g.

TaskID id = myTask() notify( taskCompleted(TaskID), myGUIComponent::update() );

TIPS:

If you don't specify an object instance, then this is used. Otherwise, you may specify another instance object to invoke the method on using the format instance::method as in the example above.
You may specify any number of methods in the notify clause; there is no need to put everything in one method.
The methods you register in the clauses must either accept a TaskID as parameter, or nothing at all. The TaskID parameter will always refer to the task that just completed.

Intermediate results:

When a task wants to publish an intermediate result, it may do so using CurrentTask.publishInterim(E).
This has an effect only if methods have been registered using notifyInterim. The parameter list of those methods must be (TaskID, E) in that order, where:

The TaskID instance will represent the task that published the intermediate result, and
E will represent the actual intermediate result.

TIPS:

Just before publishing the result, the task could update its progress (i.e. CurrentTask.setProgress(int)), and then the method would be able to read the task's progress using getProgress() on the TaskID instance.
The notifyInterim is similar to notify. The difference is that an extra parameter is added to the methods (i.e. the E), and the methods are only executed when/if the task decides to publish a result.
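
The "set progress, then publish" ordering from the first tip can be illustrated with a plain-Java callback. This is an analogy only and does not call the ParaTask runtime: the class InterimResultSketch is hypothetical, and the BiConsumer stands in for a method registered with notifyInterim, receiving the progress value directly instead of reading it from a TaskID.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Plain-Java analogy of publishing intermediate results; NOT the ParaTask runtime.
class InterimResultSketch {
    // Hypothetical worker: after each element it computes its progress first,
    // then hands the partial sum to the listener, so the listener always
    // sees a progress value that matches the published result.
    static void sumWithUpdates(int[] data, BiConsumer<Integer, Long> listener) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
            int progress = (i + 1) * 100 / data.length; // role of setProgress(int)
            listener.accept(progress, sum);             // role of publishInterim(E)
        }
    }

    // Collects what a listener would observe for a small input.
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        sumWithUpdates(new int[] {1, 2, 3, 4},
                (progress, partial) -> log.add(progress + "%:" + partial));
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```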

Exception handling

Some tasks, like standard methods, might throw exceptions. If these exceptions are checked exceptions, then the programmer must adhere to Java's Catch or Specify Requirement and use the asyncCatch clause, e.g.

TASK public void process(File file) throws IOException {
    ...
}

TaskID id = process(file) asyncCatch( IOException      fileHandler(TaskID),
                                      RuntimeException myHandler(TaskID),
                                      Exception        endProgram() );

In the above example, we MUST use asyncCatch to catch the checked exception (IOException). We decided to also use it to catch any other non-checked exceptions (i.e. optional usage). Similar to the notify clause, the methods inside the asyncCatch clause will be executed by the enqueuing thread.
Below is an example of an exception handler:

public void fileHandler(TaskID id) {
    Exception e = id.getException();
    Object[] args = id.getTaskArguments();
    File f = (File) args[0];
    System.err.println("Problem processing task "+id.getTaskName()+" with file "+f.getName());
    e.printStackTrace();
}
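
The idea of routing each exception type to its own handler, most specific type first, can also be sketched with the standard library. This is a plain-Java analogy using CompletableFuture, not the asyncCatch mechanism itself. The class AsyncCatchSketch and the returned strings are hypothetical; the handler names are borrowed from the example above purely as labels.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Plain-Java analogy of per-type asynchronous exception handling; NOT asyncCatch itself.
class AsyncCatchSketch {
    // Hypothetical task body that may throw a checked exception.
    static String process(boolean fail) throws IOException {
        if (fail) {
            throw new IOException("disk error");
        }
        return "ok";
    }

    static String run(boolean fail) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return process(fail);
            } catch (IOException e) {
                // Checked exceptions cannot escape a Supplier, so wrap them.
                throw new CompletionException(e);
            }
        }).exceptionally(t -> {
            // Unwrap, then test the most specific type first.
            Throwable cause = (t instanceof CompletionException && t.getCause() != null)
                    ? t.getCause() : t;
            if (cause instanceof IOException) {
                return "fileHandler: " + cause.getMessage();
            }
            if (cause instanceof RuntimeException) {
                return "myHandler";
            }
            return "endProgram";
        }).join();
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```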

Pipelines

Pipelines are useful in situations where data arrives in a streamed format, such as data over a network or data that is too large to all fit in memory at once. Tasks can be converted to efficiently loop on independent threads as pipeline stages instead of as one-off tasks by passing a BlockingQueue<T> in place of a parameter that expects a value of type T.

TASK public Image blur(Image input) {
    Image blurred = ...;
    return blurred;
}

TASK public Image filter(Image input) {
    Image filtered = ...;
    return filtered;
}

TASK public Image resize(Image input) {
    Image resized = ...;
    return resized;
}

BlockingQueue<Image> input = new LinkedBlockingQueue<>();
TaskID<Image> stage1 = blur(input);
TaskID<Image> stage2 = filter(stage1);
TaskID<Image> stage3 = resize(stage2);

//...

input.put(image1);
input.put(image2);
input.put(image3);

//...

BlockingQueue<Image> output = stage3.getOutputQueue();
Image out1 = output.take();
Image out2 = output.take();
Image out3 = output.take();

//...

stage1.cancelAttempt(); // flows down through all stages
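
To show what "looping on independent threads as pipeline stages" amounts to, here is a hand-written plain-Java equivalent built from BlockingQueue and one thread per stage. It is a sketch under stated assumptions, not what the ParaTask compiler generates: the class PipelineSketch and the stage helper are hypothetical, strings stand in for images, and cancellation is not modelled (the stage threads are daemons that simply end with the JVM).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

// Plain-Java analogy of a three-stage pipeline; NOT code generated by ParaTask.
class PipelineSketch {
    // Starts a daemon thread that repeatedly takes an item from `in`,
    // applies `step`, and puts the result on the returned output queue.
    static <A, B> BlockingQueue<B> stage(BlockingQueue<A> in, Function<A, B> step) {
        BlockingQueue<B> out = new LinkedBlockingQueue<>();
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    out.put(step.apply(in.take()));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop this stage
            }
        });
        worker.setDaemon(true);
        worker.start();
        return out;
    }

    // Pushes the inputs through blur -> filter -> resize and collects the outputs.
    static List<String> run(String... inputs) {
        BlockingQueue<String> input = new LinkedBlockingQueue<>();
        BlockingQueue<String> blurred = stage(input, s -> s + ">blur");
        BlockingQueue<String> filtered = stage(blurred, s -> s + ">filter");
        BlockingQueue<String> output = stage(filtered, s -> s + ">resize");
        try {
            for (String item : inputs) {
                input.put(item);
            }
            List<String> results = new ArrayList<>();
            for (int i = 0; i < inputs.length; i++) {
                results.add(output.take()); // blocks until the item has passed all stages
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("image1", "image2", "image3"));
    }
}
```

Because each stage is a single thread reading from a FIFO queue, outputs arrive in the same order as the inputs, while different items can be in different stages at the same time.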

¹ The keywords notifyGUI and notifyInterimGUI from older versions have been removed in the latest version.