taskit

Description

TaskIt is a library that eases process usage in Pharo. It provides abstractions to execute and synchronize concurrent tasks, as well as several pre-built mechanisms that are useful to many application developers.

Details

Source: GitHub
Dialect: pharo (65% confidence)
License: MIT
Stars: 45
Forks: 25
Created: April 7, 2015
Updated: Nov. 14, 2025
Topics: pharo
Categories: System / OS Concurrency

README excerpt

# TaskIt

[![Tests](https://github.com/pharo-contributions/taskit/actions/workflows/tests.yml/badge.svg)](https://github.com/pharo-contributions/taskit/actions/workflows/tests.yml)

>Anything that can go wrong, will go wrong. — Murphy's Law

Managing concurrency is a significant challenge when developing applications that scale. For a web application, we may want to use a different process for each incoming request, or we may want to use a [thread pool](https://en.wikipedia.org/wiki/Thread_pool). For a desktop application, we may want to run long computations in the background to avoid blocking the UI.

"Processes" in Pharo are implemented as [green threads](https://en.wikipedia.org/wiki/Green_thread) that are scheduled by the virtual machine rather than the underlying operating system. This has advantages and disadvantages:

- Processes are cheap to create and to schedule. We can create as many of them as we want, and performance depends mostly on the code executed in those processes, with very little process-management overhead.
- While processes provide _concurrent_ execution, there is no real _parallelism_. However many processes we use inside Pharo, they will always execute in a single operating system thread, within a single operating system process.

When managing the processes in our application, we need to know how to synchronize them. For example, we may want to execute two processes concurrently and have a third one wait for the completion of the first two before starting. Or we may want to maximize the parallelism of our application while controlling concurrent access to some shared state. And in all of this, we need to avoid deadlocks, a common problem in concurrent programs.

**TaskIt** is a Pharo library that provides abstractions to execute and synchronize concurrent tasks. This chapter starts by introducing TaskIt's abstractions using examples and code snippets and finishes with a discussion of TaskIt extension points and possible customizations.

## Introduction

Since version 9, Pharo's default image includes the `coreTests` group of `BaselineOfTaskIt`. The following instructions explain how to load another group or how to load TaskIt in earlier Pharo versions.

### Loading

If you want a specific release such as v1.0, you can load the associated tag as follows:

```smalltalk
Metacello new
  baseline: 'TaskIt';
  repository: 'github://pharo-contributions/taskit:v1.0';
  load.
```

Otherwise, if you want the latest development version, load master:

```smalltalk
Metacello new
  baseline: 'TaskIt';
  repository: 'github://pharo-contributions/taskit';
  load.
```


#### Adding TaskIt as a Metacello dependency

To add TaskIt to an existing application, add the following to your Metacello configuration or baseline with the desired version:

```smalltalk
spec
    baseline: 'TaskIt'
    with: [ spec repository: 'github://pharo-contributions/taskit:v1.0' ]
```

#### For developers

TaskIt's code is on [GitHub](https://github.com/pharo-contributions/taskit), and we use [Iceberg](https://github.com/pharo-vcs/iceberg.git) for source code management. Just load Iceberg and enter the GitHub URL to clone the repository. Remember to switch to the desired development branch or create your own.

## Asynchronous Tasks

TaskIt's main abstraction is, as the name implies, the task. A task is a unit of execution. If you split the execution of a program into several tasks, TaskIt can run those tasks concurrently, synchronize their access to data, and even help order and synchronize their execution.

### First Example

Launching a task is as easy as sending the message `schedule` to a block closure:
```smalltalk
[ 1 + 1 ] schedule.
```
>The selector `schedule` is used instead of `run`, `launch`, or `execute` to emphasize that a task will *eventually* be executed. In other words, a task is *scheduled* to be executed at some point in the future.

While a convenient demo, this first example is too simple. We are scheduling a task that does nothing useful, and we cannot even observe its result (*yet*). Let's explore some other code snippets that clarify what's going on. The following snippet schedules a task that prints to the `Transcript`. Evaluating the expression shows that the task is actually executed.
```smalltalk
[ 'Happened' logCr ] schedule.
```
However, a trivial task runs so fast that it's difficult to tell whether it's actually running concurrently with our main process. A better example is to schedule a long-running task. The following example schedules a task that waits for a second before writing to the Transcript. While normal synchronous code would block the current process, you'll notice that this one does not.
```smalltalk
[ 1 second wait.
'Waited' logCr ] schedule.
```

### Schedule vs fork
You may wonder what the difference is between TaskIt's `schedule` and the built-in `fork`. From the examples above they seem equivalent. The short answer is that `fork` creates a new process every time it is called, while `schedule` allows much more control: two tasks may execute sequentially inside a single process, or concurrently in a pool of processes.

You will find a longer answer in the section below explaining *runners*. Briefly, TaskIt tasks are not directly scheduled in Pharo's global `ProcessScheduler` as usual `Process` objects are. Instead, a task is scheduled in a task runner. It is the responsibility of the task runner to execute the task.
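As a hedged sketch of that difference, the snippet below schedules tasks on two explicit runners. The runner classes `TKTNewProcessTaskRunner` and `TKTWorker` and their `start`/`schedule:` protocol are assumed here; check the runners section and the classes in your image for the exact API.

```smalltalk
"Assumed behavior: TKTNewProcessTaskRunner runs each task in a fresh process,
while a single TKTWorker runs its tasks one after another in its own process."
TKTNewProcessTaskRunner new schedule: [ 'fresh process' logCr ].

worker := TKTWorker new.
worker start.
worker schedule: [ 'first, in the worker process' logCr ].
worker schedule: [ 'then this one, in the same process' logCr ].
```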

### All valuables can be Tasks

So far we have used block closures to define tasks. Block closures are a handy way to create tasks since they implicitly capture their context (they have access to `self` and other objects in scope). However, blocks are not always the wisest choice, because each block references the current `context` with all the objects in it and its *sender contexts*, objects that might otherwise be garbage collected.

The good news is that TaskIt tasks can be represented by almost any object. Tasks, in TaskIt's domain, are **valuable objects**, i.e., objects that perform some computation when they receive the `value` message. In fact, the message `schedule` in the example above is just syntactic sugar for:

```smalltalk
(TKTTask valuable: [ 'Happened' logCr ]) schedule.
```

We can then create tasks using any object that understands `value` (such as `MessageSend`):

```smalltalk
TKTTask valuable: (MessageSend receiver: 1 selector: #+ arguments: { 7 }).
```

We can even create our own task object:

```smalltalk
Object subclass: #MyTask
	instanceVariableNames: ''
	classVariableNames: ''
	package: 'MyPackage'.

MyTask >> value
    ^ 100 factorial
```

and use it as follows:

```smalltalk
TKTTask valuable: MyTask new.
```
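Assuming `TKTTask` instances share the scheduling protocol used by blocks in the earlier examples (worth confirming against the `TKTTask` class in your image), such a task can then be scheduled like any other:

```smalltalk
"Hypothetical usage: schedule the custom task defined above."
(TKTTask valuable: MyTask new) schedule.
```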

## Retrieving a Task's Result with Futures

A task can compute a value (such as the `1 + 1` example), have a side effect (such as printing to the Transcript), or both (a task could do neither, but that would not be very useful!). When the result of a task matters to us (or we just want to know when it is done), we use TaskIt's *future* objects. A [*future*](https://en.wikipedia.org/wiki/Futures_and_promises) is simply an object that represents the eventual value of a task's execution. We can schedule a task and obtain a future by sending the `future` message to a block closure, as follows.

```smalltalk
aFuture := [ 2 + 2 ] future.
```

One way to see a future is as a placeholder. When a task is finished, it provides its result to its corresponding future. A future then provides access to the task's value—but since we cannot know *when* this value will be available, we cannot access it right away. We can either wait (blocking or synchronous) for the result or we can register a *callback* to be executed asynchronously when the task execution is finished.  

>In general, *blocking* on a future should be avoided in the UI thread. In a background (non-UI) thread, however, blocking may be completely appropriate; this is covered in later sections.
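For example, a blocking wait might look like the following sketch; `waitForCompletion:` is assumed here to be part of TaskIt's future protocol, blocking the calling process until the result is available or the timeout expires:

```smalltalk
aFuture := [ 30 factorial ] future.
"Block the current (non-UI) process for at most 2 seconds."
aFuture waitForCompletion: 2 seconds.
```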

Like any other code, a task can complete normally or with an unhandled exception. A future supports both possibilities with callbacks, registered using the methods `onSuccessDo:` and `onFailureDo:`. In the example below, we create a future and assign it a success callback. As soon as the task finishes, its value is stored in the future and the callback is invoked with that resulting value.
```smalltalk
aFuture := [ 2 + 2 ] future.
aFuture onSuccessDo: [ :result | result logCr ].
```
We can also assign callbacks that handle a task's failure using the `onFailureDo:` message. If an exception occurs and the task cannot finish its execution as expected, the corresponding exception will be passed as argument to the failure callback, as in the following example.
```smalltalk
aFuture := [ Error signal ] future.
aFuture onFailureDo: [ :error | error sender method selector logCr ].
```

Futures accept more than one callback. When a task is finished, all its callbacks will be *scheduled* for (eventual) execution. There is no guarantee of the **timing** or **order** of the execution. The following example shows how we can register several success callbacks for the same future.

```smalltalk
future := [ 2 + 2 ] future.
future onSuccessDo: [ :v | FileStream stdout nextPutAll: v asString; cr ].
future onSuccessDo: [ :v | 'Finished' logCr ].
future onSuccessDo: [ :v | [ v factorial logCr ] schedule ].
future onFailureDo: [ :error | error logCr ].
```

Callbacks can be registered while the task is still running as well as after it finishes. If the task is running, callbacks are saved and wait for the completion of the task. If the task is already finished, the callback is immediately scheduled with the previously computed value. The following example illustrates this: we first create a future and register a callback before it is finished, then we wait for its completion and register a second callback afterwards. Both callbacks are scheduled and executed.
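A sketch of that scenario might look as follows (hedged: `waitForCompletion:` is assumed from TaskIt's future protocol as a way to wait for the task to finish):

```smalltalk
future := [ 1 second wait. 5 ] future.
"Registered while the task is still running."
future onSuccessDo: [ :v | ('before completion: ', v asString) logCr ].
future waitForCompletion: 5 seconds.
"Registered after the task finished: scheduled immediately with the stored value."
future onSuccessDo: [ :v | ('after completion: ', v asString) logCr ].
```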