Forking Operating System Processes from within Pharo Language
# OSSubprocess
OSSubprocess allows users to spawn operating system processes from within the Pharo language. The main use of forking external OS processes is to execute OS commands (e.g. `cat`, `ls`, `ps`, `cp`) as well as arbitrary shell scripts (e.g. `/etc/myShellScript.sh`) from Pharo. OSSubprocess currently works on Mac and Linux.
An important part of OSSubprocess is how it manages the standard streams (`stdin`, `stdout` and `stderr`) and how it provides an API for reading from and writing to them at the language level.
> It was decided together with the Pharo Consortium that, as a first step, we should concentrate on making it work on OSX and Unix. If the tool proves to be good and accepted, we could, as a second step, try to add Windows support. If you need Windows support, you can take a look at the https://github.com/pharo-contributions/OSWinSubprocess project.
## Table of Contents
* [Installation](#installation)
* [Getting Started](#getting-started)
* [API Reference](#api-reference)
* [Child exit status](#child-exit-status)
* [OSSVMProcess and its child watcher](#ossvmprocess-and-its-child-watcher)
* [Accessing child status and interpreting it](#accessing-child-status-and-interpreting-it)
* [Streams management](#streams-management)
* [Handling pipes within Pharo](#handling-pipes-within-pharo)
* [Regular files vs pipes](#regular-files-vs-pipes)
* [Customizing streams creation](#customizing-streams-creation)
* [Stdin example](#stdin-example)
* [Synchronism and how to read streams](#synchronism-and--how-to-read-streams)
* [Synchronism vs asynchronous runs](#synchronism-vs-asynchronous-runs)
* [When to process streams](#when-to-process-streams)
* [Streams processing at the end](#streams-processing-at-the-end)
* [Semaphore-based SIGCHLD waiting](#semaphore-based-sigchld-waiting)
* [Delay-based polling waiting](#delay-based-polling-waiting)
* [Which waiting to use?](#which-waiting-to-use)
* [Processing streams while running](#processing-streams-while-running)
* [Asynchronous runs](#asynchronous-runs)
* [Sending signals to processes](#sending-signals-to-processes)
* [System shutdown](#system-shutdown)
* [Environment variables](#environment-variables)
* [Setting environment variables](#setting-environment-variables)
* [Variables are not expanded](#variables-are-not-expanded)
* [Accessing environment variables](#accessing-environment-variables)
* [Inherit variables from parent](#inherit-variables-from-parent)
* [Shell commands](#shell-commands)
* [Setting working directory](#setting-working-directory)
* [History](#history)
* [Future work](#future-work)
* [License](#license)
* [Acknowledgments and Funding](#acknowledgments-and-funding)
## Installation
OSSubprocess only works in Pharo >= 5.0 with the Spur VM.
> Important: Do not load the OSProcess project into the same image as OSSubprocess, because OSSubprocess will not work alongside it.
### Pharo 5.0 to 8.0
Use the latest compatible version: `v1.3.0`.
```Smalltalk
Metacello new
	baseline: 'OSSubprocess';
	repository: 'github://pharo-contributions/OSSubprocess:v1.3.0/repository';
	load.
```
> Important: If you are installing under Linux, you must use a threaded heartbeat VM (not the itimer one). For Pharo 5.0 and 6.0, search for "cog_linux32x86_pharo.cog.spur_XXXXXXXXXXXX.tar.gz" in the [Pharo static file server](http://files.pharo.org/vm/pharo-spur32/linux/). Since Pharo 7.0, the threaded heartbeat VM is the default installation, so you should not need to download a specific VM explicitly.
### Pharo 9.0 or above
You need to use the `master` branch or a version `> v1.3.0` (API changes).
```Smalltalk
Metacello new
	baseline: 'OSSubprocess';
	repository: 'github://pharo-contributions/OSSubprocess:master/repository';
	load.
```
## Getting Started
OSSubprocess is quite easy to use, but different parts of the API apply depending on what you need. We start with a basic example and show more complicated scenarios later.
```Smalltalk
OSSUnixSubprocess new
	command: '/bin/ls';
	arguments: #('-la' '/Users');
	redirectStdout;
	runAndWaitOnExitDo: [ :process :outString |
		outString inspect
	]
```
Until we add support for Windows, the entry point will always be `OSSUnixSubprocess`, which should work on OSX, Linux and other Unix-like systems. You can read its class comment for details.
A subprocess consists of at least a command/binary/program to be executed (in this example `/bin/ls`) plus an optional array of arguments.
The `#command:` can be either the program name (e.g. `ls`) or the full path to the executable (e.g. `/bin/ls`). If the former, the binary is searched for using the `$PATH` variable and may not be found.
For the `#arguments:` array, each argument must be a separate element. In other words, passing `#('-la /Users')` is not correct, since those are 2 arguments and hence should be 2 elements of the array. It is also incorrect to omit `#arguments:` and specify the command like this: `command: '/bin/ls -la /Users'`. OSSubprocess does *not* do any parsing of the command or arguments. If you want to execute a command given as a full string like `/bin/ls -la /Users`, you may want to take a look at `#bashCommand:`, which relies on the shell to do that job.
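The two ways of passing arguments described above can be put side by side. This is a minimal sketch, not a definitive recipe: it assumes `#bashCommand:` can be combined with `#redirectStdout` and `#runAndWaitOnExitDo:` in the same way as `#command:`, and the variable names `viaArguments` and `viaShell` are only illustrative.

```Smalltalk
| viaArguments viaShell |
"Each argument is its own array element; OSSubprocess passes them through unparsed."
OSSUnixSubprocess new
	command: '/bin/ls';
	arguments: #('-la' '/Users');
	redirectStdout;
	runAndWaitOnExitDo: [ :process :outString | viaArguments := outString ].

"The whole string goes to the shell, which splits it into command and arguments."
OSSUnixSubprocess new
	bashCommand: '/bin/ls -la /Users';
	redirectStdout;
	runAndWaitOnExitDo: [ :process :outString | viaShell := outString ].

"Show whether both runs produced the same listing."
Transcript show: (viaArguments = viaShell) printString; cr.
```

The first form avoids any shell quoting concerns, while the second is convenient when the command is already available as a single string.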
With `#redirectStdout` we are saying that we want to create a stream and map it to the `stdout` of the child process. Since they are not specified, `stderr` and `stdin` will be inherited from the parent process (the Pharo VM process). If you comment out the `#redirectStdout` line and run the example again, you can see how the output of `/bin/ls -la /Users` is printed in the terminal (where you launched your Pharo image).
Finally, we use `#runAndWaitOnExitDo:`, a high-level API method that runs the process, waits until it finishes, reads and then closes the `stdout` stream, and finally invokes the passed closure. The closure receives as arguments the original `OSSUnixSubprocess` instance we created and the contents read from `stdout`. If you inspect `outString` you should see the output of `/bin/ls -la /Users`, which should be exactly the same as if run from the command line.
## API Reference
### Child exit status
When you spawn a process in Unix, the new process becomes a "child" of the "parent" process that launched it. In our case, the parent process is the Pharo VM process and the child process is the one executing the command (in the example above, `/bin/ls`). It is the responsibility of the parent process to collect the exit status of the child once it finishes. If the parent does not do this, the child becomes a "zombie" process. The exit status is an integer that represents how the child finished (whether it succeeded, failed and with which error, received a signal, etc.). Besides avoiding zombies, the exit status is also important for the user to take actions depending on its result.
#### OSSVMProcess and its child watcher
In OSSubprocess, we have a class `OSSVMProcess` with a singleton instance, accessed via the class-side method `vmProcess`, which represents the operating system process in which the Pharo VM is currently running. `OSSVMProcess` can answer some information about the OS process running the VM, such as its PID, its children, etc. More can be added later.
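As a small illustration of the singleton access described above, the snippet below asks the VM process for its PID. It is only a sketch: the selector `#pid` is an assumption based on the statement that the class can answer the running PID, so check the class protocol in your image for the exact name.

```Smalltalk
| vmProcess |
"The class-side accessor answers the singleton that represents the running VM."
vmProcess := OSSVMProcess vmProcess.

"Asking again should answer the very same object."
Transcript show: (OSSVMProcess vmProcess == vmProcess) printString; cr.

"#pid is an assumed selector name for the PID of the VM's OS process."
Transcript show: 'VM pid: ', vmProcess pid printString; cr.
```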
Another important task of this class is to keep track of all the launched child processes (instances of `OSSUnixSubprocess`). Whenever a process is started, it is registered in `OSSVMProcess`, and it is unregistered in certain scenarios (see senders of ``#unregisterChildProcess:``). We keep a list of all our children and occasionally prune those that have already exited.
This class takes care of running what we call the "child watcher", which is basically a way to monitor the status of the children and collect their exit codes when they finish. This also guarantees that no zombie processes are created. As for the implementation details, we use a SIGCHLD handler to capture a child's death. For more details, see the method `#initializeChildWatcher`.
*What is important here is that whether you wait for the process to finish or not (and no matter in which way you wait), the child exit code will be collected and stored in the `exitStatus` instVar of the instance of `OSSUnixSubprocess` representing the exited process, thanks to the `OSSVMProcess` child watcher.*
#### Accessing child status and interpreting it
No matter how you waited for the child process to finish, when it exited, the instVar `exitStatus` should have been set. `exitStatus` is an integer bit field answered by the `wait()` system call that contains information about the exit status of the process. The meaning of the bit field varies according to the cause of the process exit. Besides understanding `#exitStatus`, `OSSUnixSubprocess` also understands `#exitStatusInterpreter` which answers an instance of `OSSUnixProcessExitStatus`. The task of this class is to simply decode this integer bit field and provide meaningful information. Please read its class comment for further details.
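To give an idea of what decoding the bit field involves, here is a worked example in plain Smalltalk arithmetic. It assumes the conventional encoding used on Linux and macOS (terminating signal in the low 7 bits, exit code in the next 8 bits) and a hypothetical raw value of `512`. It is not how `OSSUnixProcessExitStatus` is implemented, and in practice you should rely on that class rather than decoding by hand.

```Smalltalk
| rawStatus exitedNormally exitCode |
"Hypothetical raw value, as wait() would report for a child that called exit(2)."
rawStatus := 512.

"The low 7 bits hold the terminating signal; 0 means the child exited normally."
exitedNormally := (rawStatus bitAnd: 16r7F) = 0.

"The next 8 bits hold the exit code passed to exit()."
exitCode := (rawStatus bitShift: -8) bitAnd: 16rFF.

"Here exitedNormally is true and exitCode is 2."
Transcript show: exitedNormally printString; cr; show: exitCode printString; cr.
```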
In addition to `#exitStatus` and `#exitStatusInterpreter`, `OSSUnixSubprocess` provides testing methods such as `#isSuccess`, `#isComplete`, `#isRunning`, etc.
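The testing methods can be tried on a short-lived child. The sketch below assumes that `#run` starts the process without waiting and that `#waitForExit` blocks until the child watcher has collected the exit status. Both selectors are assumptions here, and `/bin/sleep` is just a convenient command that takes a moment to finish.

```Smalltalk
| process |
"Start the child without waiting for it (assumed selector: #run)."
process := OSSUnixSubprocess new
	command: '/bin/sleep';
	arguments: #('1');
	run.
Transcript show: 'Running after start? ', process isRunning printString; cr.

"Block until the child has exited (assumed selector: #waitForExit)."
process waitForExit.
Transcript show: 'Complete? ', process isComplete printString; cr.
Transcript show: 'Success? ', process isSuccess printString; cr.
```

No streams are redirected here, so there is nothing to close afterwards.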
Let's see a possible usage of the exit status:
```Smalltalk
OSSUnixSubprocess new
	command: '/bin/ls';
	arguments: #('-la' '/nonexistent');
	redirectStdout;
	redirectStderr;
	runAndWaitOnExitDo: [ :process :outString :errString |
		process isSuccess
			ifTrue: [ Transcript show: 'Command exited correctly with output: ', outString. ]
			ifFalse: [
				"OSSUnixProcessExitStatus has a nice #printOn: "
				Transcript show: 'Command exited with error status: ', process exitStatusInterpreter printString; cr.
				Transcript show: 'Stderr contents: ', errString.
			]
	]
```
In this example we execute `/bin/ls`, passing a nonexistent directory as argument.
First, note that we add also a stream for `stderr` via `#redirectStderr`. Sec