Writing a first-in-first-out (FIFO) program in Bash and running processes in parallel

In this script, we use a concept called first-in-first-out (FIFO), also known as a named pipe, to pass a parameter to several “worker” scripts. These workers run in parallel (in other words, mostly independently of the master process), read an input, and execute a command. FIFOs are useful because data flows directly from writers to readers through a kernel buffer instead of being stored on disk, which reduces file system input/output (I/O). They appear on the file system as files, but they carry data in one direction only: a process opens the FIFO either for writing or for reading, and the bytes come out in the order they went in.
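To make the idea concrete before the full recipe, here is a minimal sketch with one writer and one reader. The temporary directory and the names DEMO_FIFO and RECEIVED are illustrative assumptions, not part of the recipe that follows.

```bash
#!/bin/bash
# Minimal sketch: one writer and one reader sharing a named pipe.
# WORK_DIR, DEMO_FIFO and RECEIVED are illustrative names only.
WORK_DIR=$(mktemp -d)
DEMO_FIFO="${WORK_DIR}/demo_fifo"
mkfifo "${DEMO_FIFO}"

# The writer runs in the background; opening the FIFO for writing
# blocks until some process opens it for reading.
printf 'first\nsecond\nthird\n' > "${DEMO_FIFO}" &

# The reader receives the lines in the order they were written.
RECEIVED=$(cat "${DEMO_FIFO}")
wait
echo "${RECEIVED}"

# Clean up the FIFO and its directory.
rm -rf "${WORK_DIR}"
```

Nothing is stored in the FIFO file itself; once the reader has consumed the data, it is gone.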

Prerequisites

To create FIFOs, we use the mkfifo command to create what appears to be a file (everything is a file in Linux). This file has a special property, though, that sets it apart from normal files and from the anonymous pipes (|) we have used previously: because it has a name on the file system, unrelated processes can open it, and it can have multiple readers and writers.

As with any file, you can also set permissions at creation time using the -m flag, such as -m a=rw, or create the FIFO with the mknod command and the p type (this isn’t covered here; note that permissions are changed after creation with chmod, whereas chown changes ownership).
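As a quick check of the -m flag, the following sketch creates a FIFO with a=rw and inspects the result. The path and variable names are assumptions made for illustration.

```bash
#!/bin/bash
# Sketch: create a FIFO with explicit permissions and inspect it.
# WORK_DIR and PERM_FIFO are illustrative names only.
WORK_DIR=$(mktemp -d)
PERM_FIFO="${WORK_DIR}/perm_fifo"

# -m sets the permission bits directly, regardless of the umask.
mkfifo -m a=rw "${PERM_FIFO}"

# The -p test is true only for a named pipe.
if [ -p "${PERM_FIFO}" ]; then
	IS_FIFO=yes
else
	IS_FIFO=no
fi

# The first ten characters of "ls -l" show the type (p) and the mode.
LISTING=$(ls -l "${PERM_FIFO}")
MODE_STRING=${LISTING:0:10}
echo "is a FIFO: ${IS_FIFO}, mode: ${MODE_STRING}"

rm -rf "${WORK_DIR}"
```

The leading p in the mode string is how ls marks a named pipe, in the same position where d marks a directory.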

Write Script:

To start this exercise, we will introduce two pairs of terms: leader and follower, or master and worker. In this case, the master (the central process) creates the workers (or minions). While the recipe is a bit contrived, it should make for an easy go-to template for a simple named pipe (FIFO) pattern. Essentially, a master creates five workers, and each newly created worker echoes what it receives through the named pipe:

To get started, open a new terminal and create two new scripts: master.sh and worker.sh.

In master.sh, add the following contents:

master.sh

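#!/bin/bash
# master.sh: start the workers, then send one message per worker through the FIFO.
FIFO_FILE=/tmp/WORK_QUEUE_FIFO
rm -f "${FIFO_FILE}"              # clear a stale FIFO left by an interrupted run
mkfifo "${FIFO_FILE}" || exit 1
NUM_WORKERS=5
# Launch the workers in the background.
I=0
while [ "$I" -lt "$NUM_WORKERS" ]; do
	bash worker.sh "$I" &
	I=$((I+1))
done
# Send one value per worker; each echo blocks until a worker opens the FIFO for reading.
I=0
while [ "$I" -lt "$NUM_WORKERS" ]; do
	echo "$I" > "${FIFO_FILE}"
	I=$((I+1))
done
# Give the workers time to print before cleaning up.
sleep 5
rm -f "${FIFO_FILE}"
exit 0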

In worker.sh, add the following contents:

worker.sh

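#!/bin/bash
# worker.sh: wait for one message on the FIFO, print it, and exit.
FIFO_FILE=/tmp/WORK_QUEUE_FIFO
BUFFER=""
echo "WORKER started: $1"
while :
do
	# Stop once the master has removed the FIFO; otherwise this loop would spin forever.
	[ -p "${FIFO_FILE}" ] || exit 0
	# The redirection blocks until a writer opens the FIFO.
	read -r BUFFER < "${FIFO_FILE}"
	if [ "${BUFFER}" != "" ]; then
		echo "Worker received: $BUFFER"
		exit 0
	fi
done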

In the terminal, run the following command and observe the output:

$ bash master.sh

How this script works:

The idea of this script is that if you have several repetitive tasks, such as bulk operations, and potentially multiple cores, you can perform those tasks in parallel (in the shell, these are known as background jobs). This recipe creates a single master that spawns several worker scripts into the background, where they await input from the named pipe. Once a worker reads input from the named pipe, it echoes it to the screen and then exits. Eventually, the master exits too, removing the pipe along with it:

In step 1, we open a new terminal and create the two scripts: master.sh and worker.sh.

In step 2, we create the master.sh script. It uses two while loops: the first starts NUM_WORKERS worker scripts, each with its own $I identifier, and the second sends the same number of values to the FIFO before sleeping and exiting.

In step 3, we create the worker.sh script, which echoes an initialization message and then waits until $BUFFER is not empty. Once $BUFFER contains a message, the worker echoes it to the console and the script exits.

In step 4, the console should contain an output similar to the following:

$ bash master.sh
WORKER started: 0
WORKER started: 1
Worker received: 0
Worker received: 1
WORKER started: 4
Worker received: 2
WORKER started: 2
Worker received: 3
WORKER started: 3
Worker received: 4
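The background-job mechanism that the master relies on can also be sketched on its own, without a FIFO. The slow_task function and the result files below are hypothetical stand-ins for real work.

```bash
#!/bin/bash
# Sketch: run several jobs in the background and wait for all of them.
# slow_task and the result_* files are hypothetical stand-ins for real work.
WORK_DIR=$(mktemp -d)

slow_task() {
	sleep 1
	echo "task $1 done" > "${WORK_DIR}/result_$1"
}

START=$(date +%s)
for ID in 0 1 2; do
	slow_task "$ID" &        # "&" sends the job to the background
done
wait                         # block until every background job has exited
ELAPSED=$(( $(date +%s) - START ))

# Count the result files the jobs left behind.
RESULT_FILES=("${WORK_DIR}"/result_*)
RESULT_COUNT=${#RESULT_FILES[@]}
echo "${RESULT_COUNT} tasks finished in about ${ELAPSED} second(s)"

rm -rf "${WORK_DIR}"
```

Because the three jobs sleep at the same time, the elapsed time should be close to one second rather than three, although the exact figure depends on the system load.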

With the two scripts working in tandem over the FIFO, a numeric value is passed between them and the workers perform their work. These values or messages could easily be modified so that the workers execute commands instead!
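One way to let messages trigger commands is to treat each message as a task name and look it up in a fixed list. The sketch below assumes hypothetical task names (greet, kernel) and a hypothetical run_task function; it uses a single reader so the output order is predictable.

```bash
#!/bin/bash
# Sketch: treat each FIFO message as a task name and run a matching command.
# run_task and the task names are hypothetical, chosen for illustration.
WORK_DIR=$(mktemp -d)
TASK_FIFO="${WORK_DIR}/task_fifo"
mkfifo "${TASK_FIFO}"

# Only names listed here are executed; anything else is reported and skipped.
run_task() {
	case "$1" in
		greet)  echo "hello from worker" ;;
		kernel) uname -s ;;
		*)      echo "unknown task: $1" ;;
	esac
}

# The writer sends two task names in the background.
printf 'greet\nbogus\n' > "${TASK_FIFO}" &

# The reader handles one task per line until the writer closes the FIFO.
TASK_OUTPUT=$(
	while read -r TASK; do
		run_task "${TASK}"
	done < "${TASK_FIFO}"
)
wait
echo "${TASK_OUTPUT}"

rm -rf "${WORK_DIR}"
```

Looking the name up in a case statement, rather than passing the raw message to eval, keeps a stray or malicious message from running arbitrary commands.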

Note:

Notice that the output can appear in a different order. This is because process scheduling on Linux is not deterministic: spawning a process or reading from the FIFO might block, or another worker might get there first. Keep in mind that a shared FIFO also gives you no control over which reader receives a given message; if you wish to designate which message goes to which worker, you could create an identifier or messaging scheme.
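One possible scheme, shown here only as a sketch, is to give every worker its own FIFO so that the file name acts as the identifier. The worker function, the fifo_N and out_N file names, and the task-for-N messages are all assumptions made for this example.

```bash
#!/bin/bash
# Sketch: one FIFO per worker, so the master chooses the recipient by name.
# worker, fifo_N, out_N and the message text are illustrative assumptions.
WORK_DIR=$(mktemp -d)
NUM_WORKERS=3

# Each worker reads a single line from its own FIFO and records it.
worker() {
	local ID=$1 MESSAGE
	read -r MESSAGE < "${WORK_DIR}/fifo_${ID}"
	echo "worker ${ID} received: ${MESSAGE}" > "${WORK_DIR}/out_${ID}"
}

# Create the FIFOs and start the workers in the background.
I=0
while [ "$I" -lt "$NUM_WORKERS" ]; do
	mkfifo "${WORK_DIR}/fifo_${I}"
	worker "$I" &
	I=$((I+1))
done

# Address each message to one specific worker.
I=0
while [ "$I" -lt "$NUM_WORKERS" ]; do
	echo "task-for-${I}" > "${WORK_DIR}/fifo_${I}"
	I=$((I+1))
done

wait
ROUTED=$(cat "${WORK_DIR}/out_0" "${WORK_DIR}/out_1" "${WORK_DIR}/out_2")
echo "${ROUTED}"

rm -rf "${WORK_DIR}"
```

The workers may still run in any order, but each message now reaches exactly the worker it was meant for.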
