Inter-Process Communication Using Pipes and Signals in C++ on a Unix-Based Machine
Inspired by Lovepreet Singh’s insightful article on creating a Unix-style process pipeline in C++, I embarked on this project to explore inter-process communication (IPC) in Unix-like operating systems. This article delves into the use of pipes and signals for IPC, demonstrating how to set up a communication channel between a parent and child process. By leveraging these powerful IPC mechanisms, we can achieve seamless data transfer and synchronization between processes, a fundamental aspect of system programming.
Understanding Process Communication
When we execute a program, it becomes a process. Typically, processes run in isolation, each in its own separate environment with allocated resources. However, there are situations where processes need to communicate with one another. How can this be accomplished?
One effective method is through the use of pipes. For instance, imagine you have used a command in a Unix-like terminal where the output of one command serves as the input to another. A common example is ls | sort, where the output of ls (which lists files in a directory) is passed directly into sort to arrange the file names in order. The | symbol, known as a pipe, facilitates this seamless flow of data between the two commands.
There are several other mechanisms for inter-process communication, each suitable for different scenarios:
- Message Queues: These allow processes to send and receive messages in a controlled manner, providing a way to manage complex communication patterns.
- Shared Memory: This is the fastest form of IPC as it allows multiple processes to access the same memory space. It is particularly useful when large amounts of data need to be shared.
- Semaphores: These are used to control access to shared resources by multiple processes, ensuring synchronization and preventing race conditions.
- Sockets: These provide a way for processes to communicate over a network, enabling IPC between processes on different machines.
- FIFOs (Named Pipes): Similar to pipes, but with the added ability to communicate between unrelated processes.
This article will guide you through implementing a simple yet effective pipeline using pipes and signals in C++. The implementation of pipes, while straightforward, provides a fascinating look into how processes can interact and share data in a Unix-like environment.
What are Pipes and Signals?
Pipes are one of the oldest and most widely used methods for inter-process communication (IPC) in Unix-based systems. A pipe creates a communication channel between two processes, allowing data to flow from one process to another in a unidirectional manner.
How Pipes Work
- Creation: A pipe is created using the pipe() system call, which provides a pair of file descriptors: one for reading and one for writing.
- Data Flow: Data written to the write end of the pipe can be read from the read end. This allows one process to send data to another.
- Lifecycle: A pipe exists only as long as some process holds one of its file descriptors open. Once every descriptor for both ends is closed, the pipe ceases to exist.
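To make these three points concrete, here is a minimal sketch that creates a pipe, writes into one end, and reads the data back from the other, all inside a single process. The helper name pipe_round_trip is hypothetical and not part of the program discussed later. It assumes the message is small enough to fit in the pipe's kernel buffer, since nothing reads while the write is in progress.

```cpp
#include <string>
#include <unistd.h>

// Hypothetical helper: pushes `message` through a fresh pipe and returns
// whatever comes out of the read end. Returns an empty string on error.
// Assumption: `message` is small enough to fit in the pipe's kernel buffer.
std::string pipe_round_trip(const std::string& message) {
    int fd[2];
    if (pipe(fd) == -1) {            // fd[0] = read end, fd[1] = write end
        return "";
    }

    if (write(fd[1], message.data(), message.size()) == -1) {
        close(fd[0]);
        close(fd[1]);
        return "";
    }
    close(fd[1]);                    // no writers left, so read() will see EOF

    std::string result;
    char buffer[256];
    ssize_t n;
    while ((n = read(fd[0], buffer, sizeof buffer)) > 0) {
        result.append(buffer, static_cast<std::size_t>(n));
    }
    close(fd[0]);                    // last descriptor closed: the pipe is gone
    return result;
}
```

Calling pipe_round_trip("hello") from main() should hand back the same string. Note that the write end is closed before reading: read() only reports end-of-file once no write end remains open.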
Advantages of Pipes
- Simplicity: Easy to set up and use for basic IPC tasks.
- Efficiency: Suitable for quick, one-way data transfer between related processes.
Limitations of Pipes
- Unidirectional: Data flows only in one direction.
- Limited to Related Processes: Typically used between parent and child processes.
- No Persistence: Pipes do not persist beyond the life of the processes that created them.
Signals are a way for processes to communicate that an event has happened. They work like hardware interrupts and can be sent by one process to another or by the operating system to a process.
How Signals Work
- Sending Signals: Signals can be sent using the kill() system call, which, despite its name, can send any signal, not just one that terminates a process.
- Handling Signals: A process can set up a signal handler using the signal() function. This handler function will be called when the process receives the specified signal.
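The two steps above can be sketched in a few lines: install a handler with signal(), then use kill() to send SIGUSR1 to the current process. The names last_signal, record_signal and send_signal_to_self are made up for this illustration, and the sketch assumes a single-threaded process, where the signal is delivered before kill() returns.

```cpp
#include <signal.h>
#include <unistd.h>

// Set from the handler. volatile sig_atomic_t is the type that is safe to
// write from a signal handler.
volatile sig_atomic_t last_signal = 0;

// Hypothetical handler: only records which signal arrived.
void record_signal(int signum) {
    last_signal = signum;
}

// Installs the handler, sends SIGUSR1 to this process, and returns the
// signal number the handler recorded (or -1 on error).
// Assumption: the process is single-threaded.
int send_signal_to_self() {
    last_signal = 0;
    if (signal(SIGUSR1, record_signal) == SIG_ERR) {
        return -1;
    }
    if (kill(getpid(), SIGUSR1) == -1) {   // kill() sends, it does not "kill"
        return -1;
    }
    return last_signal;
}
```

The handler does nothing except store a value. Keeping handlers this small is a common design choice, because only a limited set of functions is safe to call while a signal is being handled.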
Common Signals:
- SIGINT: Interrupt signal (typically sent when the user presses Ctrl+C).
- SIGTERM: Termination signal (request to terminate the process).
- SIGKILL: Kill signal (forces the process to terminate immediately).
- SIGUSR1 and SIGUSR2: User-defined signals for custom purposes.
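One practical difference between these signals is that SIGKILL cannot be caught, which is why it always terminates the process. The small sketch below probes this by trying to install a handler; can_install_handler and ignore_handler are hypothetical names used only for this illustration.

```cpp
#include <signal.h>

// Hypothetical do-nothing handler used only for probing.
void ignore_handler(int) {}

// Returns true if a handler can be installed for `signum`.
// The previous disposition is restored before returning.
bool can_install_handler(int signum) {
    void (*previous)(int) = signal(signum, ignore_handler);
    if (previous == SIG_ERR) {
        return false;                // the system refused the request
    }
    signal(signum, previous);        // put things back as they were
    return true;
}
```

On a typical Unix-like system, can_install_handler(SIGUSR1) and can_install_handler(SIGTERM) return true, while can_install_handler(SIGKILL) returns false.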
Advantages of Signals
- Asynchronous: Can notify processes of events at any time.
- Simple Notification Mechanism: Useful for simple event notifications.
Limitations of Signals
- Limited Data: A signal carries very little information, usually just its signal number.
- Overhead: Handling signals requires some overhead to set up handlers and manage state.
Code Implementation
In this section, we will walk through the implementation of a C++ program that uses pipes and signals for inter-process communication (IPC) on a Unix-based machine. The program sets up a pipeline where the output of one process is fed into another process, and a signal is used to coordinate the communication between the parent and child processes.
#include <cstdio>
#include <iostream>
#include <unistd.h>
#include <csignal>
#include <sys/wait.h>

// Signal handler function.
// Note: printf is not async-signal-safe; it is used here only to keep the demo short.
void handle_signal(int signal) {
    printf("Received signal: %d\n", signal);
}

// Function to set up the pipeline between two processes
void pipeline(const char* process1, const char* inputFile, const char* process2, const char* searchString) {
    int fd[2];
    if (pipe(fd) == -1) { // fd[0] is the read end, fd[1] is the write end
        perror("pipe");
        return;
    }
    pid_t pid = fork();
    if (pid == -1) { // fork failed, so there is no child process
        perror("fork");
        return;
    }
    if (pid > 0) { // Parent process
        close(fd[0]); // Close the read end of the pipe
        dup2(fd[1], STDOUT_FILENO); // Redirect standard output to the write end of the pipe
        close(fd[1]); // Close the original write end of the pipe
        execlp(process1, process1, inputFile, nullptr);
        // execlp returns only if it failed
        std::cerr << "Failed to execute " << process1 << std::endl;
        wait(NULL); // Reached only on failure: collect the child so it does not become a zombie
    } else { // Child process
        signal(SIGUSR1, handle_signal); // Set up signal handler
        close(fd[1]); // Close the write end of the pipe
        dup2(fd[0], STDIN_FILENO); // Redirect standard input to the read end of the pipe
        close(fd[0]); // Close the original read end of the pipe
        pause(); // Wait for the signal
        execlp(process2, process2, searchString, nullptr);
        std::cerr << "Failed to execute " << process2 << std::endl;
        _exit(1); // Do not fall back into main() inside the child
    }
}

int main() {
    pipeline("/usr/bin/cat", "main.cpp", "/usr/bin/grep", "hello");
    return 0;
}
Code Walkthrough
Including Headers:
- The code includes the necessary headers for input/output operations (iostream), Unix system calls (unistd.h), signal handling (csignal), and process control (sys/wait.h).
Signal Handler Function:
- handle_signal(int signal): This function handles signals received by the process. When a signal is received, it prints the signal number.
Pipeline Function:
- pipeline(const char* process1, const char* inputFile, const char* process2, const char* searchString): This function sets up the pipeline between two processes using pipes and signals.
Creating the Pipe:
- pipe(fd): Creates a pipe and stores the file descriptors in the fd array. fd[0] is for reading, and fd[1] is for writing.
- File descriptors are small integers that identify an open file within a process in a Unix-like operating system. When a pipe is created using pipe(fd), two file descriptors are generated:
  - fd[0]: The read end of the pipe. This end is used by the process that wants to read data from the pipe.
  - fd[1]: The write end of the pipe. This end is used by the process that wants to write data into the pipe.
- In Unix-like operating systems such as Linux and macOS, there are three standard streams: the input stream (stdin), the output stream (stdout), and the error stream (stderr). By default, the terminal is connected to these streams, so any data sent to stdout is printed on the terminal.
- Standard streams are represented by file descriptors:
  - 0 for stdin (standard input)
  - 1 for stdout (standard output)
  - 2 for stderr (standard error)
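A short sketch can make the descriptor numbers tangible. It writes to a stream by its descriptor using the raw write() call, and checks that a new pipe receives descriptors other than 0, 1 and 2. The helpers write_text and pipe_uses_new_descriptors are hypothetical, and the second one assumes the three standard streams are open, as they normally are when a program is started from a terminal.

```cpp
#include <cstring>
#include <unistd.h>

// Hypothetical helper: writes `text` to descriptor `fd` with write().
// Returns the number of bytes written, or -1 on error.
ssize_t write_text(int fd, const char* text) {
    return write(fd, text, std::strlen(text));
}

// Hypothetical helper: returns true if a new pipe gets two distinct
// descriptors above the standard streams.
// Assumption: descriptors 0, 1 and 2 are already open.
bool pipe_uses_new_descriptors() {
    int fd[2];
    if (pipe(fd) == -1) {
        return false;
    }
    bool distinct = fd[0] > STDERR_FILENO &&
                    fd[1] > STDERR_FILENO &&
                    fd[0] != fd[1];
    close(fd[0]);
    close(fd[1]);
    return distinct;
}
```

For example, write_text(STDOUT_FILENO, "to the terminal\n") prints to the terminal, while write_text(STDERR_FILENO, "an error\n") goes to the error stream. Redirection with dup2, covered in the walkthrough, works by changing what these numbers refer to.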
This is a program that simulates the behavior of the command cat main.cpp | grep hello.
It finds every line in main.cpp that contains the string hello and prints the whole line to the terminal.
Forking the Process:
- pid_t pid = fork(): Forks a child process. The pid will be greater than 0 in the parent process and 0 in the child process. If fork() fails, it returns -1 and no child is created.
- Forking creates a separate child process that will execute a different part of the code, enabling parallel execution and communication between the two processes.
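The three possible return values of fork() can be seen in a small sketch. The function fork_and_collect is a hypothetical helper, not part of the article's program: the child exits right away with a chosen code, and the parent collects that code with waitpid().

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks a child that exits with `code` and returns
// the exit code observed by the parent, or -1 on error.
int fork_and_collect(int code) {
    pid_t pid = fork();
    if (pid == -1) {
        return -1;                   // fork failed: no child exists
    }
    if (pid == 0) {
        _exit(code);                 // child: fork() returned 0
    }

    // Parent: fork() returned the child's pid (greater than 0).
    int status = 0;
    if (waitpid(pid, &status, 0) == -1) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child uses _exit() rather than returning, so it leaves immediately instead of continuing to run the parent's code path.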
Parent Process:
- The parent process closes the read end of the pipe, redirects standard output to the write end of the pipe, and then executes the first process using execlp.
Child Process:
- The child process sets up a signal handler for SIGUSR1, closes the write end of the pipe, redirects standard input to the read end of the pipe, and waits for a signal using pause(). Once the signal is received, it executes the second process using execlp.
The execlp calls are used to execute the external commands, and the signal function sets up a handler to manage signals sent to the processes.
Detailed Analysis of Child and Parent Processes
To understand how the parent and child processes interact in our program, let’s break down their roles and actions in the context of using pipes and signals for inter-process communication (IPC).
Parent Process
Pipe Creation:
- The parent process begins by creating a pipe using the pipe() system call. This call creates a unidirectional data channel with two file descriptors: fd[0] for reading and fd[1] for writing.
Forking the Process:
- The fork() system call is used to create a new process. This call clones the calling process, resulting in a parent and a child process. The pid returned by fork() is used to differentiate between the parent and the child process.
Parent Process Execution:
- The parent process executes the following steps:
- Close the Read End of the Pipe:
  - The parent closes the read end of the pipe (fd[0]) because it only needs to write data to the pipe.
  - Closing the unused read end prevents potential read operations from this end, ensuring that only the intended process (child) reads from the pipe. It also helps to avoid resource leaks.
- Redirect Standard Output to the Write End of the Pipe:
  - The dup2(fd[1], STDOUT_FILENO) call duplicates the write end of the pipe (fd[1]) onto the standard output file descriptor (STDOUT_FILENO). This redirection means that any data written to standard output will go into the pipe, i.e., stdout now refers to the same pipe end as fd[1].
  - Redirecting standard output to the pipe ensures that the output of the cat command will be sent through the pipe instead of being printed to the terminal.
- Close the Original Write End of the Pipe:
  - After duplicating the file descriptor, the original write end of the pipe (fd[1]) is closed. Standard output still refers to the pipe, so the extra descriptor is no longer needed.
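The close, dup2, close sequence can be tried in isolation with the sketch below. It is a hypothetical helper, capture_child_stdout, and the roles are flipped compared with the article's program: here the child redirects its standard output and the parent reads, so that the calling process keeps its own standard output.

```cpp
#include <string>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs a child whose standard output is redirected
// into a pipe, and returns everything the child wrote there.
std::string capture_child_stdout(const std::string& message) {
    int fd[2];
    if (pipe(fd) == -1) {
        return "";
    }
    pid_t pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return "";
    }
    if (pid == 0) {                          // child: the writer
        close(fd[0]);                        // it never reads from the pipe
        dup2(fd[1], STDOUT_FILENO);          // stdout now refers to the pipe
        close(fd[1]);                        // the original descriptor is redundant
        ssize_t n = write(STDOUT_FILENO, message.data(), message.size());
        _exit(n == -1 ? 1 : 0);
    }

    close(fd[1]);                            // parent: the reader
    std::string captured;
    char buffer[256];
    ssize_t n;
    while ((n = read(fd[0], buffer, sizeof buffer)) > 0) {
        captured.append(buffer, static_cast<std::size_t>(n));
    }
    close(fd[0]);
    waitpid(pid, nullptr, 0);                // collect the child
    return captured;
}
```

The child writes to STDOUT_FILENO as if it were printing to the terminal, yet the text arrives in the parent through the pipe. This is the same effect the article's parent relies on before it runs cat.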
Execute the First Command:
- The parent process replaces its memory space with the cat command using execlp. This command reads the contents of main.cpp and writes it to its standard output, which is now redirected to the pipe.
- Why? Executing the cat command with execlp allows the parent process to produce the data that will be sent through the pipe to the child process.
Wait for the Child Process:
- The wait(NULL) call comes after execlp. Because a successful execlp never returns, this line runs only if execlp fails; when cat starts correctly, the parent has already become cat and never waits for the child.
- Waiting for a child normally ensures that the parent does not terminate or continue to other tasks until the child has finished. It also prevents the child from becoming a zombie process by allowing the parent to collect its termination status and release system resources.
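If we want the parent to really wait, a common alternative layout is to fork two children, one per command, and keep the parent as a supervisor. The sketch below is a variant under that assumption, not the program from this article: run_pipeline is a hypothetical name, it leaves out the signal step, and it returns the exit status of the second command.

```cpp
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical variant: runs `producer file | consumer pattern` in two
// children and returns the consumer's exit status, or -1 on error.
int run_pipeline(const char* producer, const char* file,
                 const char* consumer, const char* pattern) {
    int fd[2];
    if (pipe(fd) == -1) {
        return -1;
    }

    pid_t first = fork();
    if (first == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (first == 0) {                        // first child: writes to the pipe
        close(fd[0]);
        dup2(fd[1], STDOUT_FILENO);
        close(fd[1]);
        execlp(producer, producer, file, nullptr);
        _exit(127);                          // reached only if execlp failed
    }

    pid_t second = fork();
    if (second == -1) {
        close(fd[0]);
        close(fd[1]);
        waitpid(first, nullptr, 0);
        return -1;
    }
    if (second == 0) {                       // second child: reads from the pipe
        close(fd[1]);
        dup2(fd[0], STDIN_FILENO);
        close(fd[0]);
        execlp(consumer, consumer, pattern, nullptr);
        _exit(127);
    }

    // The parent must close both ends, otherwise the consumer never sees EOF.
    close(fd[0]);
    close(fd[1]);

    int status = 0;
    waitpid(first, nullptr, 0);              // collect both children: no zombies
    if (waitpid(second, &status, 0) == -1) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call such as run_pipeline("cat", "main.cpp", "grep", "hello") behaves like cat main.cpp | grep hello, and the return value follows grep's convention of 0 when a match is found and 1 when none is.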
Child Process
- The child process sets up a signal handler to catch the SIGUSR1 signal. This handler will execute the handle_signal function when the signal is received.
- Setting up a signal handler allows the child process to respond to signals sent by the parent or other processes. In this case, it will pause and wait for a SIGUSR1 (user-defined) signal before continuing execution.
Child Process Execution:
- Close the Write End of the Pipe:
  - The child closes the write end of the pipe (fd[1]) because it only needs to read data from the pipe.
  - Closing the unused write end prevents potential write operations from this end, ensuring that only the intended process (parent) writes to the pipe. It also helps to avoid resource leaks, and it matters for correctness: the reader sees end-of-file only once every write end of the pipe has been closed.
- Redirect Standard Input to the Read End of the Pipe:
  - The dup2(fd[0], STDIN_FILENO) call duplicates the read end of the pipe (fd[0]) onto the standard input file descriptor (STDIN_FILENO). This redirection means that any data read from standard input will come from the pipe.
  - Redirecting standard input to the pipe ensures that the input of the grep command will come from the pipe instead of the terminal.
- Close the Original Read End of the Pipe:
  - After duplicating the file descriptor, the child process closes the original read end of the pipe (fd[0]).
  - Closing the original file descriptor after duplicating it avoids accidental reads and potential resource leaks.
Pause and Wait for Signal:
- The child process calls pause() to wait for a signal. This effectively puts the child process to sleep until it receives a signal.
- Pausing the process makes it wait for a synchronization signal before proceeding. In this program nothing in the code sends that signal; it is sent by hand from a second terminal with the kill command, so the child does not start processing data until then.
Execute the Second Command:
- After receiving the SIGUSR1 signal and handling it, the child process replaces its memory space with the grep command using execlp. This command reads from its standard input (now redirected to the pipe) and searches for the string hello.
- Executing the grep command with execlp allows the child process to consume the data produced by the parent process and perform the intended search operation.
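One caveat with pause() is timing: if the signal arrives before pause() is called, the handler runs early and pause() then sleeps until some other signal shows up. A common way to close that gap is to block the signal first and wait with sigsuspend(). The sketch below illustrates that idea under its own assumptions; got_usr1, on_usr1 and signal_child_and_wait are hypothetical names, and here the parent sends the signal from code instead of a second terminal.

```cpp
#include <signal.h>
#include <sys/wait.h>
#include <unistd.h>

volatile sig_atomic_t got_usr1 = 0;

// Hypothetical handler: only records that SIGUSR1 arrived.
void on_usr1(int) { got_usr1 = 1; }

// Forks a child that waits for SIGUSR1, sends the signal from the parent,
// and returns the child's exit code (0 if it saw the signal), or -1 on error.
int signal_child_and_wait() {
    // Block SIGUSR1 before forking; the child inherits the blocked mask,
    // so an early signal stays pending instead of being lost.
    sigset_t block, previous;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &previous) == -1) {
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        sigprocmask(SIG_SETMASK, &previous, nullptr);
        return -1;
    }
    if (pid == 0) {                          // child
        struct sigaction action {};
        action.sa_handler = on_usr1;
        sigemptyset(&action.sa_mask);
        sigaction(SIGUSR1, &action, nullptr);

        // sigsuspend() unblocks SIGUSR1 and sleeps in one atomic step.
        sigset_t wait_mask = previous;
        sigdelset(&wait_mask, SIGUSR1);
        while (!got_usr1) {
            sigsuspend(&wait_mask);
        }
        _exit(0);
    }

    // Parent: restore its own mask, then wake the child.
    sigprocmask(SIG_SETMASK, &previous, nullptr);
    int status = 0;
    if (kill(pid, SIGUSR1) == -1 || waitpid(pid, &status, 0) == -1) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

It does not matter whether the parent's kill() runs before or after the child reaches sigsuspend(): a signal sent early stays pending until the child unblocks it. In the article's program, the exec of grep would take the place of _exit(0).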
Summary:
Parent Process:
- Closes the read end of the pipe.
- Redirects standard output to the write end of the pipe.
- Executes the cat command to write data to the pipe.
- Waits for the child process to finish, but only if execlp fails, since a successful execlp never returns.
Child Process:
- Sets up the signal handler.
- Closes the write end of the pipe.
- Redirects standard input to the read end of the pipe.
- Waits for a signal to proceed.
- Executes the grep command to read data from the pipe and search for a string.
In Terminal 1, I compiled the program discussed above and created a main.cpp in the same directory as pipeline_signal.cpp:
#include <iostream>

int main() {
    std::cout << "hello world" << std::endl;
    return 0;
}
In Terminal 2, I sent the signal using the command
kill -SIGUSR1 <pid>
You can find the pid of the waiting child process by running the ps command (for example, ps -e lists processes from every terminal).
Here is a short explanation of the code in very simplified terms:
Imagine you and your friend are doing a project where you need to write a report and your friend needs to check it for errors. You write your part and save it in a shared folder (creating a pipe). Your friend waits until you text them saying you’re done (sending a signal). Once they get your message, they open the file from the shared folder and start their work (reading from the pipe). This way, you both work together efficiently, making sure each step is completed in the right order.
Thanks for reading! Give a clap if you found this helpful!