MPI: Concurrent File I/O by Multiple Processes
In this tutorial, we'll explore an MPI (Message Passing Interface) program that uses mpi4py
to demonstrate how multiple processes can write to a shared file and then read it back.
Code
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Define the file name
filename = "output.txt"

# Open the file in append mode and write a message from each processor
with open(filename, "a") as file:
    file.write(f"Hello from processor {rank} of {size}\n")

# Synchronize all processors before reading the file
comm.Barrier()

# Now let only one processor (e.g., rank 0) read and display the file content
if rank == 0:
    # Open the file in read mode
    with open(filename, "r") as file:
        # Read and display the file content
        print("File contents:")
        print(file.read())
Code Explanation
from mpi4py import MPI
Imports the MPI module from mpi4py, which provides Python bindings for MPI.
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
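Gets the default communicator, MPI.COMM_WORLD, which contains every process started by the launcher, and stores it in comm. (mpi4py initializes MPI itself when the MPI module is imported.) rank is the unique identifier of the current process within the communicator, and size is the total number of processes.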
# Define the file name
filename = "output.txt"
Sets the name of the file (output.txt) that will be used for writing and reading.
# Open the file in append mode and write a message from each processor
with open(filename, "a") as file:
    file.write(f"Hello from processor {rank} of {size}\n")
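Opens output.txt in append mode ("a"). Each MPI process writes a message to the file containing its rank (rank) and the total number of processes (size). The processes write independently of one another, so the lines are not guaranteed to appear in rank order. Because the file is opened in append mode, running the program again adds new lines after the old ones.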
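The append-mode write relies on ordinary Python file objects, so the layout of the file depends on timing. If a fixed, rank-ordered layout is needed, one option is MPI's own file interface (MPI-IO). The following is a minimal sketch, assuming mpi4py is built against an MPI library with MPI-IO support. The function names encode_line, byte_offset, and write_shared and the file name output_mpiio.txt are hypothetical and not part of the original program.

```python
def encode_line(rank, size):
    """Return one rank's message as bytes; MPI-IO writes raw bytes, not str."""
    return f"Hello from process {rank} of {size}\n".encode("utf-8")


def byte_offset(lengths, rank):
    """Offset of a rank's record: combined length of all lower-ranked records."""
    return sum(lengths[:rank])


def write_shared(filename="output_mpiio.txt"):
    """Write one line per rank, in rank order, with a collective MPI-IO call."""
    # Imported here so the two helpers above work without an MPI runtime.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    data = encode_line(rank, size)

    # Every rank learns every record length, then computes its own offset.
    lengths = comm.allgather(len(data))
    offset = byte_offset(lengths, rank)

    # Open, Set_size, Write_at_all, and Close are collective:
    # every rank in comm must call them.
    fh = MPI.File.Open(comm, filename, MPI.MODE_WRONLY | MPI.MODE_CREATE)
    fh.Set_size(0)                 # drop bytes left over from an earlier run
    fh.Write_at_all(offset, data)  # each rank writes at its own offset
    fh.Close()

# To try it, add a final line `write_shared()` and launch the script with mpiexec.
```

Because each rank writes at an explicit offset, the result does not depend on which process reaches the file first.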
# Synchronize all processors before reading the file
comm.Barrier()
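Blocks each MPI process until every process in the communicator has reached this point. Because each process leaves its with block (which closes the file) before calling Barrier(), all writes to output.txt have been issued before any process attempts to read from it. On some network file systems, data written on one node may still take a moment to become visible on another.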
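The barrier can also separate more than two phases. The sketch below is one possible arrangement, not the original program: rank 0 first empties the file so that lines from earlier runs do not pile up, then every rank appends, then rank 0 reads. The function name write_then_read is hypothetical, and comm is assumed to be an mpi4py communicator such as MPI.COMM_WORLD.

```python
def write_then_read(comm, filename="output.txt"):
    """Truncate, append from every rank, then read on rank 0, with barriers between."""
    rank, size = comm.Get_rank(), comm.Get_size()

    # Phase 1: rank 0 empties the file so old lines do not accumulate.
    if rank == 0:
        open(filename, "w").close()
    comm.Barrier()  # nobody appends until the file has been truncated

    # Phase 2: every rank appends one line; leaving the with block closes the file.
    with open(filename, "a") as f:
        f.write(f"Hello from process {rank} of {size}\n")
    comm.Barrier()  # nobody reads until every rank has left its with block

    # Phase 3: only rank 0 reads; the line order depends on which writes landed first.
    if rank == 0:
        with open(filename) as f:
            return f.read().splitlines()
    return None

# Under mpiexec: from mpi4py import MPI; print(write_then_read(MPI.COMM_WORLD))
```

With this arrangement, rank 0 should see exactly one line per process on a local file system, regardless of how many times the program has been run before.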
# Now let only one processor (e.g., rank 0) read and display the file content
if rank == 0:
    # Open the file in read mode
    with open(filename, "r") as file:
        # Read and display the file content
        print("File contents:")
        print(file.read())
Only the MPI process with rank == 0 executes this block:
- Opens output.txt in read mode ("r").
- Prints a header indicating file contents.
- Reads and displays the entire content of output.txt.
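If the other processes also need the file contents, rank 0 can share what it read instead of every process opening the file. This is a minimal sketch using the communicator's bcast method. The function name read_and_share is hypothetical, and comm is assumed to be an mpi4py communicator such as MPI.COMM_WORLD.

```python
def read_and_share(comm, filename="output.txt"):
    """Read the file on rank 0 only and return its text on every rank."""
    text = None
    if comm.Get_rank() == 0:
        with open(filename, "r") as f:
            text = f.read()
    # bcast is collective: every rank must call it, including rank 0.
    return comm.bcast(text, root=0)

# Under mpiexec: from mpi4py import MPI; text = read_and_share(MPI.COMM_WORLD)
```

Reading once and broadcasting keeps the number of file opens at one, which can matter when many processes share the same file system.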
Steps to Execute the Program
- Save the Script: Save the above code to a file named mpi_file_io.py on your system.
- Run the Program: Open a terminal or command prompt and run the MPI program using multiple processes:
mpiexec -n 4 python mpi_file_io.py
Replace 4 with the number of MPI processes you want to run.
You have learned how to use MPI in Python (mpi4py) to coordinate file I/O among multiple processes: every process appends to a shared file, a barrier separates the write phase from the read phase, and a single process reads the result. The same write, synchronize, read pattern applies whenever parallel processes need to share data through a file. Experiment with different numbers of MPI processes, and note that the order of the lines in the file can change from run to run.