MPI with Python: Calculating Squares of Array Elements Using Multiple Processors

Afzal Badshah, PhD
Apr 3, 2024

In this lab tutorial, we will explore how to use multiple processors to compute the squares of elements in an array concurrently with the MPI (Message Passing Interface) standard in Python, specifically using the mpi4py module. MPI is a widely used standard for parallel computing on distributed-memory systems. We'll create a master-worker model in which the master process distributes tasks to worker processes, each responsible for computing the square of a subset of the array elements. A detailed tutorial on MPI with Python can be visited here.

Code

from mpi4py import MPI

comm = MPI.COMM_WORLD   # MPI communicator
rank = comm.Get_rank()  # Rank of the current process (0 for master, 1+ for workers)
size = comm.Get_size()  # Total number of processes

# Define a simple task for each process (replace with your actual workload)
def calculate_square(number):
    return number * number

if rank == 0:  # Master process
    data = [2, 4]  # Sample data; assumes the program runs with len(data) + 1 processes
    for i in range(1, size):  # Send data to worker processes (skip rank 0 - master)
        comm.send(data[i - 1], dest=i)
    # Receive results from workers
    results = []
    for i in range(1, size):
        result = comm.recv(source=i)
        results.append(result)
    print("Master received results:", results)
else:  # Worker processes (rank 1+)
    data = comm.recv(source=0)  # Receive data from master
    result = calculate_square(data)
    comm.send(result, dest=0)  # Send result back to master

MPI.Finalize()  # Finalize MPI environment

Code Explanation

Save the program (for example as squares.py) and launch it with mpiexec -n 3 python squares.py, so that one master and two workers match the two data elements. Let’s go through each part of the code and understand its functionality:

from mpi4py import MPI

This line imports the necessary MPI functionality from the mpi4py library.

comm = MPI.COMM_WORLD  # Initialize MPI communicator

Here, we obtain the predefined world communicator MPI.COMM_WORLD, which will be used for communication between processes. Note that mpi4py initializes the MPI environment automatically when the module is imported.

rank = comm.Get_rank()  # Get the rank of the current process (0 for master, 1+ for workers)

This line retrieves the rank of the current process within the communicator. The rank identifies each process uniquely, with rank 0 typically representing the master process.

size = comm.Get_size()  # Get the total number of processes

This line obtains the total number of processes within the communicator, as set by the -n option of mpiexec.

def calculate_square(number):
    return number * number

This function calculate_square() takes a number as input and returns its square. This is the task that each worker process will perform.
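Called directly, outside any MPI context, the function behaves as you would expect:

```python
def calculate_square(number):
    return number * number

print(calculate_square(4))  # -> 16
```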

if rank == 0:  # Master process
    data = [2, 4]  # Sample data (modify with your data)
    for i in range(1, size):  # Send data to worker processes (skip rank 0 - master)
        comm.send(data[i - 1], dest=i)

In the master process (rank 0), we define the sample data [2, 4]. Then we iterate over the worker processes (ranks 1 to size-1) and send data[i - 1] to worker i, so each worker receives exactly one element. Note that this indexing assumes the program is launched with exactly len(data) + 1 processes; with more workers than data elements, data[i - 1] would raise an IndexError.
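The rank-to-element mapping can be checked without MPI at all. This is a plain-Python illustration of the indexing in the send loop above, using the tutorial's sample values (the assignments dictionary is purely for illustration, not part of mpi4py):

```python
# Illustration only: reproduce the master's send-loop indexing without MPI.
data = [2, 4]            # sample data from the tutorial
size = len(data) + 1     # one master plus one worker per element

# Map each worker rank to the element it would receive via comm.send
assignments = {i: data[i - 1] for i in range(1, size)}
print(assignments)  # -> {1: 2, 2: 4}: rank 1 gets 2, rank 2 gets 4
```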

    results = []
    for i in range(1, size):
        result = comm.recv(source=i)
        results.append(result)
    print("Master received results:", results)

After sending data to the worker processes, the master process waits to receive results from each worker. It iterates over all worker processes, receives their results, and appends them to a list called results. Because each recv names its source rank explicitly, the results list ends up ordered by worker rank. Finally, the master prints the received results.
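To see the whole round trip without launching MPI, here is a hedged, plain-Python simulation in which two dictionaries of "mailboxes" stand in for comm.send and comm.recv. The mailbox scheme is an illustration of the message flow only, not how mpi4py works internally:

```python
# Plain-Python stand-in for the master-worker exchange above.
# Mailboxes replace MPI messages: to_worker[i] holds what the master
# "sent" to rank i, and to_master[i] holds rank i's reply.

def calculate_square(number):
    return number * number

data = [2, 4]
size = len(data) + 1              # master (rank 0) plus one worker per element
to_worker = {}                    # messages from master to workers
to_master = {}                    # messages from workers back to master

# Master "sends" one element to each worker rank
for i in range(1, size):
    to_worker[i] = data[i - 1]

# Each worker "receives", computes its square, and "sends" it back
for i in range(1, size):
    to_master[i] = calculate_square(to_worker[i])

# Master "receives" the replies in rank order
results = [to_master[i] for i in range(1, size)]
print("Master received results:", results)  # -> [4, 16]
```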

else:  # Worker processes (rank 1+)
    data = comm.recv(source=0)  # Receive data from master
    result = calculate_square(data)
    comm.send(result, dest=0)  # Send result back to master

In worker processes (ranks 1+), each process receives data from the master process, computes the square using the calculate_square() function, and sends the result back to the master process.

MPI.Finalize()  # Finalize MPI environment

This line finalizes the MPI environment, releasing the resources associated with it. With mpi4py the call is optional: the library registers a finalizer that runs automatically when the interpreter exits.

In this tutorial, we’ve learned how to use the mpi4py library to distribute computation tasks across multiple processes in Python. Specifically, we’ve implemented a master-worker model to calculate the squares of array elements concurrently using multiple processors. This approach can significantly speed up computation for large datasets and computationally intensive tasks.
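For larger datasets you would send each worker a contiguous chunk rather than a single element. A sketch of that splitting logic in plain Python follows; the chunk_for_workers helper is a hypothetical name introduced here for illustration, not something the tutorial or mpi4py defines:

```python
# Illustration: split a longer array into near-equal chunks, one per worker.
def chunk_for_workers(data, n_workers):
    """Return n_workers contiguous slices that together cover data."""
    q, r = divmod(len(data), n_workers)
    chunks, start = [], 0
    for w in range(n_workers):
        end = start + q + (1 if w < r else 0)  # spread the remainder evenly
        chunks.append(data[start:end])
        start = end
    return chunks

print(chunk_for_workers([1, 2, 3, 4, 5, 6, 7], 3))  # -> [[1, 2, 3], [4, 5], [6, 7]]
```

Each chunk would then be the payload of one comm.send, and each worker would square its whole chunk before replying.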
