MPI Gather Function in Python

Afzal Badshah, PhD
Apr 3, 2024

The gather function collects one value from every process in a communicator and delivers them all to a single process, called the root. We’ll go through the code below line by line to see how it works. The detailed tutorial can be found here.

Code

from mpi4py import MPI
comm = MPI.COMM_WORLD
size = comm.Get_size()  # total number of processes
rank = comm.Get_rank()  # rank of this process, from 0 to size - 1
data = (rank + 1)
# After gather, data is a list on rank 0 and None on every other rank.
data = comm.gather(data, root=0)
print(f"Process {rank}: Calculated data = {data}")
if rank == 0:
    for i in range(size):
        assert data[i] == (i + 1)
        print(f"Process 0: Checked data received from process {i + 1}")
else:
    assert data is None

Explanation

from mpi4py import MPI

This line imports the MPI functionality from the mpi4py library.

comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()

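These lines get a reference to the default communicator, MPI.COMM_WORLD, which contains every process started for the job, and then query it for the total number of processes (size) and the rank of the current process (rank). Ranks run from 0 to size - 1. By default, MPI itself is initialized when mpi4py's MPI module is imported, so no explicit initialization call is needed.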

data = (rank + 1)
data = comm.gather(data, root=0)

Each process computes its own value from its rank (rank + 1). It then calls gather on the communicator comm. This is a collective operation, so every process in the communicator must call it. gather collects one value from each process and returns them to the root process (here rank 0, set by root=0) as a list ordered by rank; on every other process it returns None. Note that the result is assigned back to data, replacing the value computed on the previous line.
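To make those return values concrete without launching MPI, here is a plain-Python model of what gather hands back on each rank. It is only a sketch: simulate_gather is a hypothetical helper written for this illustration, it is not part of mpi4py, and it performs no communication at all.

```python
def simulate_gather(values, root=0):
    """Model the per-rank return value of comm.gather (illustration only).

    values[r] stands for the object that rank r passes to gather.
    The returned list holds, at index r, what rank r would get back.
    """
    size = len(values)
    if not 0 <= root < size:
        raise ValueError("root must be a valid rank")
    # Only the root receives the collected list, ordered by rank;
    # every other rank gets None.
    return [list(values) if rank == root else None for rank in range(size)]


# Four ranks, each contributing rank + 1, gathered at rank 0.
contributions = [rank + 1 for rank in range(4)]
results = simulate_gather(contributions, root=0)
for rank, result in enumerate(results):
    print(f"rank {rank} gets {result}")
```

With four simulated ranks, index 0 of results holds the full list and the remaining entries are None, which mirrors the root versus non-root behaviour described above.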

print(f"Process {rank}: Calculated data = {data}")

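Each process prints the value of data after the gather call. At this point data no longer holds the per-process value rank + 1: on rank 0 it is the gathered list (for example [1, 2, 3, 4] with four processes), and on every other rank it is None. The order in which the lines appear can vary between runs, because the processes print independently.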

if rank == 0:
    for i in range(size):
        assert data[i] == (i + 1)
        print(f"Process 0: Checked data received from process {i + 1}")

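On the root process (rank 0), the loop visits every index from 0 to size - 1. Because gather orders the list by rank, data[i] is the value sent by rank i, which should equal i + 1. If an assertion fails, the program raises an AssertionError; otherwise it prints a confirmation line. Note that the message labels the sender as i + 1, a 1-based process number, whereas the MPI rank of that sender is i.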

else:
    assert data is None

On every non-root process (any rank other than 0), gather returns None, because only the root receives the collected data. The assertion verifies this.

In this tutorial, we’ve learned how to use the gather function in MPI to collect data from multiple processes into a single process. This is a fundamental operation in parallel programming, often used to gather results from worker processes to a master process for further processing or analysis. Understanding MPI collective operations like gather is essential for developing efficient parallel programs.

Dr Afzal Badshah focuses on academic skills, pedagogy (teaching skills) and life skills.