# MPI programming using C

Please find the attached file

Document Preview:

Write an MPI program for multiplying two n × n matrices, A and B, with each processor producing a row-band of matrix C. P0 will send row-bands of A and all of B to the slaves.
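
As a starting point, the row-band decomposition can be sketched in plain C. This is only a sketch under assumptions: the helper names `band_bounds` and `multiply_band` are hypothetical, matrices are assumed to be stored row-major as `double`, and the MPI transfers themselves are left out so the fragment compiles without an MPI installation. In a real program P0 would distribute the bands of A (for example with `MPI_Scatterv` or `MPI_Send`) and all of B (for example with `MPI_Bcast`) before each rank calls the kernel.

```c
#include <stddef.h>

/* Rows [*lo, *hi) of an n-row matrix owned by `rank` out of `p` processes.
   The first n % p ranks get one extra row, so p need not divide n. */
void band_bounds(int n, int p, int rank, int *lo, int *hi)
{
    int base = n / p, extra = n % p;
    *lo = rank * base + (rank < extra ? rank : extra);
    *hi = *lo + base + (rank < extra ? 1 : 0);
}

/* Multiply a band of `rows` rows of A (rows x n, row-major) by all of B
   (n x n, row-major), writing the matching band of C (rows x n). */
void multiply_band(const double *a_band, const double *b,
                   double *c_band, int rows, int n)
{
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int k = 0; k < n; k++)
                sum += a_band[(size_t)i * n + k] * b[(size_t)k * n + j];
            c_band[(size_t)i * n + j] = sum;
        }
    }
}
```

Each rank would call `band_bounds` with its own rank to learn how many rows to expect, run `multiply_band` on the rows it received, and return its band of C to P0.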

Submit source code and sample timings/plots for varying sized matrices.

Folder:

MPI multiplication program – column of B rotated in a ring topology

Instructions:

B’s column bands are distributed among processors, and rotated in a ring topology.
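
The rotation schedule can be checked with a serial simulation before any message passing is written. The sketch below is an assumption-laden illustration, not the assignment's reference solution: the function name `ring_multiply_sim` is hypothetical, p is assumed to divide n, and "sending" a band is modelled by rotating an index array. In an MPI version each rank would hold only its own band and exchange it with its ring neighbours, for example with `MPI_Sendrecv_replace`.

```c
#include <stdlib.h>

/* Serial simulation of ring matrix multiplication for p "ranks".
   Rank r owns rows [r*w, (r+1)*w) of A and C and initially holds column
   band r of B. Matrices are n x n, row-major. Returns 0 on success and
   -1 on bad arguments or allocation failure. Assumes p divides n. */
int ring_multiply_sim(const double *a, const double *b, double *c,
                      int n, int p)
{
    if (n <= 0 || p <= 0 || n % p != 0)
        return -1;
    int w = n / p;                                   /* band width */
    double *store = malloc((size_t)n * n * sizeof *store);
    int *held = malloc((size_t)p * sizeof *held);    /* band held by each rank */
    if (!store || !held) {
        free(store);
        free(held);
        return -1;
    }

    /* Pack column band q of B as an n x w row-major block; rank q starts with it. */
    for (int q = 0; q < p; q++) {
        held[q] = q;
        for (int k = 0; k < n; k++)
            for (int j = 0; j < w; j++)
                store[(size_t)q * n * w + (size_t)k * w + j] = b[(size_t)k * n + q * w + j];
    }

    for (int step = 0; step < p; step++) {
        /* Every rank multiplies its rows of A by the band it currently holds. */
        for (int r = 0; r < p; r++) {
            const double *band = store + (size_t)held[r] * n * w;
            int col0 = held[r] * w;                  /* first column of this band in C */
            for (int i = r * w; i < (r + 1) * w; i++)
                for (int j = 0; j < w; j++) {
                    double sum = 0.0;
                    for (int k = 0; k < n; k++)
                        sum += a[(size_t)i * n + k] * band[(size_t)k * w + j];
                    c[(size_t)i * n + col0 + j] = sum;
                }
        }
        /* Rotate: rank r passes its band to rank r-1 and receives from rank r+1. */
        int first = held[0];
        for (int r = 0; r + 1 < p; r++)
            held[r] = held[r + 1];
        held[p - 1] = first;
    }

    free(store);
    free(held);
    return 0;
}
```

After p steps every rank has seen every column band exactly once, so all of C is filled in. The direction of rotation is a free choice; only the neighbour ranks in the exchange change.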

Submit source code and timing plots varying p = 1, 2, …, 8 and n = 50, 100, 200, 500, etc.

C.

Folder:

Pthread matrix multiplication and reduction programs

Instructions:

Here is a simple two-part pthread-based assignment, due the Thursday after next. You may use the sample programs posted in the c6310/Pthreads directory.

(a) Reimplement your multiplication program with each process/thread computing a band of matrix C. No synchronization is needed here, as all three matrices can be allocated in the shared memory (i.e., in the global scope).
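
One possible shape for part (a) is sketched below. The names `band_worker` and `threaded_multiply` and the limits `MAX_N` and `MAX_P` are hypothetical choices made for this illustration; the posted sample programs may be organised differently. Because each thread writes only its own rows of C, the only coordination needed is the final join.

```c
#include <pthread.h>

#define MAX_N 512   /* assumed upper bound on n for this sketch */
#define MAX_P 64    /* assumed upper bound on the thread count */

/* All three matrices live in global scope, so every thread can reach them. */
static double A[MAX_N][MAX_N], B[MAX_N][MAX_N], C[MAX_N][MAX_N];
static int N, P;    /* matrix size and thread count for the current run */

/* Thread `id` computes rows [id*N/P, (id+1)*N/P) of C. */
static void *band_worker(void *arg)
{
    int id = *(const int *)arg;
    int lo = id * N / P, hi = (id + 1) * N / P;
    for (int i = lo; i < hi; i++)
        for (int j = 0; j < N; j++) {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
                sum += A[i][k] * B[k][j];
            C[i][j] = sum;
        }
    return NULL;
}

/* Compute C = A * B for the leading n x n blocks using p threads.
   Returns 0 on success, -1 on bad arguments or thread-creation failure. */
int threaded_multiply(int n, int p)
{
    pthread_t tid[MAX_P];
    int ids[MAX_P];
    int started = 0, status = 0;

    if (n < 1 || n > MAX_N || p < 1 || p > MAX_P)
        return -1;
    N = n;
    P = p;
    for (; started < p; started++) {
        ids[started] = started;
        if (pthread_create(&tid[started], NULL, band_worker, &ids[started]) != 0) {
            status = -1;
            break;
        }
    }
    for (int t = 0; t < started; t++)   /* join only the threads that started */
        pthread_join(tid[t], NULL);
    return status;
}
```

The caller fills `A` and `B`, calls `threaded_multiply(n, p)`, and reads the result from `C`. On most systems the program is built with the `-pthread` compiler flag.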

Submit code and timing data for various n and p values.
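
For the timing data, wall-clock time is usually what matters in a threaded program, since `clock()` reports CPU time summed over all threads. The helpers below are a minimal sketch assuming a POSIX system with `clock_gettime`; the names `wall_seconds`, `speedup`, and `efficiency` are hypothetical, and the two ratios use the common definitions T1/Tp and T1/(p·Tp). In the MPI programs, `MPI_Wtime` serves the same purpose as `wall_seconds`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose clock_gettime under strict -std modes */
#endif
#include <time.h>

/* Wall-clock seconds from a monotonic clock (unaffected by clock changes). */
double wall_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Speedup of a p-thread run (time tp) over the 1-thread run (time t1). */
double speedup(double t1, double tp)
{
    return t1 / tp;
}

/* Parallel efficiency: speedup divided by the number of threads. */
double efficiency(double t1, double tp, int p)
{
    return speedup(t1, tp) / p;
}
```

A run is timed by recording `wall_seconds()` before the threads are created and again after the last join; the difference is the elapsed time to report for that n and p.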

(b) Use a ring pattern to carry out reduction over matrix A. For this, assume the processor-to-row-band allocation from part (a), and find the row sums of matrix A in parallel, with the output going into the first column of A. Then, have P0 find the column sum of the first column. You will need a barrier.

Submit code and timing data for various n and p values.
