# Final project

## Due 23:59, Dec 17, 2017

### Parallel computing II: solving wave equation in 2D with MPI

Note 1. This is your final assignment, worth 200 points with up to 100 bonus points.

Note 2. If you have not already studied and practiced the material and codes in the PDEs and MPI sections, STOP here! Go back, do your study and practice, and then come back and continue with this assignment.

Note 3. It is very important that you follow the rules of thumb in Good citizenship.

Consider the wave equation in the conservative form in two dimensions

$$u_{tt} =(a(x,y)u_x)_x + (a(x,y)u_y)_y + f(x,y,t), \ \ (x,y) \in D =[-1,1]^2, \ \ t \in [0,2],$$

augmented with initial conditions

$$u(x,y,0) = f_1(x,y), \ \ u_t(x,y,0) = f_2(x,y), \ \ \quad (x,y) \in D,$$

and Dirichlet boundary conditions

$$u(x,y,t) = g(x,y,t), \ \ \quad (x,y) \in \partial D.$$

Here, $$a(x,y) = 1+\sin(x) \, \cos(y)$$ is the given wave speed, and $$u=u(x,y,t)$$ is the wave solution to be computed.

#### Part I. Serial computation (70 points)

1. Use the method of manufactured solutions (as explained here) and manufacture an initial-boundary value problem for which the true solution is known.
2. Write a Fortran code that implements a second-order accurate finite difference method for solving your manufactured problem by the method of lines (as explained here). You will need to create subroutines for the manufactured forcing, boundary conditions, the Laplacian, and initial data. You may want to look at the Matlab version of the code, located in the course repository under the directory hpsc2017/Project/Matlab. You may also let nx=ny throughout this project, that is, use the same number of grid points in both the x and y directions; this implies hx=hy.
3. Verify that your code works correctly by checking the convergence rate of the maximum approximation error over the domain $$D$$ at time $$t=2$$. Plot (in log scale) the error versus a decreasing sequence of grid spacings and show that the error decreases at the expected rate, which is 2.
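For orientation, here is a minimal serial sketch of the scheme and the convergence check in Python/NumPy (your required code must still be written in Fortran). The manufactured solution $$u_e = \cos(t)\cos(x)\cos(y)$$ and its forcing are our own illustrative choice, not one prescribed by the assignment, and the time discretization shown is the standard second-order leapfrog with dt = h/4:

```python
import numpy as np

# Manufactured solution (our illustrative choice):
#   u_e(x, y, t) = cos(t) cos(x) cos(y),  a(x, y) = 1 + sin(x) cos(y).
# forcing() below is f = u_tt - (a u_x)_x - (a u_y)_y computed analytically.

def u_exact(x, y, t):
    return np.cos(t) * np.cos(x) * np.cos(y)

def forcing(x, y, t):
    return np.cos(t) * (np.cos(x) * np.cos(y)
                        + np.sin(2.0 * x) * (0.5 + np.cos(2.0 * y)))

def max_error(n, t_final=2.0):
    """Leapfrog + second-order conservative FD on an (n+1)x(n+1) grid."""
    h = 2.0 / n
    x = np.linspace(-1.0, 1.0, n + 1)
    X, Y = np.meshgrid(x, x, indexing="ij")
    a = 1.0 + np.sin(X) * np.cos(Y)
    ax = 0.5 * (a[1:, :] + a[:-1, :])   # a at half points (i+1/2, j)
    ay = 0.5 * (a[:, 1:] + a[:, :-1])   # a at half points (i, j+1/2)

    def lap(u):
        # Conservative second-order discretization of (a u_x)_x + (a u_y)_y.
        out = np.zeros_like(u)
        out[1:-1, :] = (ax[1:, :] * (u[2:, :] - u[1:-1, :])
                        - ax[:-1, :] * (u[1:-1, :] - u[:-2, :])) / h**2
        out[:, 1:-1] += (ay[:, 1:] * (u[:, 2:] - u[:, 1:-1])
                         - ay[:, :-1] * (u[:, 1:-1] - u[:, :-2])) / h**2
        return out

    def set_bc(u, t):
        # Dirichlet data g taken from the exact solution.
        for s in (np.s_[0, :], np.s_[-1, :], np.s_[:, 0], np.s_[:, -1]):
            u[s] = u_exact(X[s], Y[s], t)

    dt = 0.25 * h                        # well inside the CFL limit
    nsteps = int(round(t_final / dt))
    dt = t_final / nsteps

    um = u_exact(X, Y, 0.0)              # initial data f1
    # u_t(x, y, 0) = 0 for this choice, so a Taylor step starts the leapfrog:
    u = um + 0.5 * dt**2 * (lap(um) + forcing(X, Y, 0.0))
    set_bc(u, dt)
    for k in range(1, nsteps):
        t = k * dt
        unew = 2.0 * u - um + dt**2 * (lap(u) + forcing(X, Y, t))
        set_bc(unew, t + dt)
        um, u = u, unew
    return np.abs(u - u_exact(X, Y, t_final)).max()
```

Halving the grid spacing should then roughly quarter the maximum error, e.g. `rate = np.log2(max_error(20) / max_error(40))` should come out close to 2.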

#### Part II. Parallelization (130 points)

1. Make your serial Fortran code parallel using the MPI library. Employ a one-dimensional parallelization strategy by splitting the computational domain $$D$$ into nprocs vertical slabs. This is a one-dimensional domain decomposition with px_max = nprocs processors, ranging from processor px=1 to px=px_max. The communication between processors will be similar to that for the 1D wave equation that we have seen in class.
2. Apply your code to the same manufactured problem from Part I. Check that your parallel code, domain decomposition, and communication are working correctly by verifying the convergence of the maximum error in the approximate solution at time $$t=2$$ and making sure that the error decays at the expected rate of 2. Include some neat plots of the solution at the final time and of the error convergence in your report. The easiest way to output in parallel is to have each processor write its own part of the solution at the final time and then assemble the pieces in a visualization program (e.g., Matlab).
3. Study the strong and weak scaling on Stampede2. Include some neat plots in your report. As always, start small: for example, let nx=ny=100 and use 5 nodes. Then gradually increase the problem size and the number of nodes. Do not use more than 50 nodes or more than nx=ny=5000 for this project.
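The slab bookkeeping can be prototyped before any MPI code is written. The following Python sketch (function names are ours; it mirrors the 1-based px convention above, with `None` playing the role of MPI_PROC_NULL at the physical boundaries) computes each processor's owned column range and its left/right neighbors:

```python
def slab(nx, nprocs, px):
    """Global x-index range [i_lo, i_hi] owned by processor px (px = 1..nprocs).

    The nx interior columns are split into nprocs contiguous vertical slabs,
    as evenly as possible; the first (nx mod nprocs) slabs get one extra column.
    """
    base, extra = divmod(nx, nprocs)
    i_lo = (px - 1) * base + min(px - 1, extra) + 1
    i_hi = i_lo + base - 1 + (1 if px <= extra else 0)
    return i_lo, i_hi

def neighbors(nprocs, px):
    """Left/right neighbor of slab px; None marks a physical boundary."""
    left = px - 1 if px > 1 else None
    right = px + 1 if px < nprocs else None
    return left, right
```

Each slab then stores one extra ghost column on each side; at every time step the ghost columns are exchanged with the left and right neighbors (e.g. with MPI_Sendrecv) before the Laplacian is applied, exactly as in the 1D wave equation example from class.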

#### Part III. Extra assignments (100 bonus points)

1. (70 bonus points) 2D domain decomposition. The above one-dimensional domain decomposition does not minimize the communication between sub-domains.

A. You can improve this by splitting the domain among px_max*py_max processors, where px_max and py_max are chosen to minimize abs(nx/px_max - ny/py_max).

B. After decomposing the computational domain in a two-dimensional way, you will need to update the communication. Start by assigning the left, right, top, and bottom neighbors of each process. Be careful about ghost points: each process will have an array u(0:nxl+1,0:nyl+1), where the physical domain owned by each process corresponds to the indices 1:nxl,1:nyl.

C. Check that your domain decomposition and communication are working by computing the error in the approximate solution.

D. Study the strong and weak scaling on Stampede2, and compare with the 1D strategy in Part II.
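The process-grid selection in step A and the neighbor assignment in step B can likewise be prototyped in plain Python before touching MPI (function names are ours; `None` again stands in for MPI_PROC_NULL):

```python
def choose_grid(nprocs, nx, ny):
    """Pick (px_max, py_max) with px_max * py_max = nprocs minimizing
    abs(nx/px_max - ny/py_max), so local sub-domains are nearly square."""
    best = None
    for px in range(1, nprocs + 1):
        if nprocs % px == 0:
            py = nprocs // px
            cost = abs(nx / px - ny / py)
            if best is None or cost < best[0]:
                best = (cost, px, py)
    return best[1], best[2]

def neighbors2d(px_max, py_max, px, py):
    """Left/right/bottom/top neighbors of process (px, py), 1-based;
    None marks a physical boundary of the domain."""
    left = (px - 1, py) if px > 1 else None
    right = (px + 1, py) if px < px_max else None
    bottom = (px, py - 1) if py > 1 else None
    top = (px, py + 1) if py < py_max else None
    return left, right, bottom, top
```

One practical point for the Fortran implementation: with column-major storage, one of the two ghost exchanges (rows vs. columns of u(0:nxl+1,0:nyl+1)) involves non-contiguous memory, so either pack those ghost values into a buffer before sending or use a derived datatype such as MPI_Type_vector.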

2. (30 bonus points) Hybrid parallelization. Parallelize the do loops within each node using OpenMP, and use MPI for the domain decomposition and the communication between sub-domains. Check the scaling and compare with the pure-MPI versions.

• As usual, arrange your results neatly in your report and comment on them.