Amazon Elastic Compute Cloud

This is a summary of how to set up a cluster on AWS EC2. In short, the key points are


Very Long Steps:

  1. For example, I have two t2.micro instances running in us-west-2a. Both run the Ubuntu image provided by AWS EC2. They have public IPs 54.68.48.164 and 54.68.36.71, and private DNS names ip-172-31-42-235 and ip-172-31-40-207, respectively. All instances are in the default security group. The next figures show what they should be.


  2. All instances are in the default security group (under NETWORK & SECURITY in the left-hand menu), where I set both Inbound and Outbound rules to

    • Type: All traffic
    • Protocol: All
    • Port Range: 0 - 65535
    • Source/Destination: Anywhere
      This is a risky change. It is safer to allow private IPs only; please consult a network professional.
      This makes sure MPI can communicate among the instances over whatever ports it needs. The next figures show what they should be.

    • Disable all other network interfaces to keep the routing path as simple as possible. For example, in Ubuntu/Xubuntu one may run

      sudo ifconfig enp0s3 down
      sudo ifconfig docker0 down

      where enp0s3 and docker0 are the network interface names.

    • Disable firewalls if possible. For example, in Ubuntu/Xubuntu one may run
      SHELL> sudo ufw disable
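The security group rules above can also be applied from the command line. This is a sketch assuming the AWS CLI is configured and that SG_ID holds the ID of the default security group (the ID below is hypothetical; the default group already allows all outbound traffic). For a safer setup, replace 0.0.0.0/0 with your VPC's private range (e.g. 172.31.0.0/16).

```shell
# Hypothetical security group ID; replace with the one from your EC2 console.
SG_ID=sg-0123456789abcdef0
# Open all inbound traffic (protocol -1 = all) from anywhere, mirroring
# the console settings described above. Restrict the CIDR if you can.
aws ec2 authorize-security-group-ingress \
    --group-id "$SG_ID" --protocol -1 --cidr 0.0.0.0/0
```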
  3. All instances should have the files

    • ~/.ssh/authorized_keys
    • ~/.ssh/id_rsa
    • ~/.ssh/known_hosts

      These allow both instances to ssh into each other without being prompted for a user name or password. Note that MPI runs in batch mode, which does not allow any interruption. The next figure shows what it should be.

      Note that known_hosts can be generated via
      SHELL> ssh ip-172-31-42-235
      SHELL> ssh ip-172-31-40-207
      Add more instances if you have more. Note that all instances should have access to all other instances. You may need some shell scripts to help with this setup, or you may change /etc/ssh/ssh_config (e.g. set StrictHostKeyChecking no) to skip host-key checking and this step.
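As an alternative to logging in to each host once by hand, the host keys can be collected non-interactively with ssh-keyscan. This is a sketch using the two example private DNS names from step 1; replace them with your own, and run it on every instance.

```shell
#!/bin/sh
# Pre-populate ~/.ssh/known_hosts so that ssh to every other instance
# is non-interactive. Host names are the example instances from step 1.
HOSTS="ip-172-31-42-235 ip-172-31-40-207"
for h in $HOSTS; do
  # -H hashes the host names, matching OpenSSH's HashKnownHosts default
  ssh-keyscan -H "$h" >> ~/.ssh/known_hosts
done
```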

  4. (Optional) All instances should have the file ~/work-my/00_set_devel_R containing

    export MAKE="/usr/bin/make -j 4"
    export R_DEVEL=/home/ubuntu/work-my/local/R-devel
    export OMPI=/home/ubuntu/work-my/local/ompi
    
    export PATH=$R_DEVEL/bin:$OMPI/bin:$PATH
    export LD_LIBRARY_PATH=$R_DEVEL/lib/R/lib:$OMPI/lib:$LD_LIBRARY_PATH
    
    alias mpiexec=/home/ubuntu/work-my/local/ompi/bin/mpiexec
    alias mpirun=/home/ubuntu/work-my/local/ompi/bin/mpirun
    alias Rscript=/home/ubuntu/work-my/local/R-devel/bin/Rscript

    You may adjust the paths according to your system. This sets the executable and library paths for R and OpenMPI. The aliases avoid typing the full paths to mpiexec, mpirun, and Rscript.
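After sourcing the file, a quick sanity check confirms that the customized binaries are picked up first. This is a sketch assuming the layout under /home/ubuntu/work-my/local described above.

```shell
# Source the environment file and verify which mpiexec and Rscript resolve;
# they should point under ~/work-my/local if the paths above are in effect.
. ~/work-my/00_set_devel_R
command -v mpiexec
command -v Rscript
echo "$LD_LIBRARY_PATH"
```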

  5. (Optional) All instances should have the following line in the file ~/.bashrc

    . /home/ubuntu/work-my/00_set_devel_R

    This is to make sure the working environment is consistent on every instance after ssh login.

  6. All instances should install OpenMPI, R, pbdMPI, and their dependencies as follows.

    SHELL> sudo apt-get update
    SHELL> sudo apt-get install libopenmpi-dev openmpi-bin
    SHELL> sudo apt-get install r-base
    SHELL> sudo R CMD INSTALL rlecuyer_0.3-3.tar.gz
    SHELL> sudo R CMD INSTALL pbdMPI_0.2-5.tar.gz
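If the instances have internet access, installing from CRAN is an alternative to the local tarballs above. This is a sketch; CRAN may carry newer versions than the 0.3-3 / 0.2-5 tarballs mentioned.

```shell
# Install rlecuyer and pbdMPI (and their R dependencies) from CRAN
# instead of local source tarballs.
sudo Rscript -e 'install.packages(c("rlecuyer", "pbdMPI"), repos = "https://cloud.r-project.org")'
```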
  7. The login instance, say ip-172-31-42-235, should have the following lines in the file ~/hostfile to let MPI know which machines are available to launch applications in SPMD mode.

    ip-172-31-42-235
    ip-172-31-40-207

    Add more instances if you have more and see OpenMPI website for more examples of this file.
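OpenMPI's hostfile format also accepts an explicit slots count per host. A sketch generating the file above, with one slot per instance since a t2.micro has a single vCPU:

```shell
# Write ~/hostfile with an explicit slots count per host.
# Without slots=, Open MPI infers the slot count on its own.
cat > ~/hostfile <<'EOF'
ip-172-31-42-235 slots=1
ip-172-31-40-207 slots=1
EOF
```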

  8. On the login instance, you may test hostname, R, and pbdMPI with 4 processes as follows.

    SHELL> mpiexec -hostfile ~/hostfile -np 4 hostname
    SHELL> mpiexec -hostfile ~/hostfile -np 4 \
           Rscript -e "Sys.info()['nodename']" 
    SHELL> mpiexec -hostfile ~/hostfile -np 4 \
           Rscript -e "library(pbdMPI,quietly=T);init();comm.rank();finalize()"

    Note: Full paths to the mpiexec and Rscript may be needed.

    If all setups are correct, the outputs should be as follows.
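The same pbdMPI check can also be kept as a small script file, which is easier to extend than the -e one-liner. This is a sketch; the file name is arbitrary.

```shell
# Save a minimal pbdMPI test script on the login instance, then launch it
# across the hostfile. comm.print with all.rank = TRUE prints every rank.
cat > ~/test_pbdMPI.r <<'EOF'
library(pbdMPI, quietly = TRUE)
init()
comm.print(comm.rank(), all.rank = TRUE)
finalize()
EOF
mpiexec -hostfile ~/hostfile -np 4 Rscript ~/test_pbdMPI.r
```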