ELIMINATING THE LAUNCHER
This page is for users who want to launch each process of an application manually, start an application with their own launcher, or debug an application.
The Provided Launcher (mpd)
It is possible to launch an MPICH application without using the provided launcher. First we need to know what the launcher does and then we can show how to launch an application without it. MPICH.NT uses environment variables to communicate with the spawned processes, so any launcher that can provide the required environment variables could launch an MPICH.NT application.
What the launcher does:
1) Create the first process
Process zero acquires a port to listen on and then communicates this port number back to the launcher.
2) Create the rest of the processes
The launcher then creates all the rest of the processes, informing them which port the first process is listening on through an environment variable.
Here are the environment variables set by the launcher:
Required | |
MPICH_JOBID | Unique string across all machines, used to create named objects such as mutexes and shared memory queues. The provided launchers create this string by appending a number to the root hostname (e.g. fry14). |
MPICH_IPROC | The rank of the current process. |
MPICH_NPROC | The total number of processes. |
MPICH_ROOT | The hostname of the root process and the port where it is listening. Use a colon to separate the host name and port: hostA:port or a.b.c.d:port |
MPICH_EXTRA | Only valid on the root process. The name of a temporary file used to communicate the port number from the root process to the launcher. |
Conditional | |
MPICH_SHM_LOW | The lowest rank that the current process can reach through shared memory queues. |
MPICH_SHM_HIGH | The highest rank the current process can reach through shared memory queues. |
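The table above can be turned into a small helper for a custom launcher. This is a minimal sketch in Python, not part of MPICH.NT: the function name build_mpich_env and its parameters are hypothetical. MPICH_EXTRA is left out on the assumption that the root port is chosen in advance rather than reported back through a temporary file.

```python
def build_mpich_env(job_id, rank, nproc, root_host, root_port,
                    shm_low=None, shm_high=None):
    """Return the MPICH_* variables for one process as a dict of strings.

    Hypothetical helper: the variable names come from the table above,
    everything else is an illustration.
    """
    if not 0 <= rank < nproc:
        raise ValueError("rank must be in the range 0..nproc-1")

    # Required variables (MPICH_EXTRA omitted: the port is assumed fixed).
    env = {
        "MPICH_JOBID": job_id,
        "MPICH_IPROC": str(rank),
        "MPICH_NPROC": str(nproc),
        "MPICH_ROOT": f"{root_host}:{root_port}",
    }

    # Conditional variables, set only when a shared memory range is given.
    if shm_low is not None and shm_high is not None:
        env["MPICH_SHM_LOW"] = str(shm_low)
        env["MPICH_SHM_HIGH"] = str(shm_high)
    return env
```

All values are strings because environment variables carry text only.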
Without the Launcher
The key to eliminating the launcher is to remove the interaction with the first process. If you set MPICH_ROOT in the environment of the first process to its host name and an available port number (for example fry:12345), the process listens on that port and does not attempt to write the number to the file named by MPICH_EXTRA.
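A custom launcher that runs on the root machine needs some way to find an available port. The sketch below asks the operating system for one, assuming the root process listens on a TCP port; pick_free_port and root_address are hypothetical names. The port is only known to be free at the moment of the check, so another program could claim it before the root process starts.

```python
import socket


def pick_free_port(host=""):
    """Ask the OS for a currently unused TCP port and return its number."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))  # port 0 lets the OS choose a free port
        return s.getsockname()[1]


def root_address(root_host, port):
    """Format a value for MPICH_ROOT as host:port."""
    return f"{root_host}:{port}"
```

For example, root_address("fry", pick_free_port()) yields a string of the form fry:NNNNN that can be given to every process in the job.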
Here is an example.
I opened a command prompt on each of two separate machines, set the environment variables, and ran the application as shown in the table below:
Host | Fry | Jazz |
Environment | MPICH_JOBID=fry.123 MPICH_IPROC=0 MPICH_NPROC=2 MPICH_ROOT=fry:12345 | MPICH_JOBID=fry.123 MPICH_IPROC=1 MPICH_NPROC=2 MPICH_ROOT=fry:12345 |
Command | netpipe.exe | netpipe.exe |
Here is the same example on a single machine which uses shared memory:
Host | Fry | Fry |
Environment | MPICH_JOBID=fry.2000 MPICH_IPROC=0 MPICH_NPROC=2 MPICH_ROOT=fry:12345 MPICH_SHM_LOW=0 MPICH_SHM_HIGH=1 | MPICH_JOBID=fry.2000 MPICH_IPROC=1 MPICH_NPROC=2 MPICH_ROOT=fry:12345 MPICH_SHM_LOW=0 MPICH_SHM_HIGH=1 |
Command | netpipe.exe | netpipe.exe |
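The single-machine case can also be scripted. The following is a minimal sketch of a local launcher, assuming all processes run on one host and share memory across the full rank range as in the table above; spawn_local is a hypothetical name. Rank 0 is started first, but whether the other ranks wait for the root to begin listening is not something this sketch verifies.

```python
import os
import subprocess


def spawn_local(argv, nproc, root_host, root_port, job_id, **popen_kwargs):
    """Start nproc copies of argv on this machine, one per rank.

    Returns the list of subprocess.Popen objects in rank order.
    """
    procs = []
    for rank in range(nproc):
        env = dict(os.environ)  # keep PATH and the rest of the environment
        env.update({
            "MPICH_JOBID": job_id,
            "MPICH_IPROC": str(rank),
            "MPICH_NPROC": str(nproc),
            "MPICH_ROOT": f"{root_host}:{root_port}",
            # Every rank is local, so the shared memory range is 0..nproc-1.
            "MPICH_SHM_LOW": "0",
            "MPICH_SHM_HIGH": str(nproc - 1),
        })
        procs.append(subprocess.Popen(argv, env=env, **popen_kwargs))
    return procs
```

A call such as spawn_local(["netpipe.exe"], 2, "fry", 12345, "fry.2000") would correspond to the table above; the caller then waits on each returned process.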
Here is an example of four processes on two machines which mixes shared memory and socket communication:
Host | Fry | Fry | Jazz | Jazz |
Environment | MPICH_JOBID=fry.100 MPICH_IPROC=0 MPICH_NPROC=4 MPICH_ROOT=fry:12345 MPICH_SHM_LOW=0 MPICH_SHM_HIGH=1 | MPICH_JOBID=fry.100 MPICH_IPROC=1 MPICH_NPROC=4 MPICH_ROOT=fry:12345 MPICH_SHM_LOW=0 MPICH_SHM_HIGH=1 | MPICH_JOBID=fry.100 MPICH_IPROC=2 MPICH_NPROC=4 MPICH_ROOT=fry:12345 MPICH_SHM_LOW=2 MPICH_SHM_HIGH=3 | MPICH_JOBID=fry.100 MPICH_IPROC=3 MPICH_NPROC=4 MPICH_ROOT=fry:12345 MPICH_SHM_LOW=2 MPICH_SHM_HIGH=3 |
Command | mandel.exe | mandel.exe | mandel.exe | mandel.exe |
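The MPICH_SHM_LOW and MPICH_SHM_HIGH values in this table follow from where each rank runs. The sketch below derives them from a list of host names, assuming ranks on the same machine are numbered consecutively as they are in the example; shm_ranges is a hypothetical helper, not an MPICH.NT function.

```python
def shm_ranges(hosts):
    """Return one (low, high) pair per rank.

    hosts[i] is the host name for rank i. Consecutive ranks on the same
    host form one shared memory group.
    """
    ranges = []
    start = 0  # first rank of the group currently being scanned
    for rank in range(1, len(hosts) + 1):
        # Close the group when the host changes or the list ends.
        if rank == len(hosts) or hosts[rank] != hosts[start]:
            ranges.extend([(start, rank - 1)] * (rank - start))
            start = rank
    return ranges


print(shm_ranges(["fry", "fry", "jazz", "jazz"]))
# prints [(0, 1), (0, 1), (2, 3), (2, 3)]
```

A rank that is alone on its host gets a pair with equal low and high values; a launcher could leave the two variables unset in that case, as the first example does.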
These are the exact commands for the first example, entered at a command prompt on each machine:
On Fry
C:\Temp>set MPICH_JOBID=fry.123
C:\Temp>set MPICH_IPROC=0
C:\Temp>set MPICH_NPROC=2
C:\Temp>set MPICH_ROOT=fry:12345
C:\Temp>netpipe.exe
On Jazz
C:\Temp>set MPICH_JOBID=fry.123
C:\Temp>set MPICH_IPROC=1
C:\Temp>set MPICH_NPROC=2
C:\Temp>set MPICH_ROOT=fry:12345
C:\Temp>netpipe.exe
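Typing these commands by hand gets tedious as the process count grows. As a rough sketch, the lines for one rank can be generated from a dictionary of variables; cmd_lines is a hypothetical name, and the output is meant to be pasted into a command prompt or saved as a batch file.

```python
def cmd_lines(env, command):
    """Return the 'set NAME=value' lines for env, followed by the command."""
    lines = [f"set {name}={value}" for name, value in env.items()]
    lines.append(command)
    return lines


# The variables for rank 0 on Fry from the first example.
rank0 = {
    "MPICH_JOBID": "fry.123",
    "MPICH_IPROC": "0",
    "MPICH_NPROC": "2",
    "MPICH_ROOT": "fry:12345",
}
print("\n".join(cmd_lines(rank0, "netpipe.exe")))
```

Changing MPICH_IPROC to 1 produces the lines for Jazz.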
Debugging
To debug an application using MSDevStudio, set up the environment variables as described above and then run "msdev myapp.exe" instead of "myapp.exe". This brings up Developer Studio, where you can step through your program using the debugger commands.
Note: This works only with code built with the debug configuration. If you also want to step through the MPICH DLL code, you will have to download and compile the source distribution of MPICH.NT.