
::ELIMINATING THE LAUNCHER

This page is for users who want to launch each process of an application manually, or who want to start an application with their own launcher.

The Current Launcher

It is possible to launch an MPICH.NT application without using the provided launcher. This section first describes what the launcher does; the next section shows how to launch an application without it. MPICH.NT uses environment variables to communicate with the spawned processes, so any launcher that can provide the required environment variables can launch an MPICH.NT application.

What the launcher does:

    1) Create the first process

Process zero acquires a port to listen on and then communicates this port number back to the launcher.

    2) Create the rest of the processes

The launcher then creates the remaining processes and tells them, through an environment variable, which port the first process is listening on.

Here are the environment variables set by the launcher:

Required

    MPICH_JOBID      Unique string across all machines, used to create named objects such as mutexes and shared memory queues. I create this string by appending a number to the root hostname (e.g. fry14). The launcher also uses this value as a registry key under which it stores information about running MPICH applications.
    MPICH_IPROC      The rank of the current process.
    MPICH_NPROC      The total number of processes.
    MPICH_ROOT       The hostname of the root process and the port where it is listening. Use a colon to separate the host name and port: hostA:port or a.b.c.d:port
    MPICH_EXTRA      Only valid on the root process. The name of a temporary file used to communicate the port number from the root process to the launcher.

Conditional

    MPICH_SHM_LOW    The lowest rank that the current process can reach through shared memory queues.
    MPICH_SHM_HIGH   The highest rank that the current process can reach through shared memory queues.
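
A custom launcher would need to assemble these variables for every process it starts. The following is a minimal Python sketch of that step, not code from MPICH.NT: the function name `build_env` and its parameters are hypothetical, and it assumes the port is chosen in advance, so MPICH_EXTRA is left unset.

```python
import os


def build_env(job_id, rank, nproc, root_host, root_port,
              shm_low=None, shm_high=None, base_env=None):
    """Return an environment dict carrying the MPICH.NT variables for one rank.

    Hypothetical helper: the variable names come from the table above,
    everything else is an illustrative assumption.
    """
    # Start from the caller's environment so PATH and friends are preserved.
    env = dict(os.environ if base_env is None else base_env)
    env["MPICH_JOBID"] = job_id
    env["MPICH_IPROC"] = str(rank)
    env["MPICH_NPROC"] = str(nproc)
    # Host name and port are separated by a colon, e.g. fry:12345.
    env["MPICH_ROOT"] = f"{root_host}:{root_port}"
    # The shared memory range is conditional, so set it only when given.
    if shm_low is not None and shm_high is not None:
        env["MPICH_SHM_LOW"] = str(shm_low)
        env["MPICH_SHM_HIGH"] = str(shm_high)
    return env
```

The returned dictionary can be passed as the environment of whatever process-creation call the launcher uses.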

Without the Launcher

The key to eliminating the launcher is to remove the interaction with the first process. If you set MPICH_ROOT to the root host name and an available port number (host:port) in the environment of the first process, the process will use this port and will not attempt to write the number to the file named by MPICH_EXTRA, so MPICH_EXTRA does not need to be set.

Here is an example.

I brought up two command prompts on two separate machines, set the environment variables, and ran an application as shown in the table below:

Host         Fry                   Jazz
Environment  MPICH_JOBID=fry.123   MPICH_JOBID=fry.123
             MPICH_IPROC=0         MPICH_IPROC=1
             MPICH_NPROC=2         MPICH_NPROC=2
             MPICH_ROOT=fry:12345  MPICH_ROOT=fry:12345
Command      netpipe.exe           netpipe.exe

Here is the same example on a single machine which uses shared memory:

Host         Fry                   Fry
Environment  MPICH_JOBID=fry.2000  MPICH_JOBID=fry.2000
             MPICH_IPROC=0         MPICH_IPROC=1
             MPICH_NPROC=2         MPICH_NPROC=2
             MPICH_ROOT=fry:12345  MPICH_ROOT=fry:12345
             MPICH_SHM_LOW=0       MPICH_SHM_LOW=0
             MPICH_SHM_HIGH=1      MPICH_SHM_HIGH=1
Command      netpipe.exe           netpipe.exe

Here is an example of four processes on two machines which mixes shared memory and socket communication:

Host         Fry                   Fry                   Jazz                  Jazz
Environment  MPICH_JOBID=fry.100   MPICH_JOBID=fry.100   MPICH_JOBID=fry.100   MPICH_JOBID=fry.100
             MPICH_IPROC=0         MPICH_IPROC=1         MPICH_IPROC=2         MPICH_IPROC=3
             MPICH_NPROC=4         MPICH_NPROC=4         MPICH_NPROC=4         MPICH_NPROC=4
             MPICH_ROOT=fry:12345  MPICH_ROOT=fry:12345  MPICH_ROOT=fry:12345  MPICH_ROOT=fry:12345
             MPICH_SHM_LOW=0       MPICH_SHM_LOW=0       MPICH_SHM_LOW=2       MPICH_SHM_LOW=2
             MPICH_SHM_HIGH=1      MPICH_SHM_HIGH=1      MPICH_SHM_HIGH=3      MPICH_SHM_HIGH=3
Command      mandel.exe            mandel.exe            mandel.exe            mandel.exe
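
In this example every process on a host shares the same MPICH_SHM_LOW and MPICH_SHM_HIGH values: the first and last rank placed on that host. The sketch below shows one way a launcher might derive those values from a rank-to-host list. It is an illustration only; the function name `shm_ranges` is hypothetical, and it assumes that ranks on the same host are numbered contiguously, as they are in the examples on this page.

```python
def shm_ranges(hosts):
    """Map each rank to the (low, high) rank range sharing its host.

    hosts[i] is the host name for rank i. Assumes ranks on one host are
    contiguous; this is an illustrative assumption, not an MPICH.NT rule.
    """
    ranges = []
    start = 0  # first rank of the current run of identical host names
    for rank in range(1, len(hosts) + 1):
        # Close the run at the end of the list or when the host changes.
        if rank == len(hosts) or hosts[rank] != hosts[start]:
            ranges.extend([(start, rank - 1)] * (rank - start))
            start = rank
    return ranges


# Four processes on two machines, as in the table above.
for rank, (low, high) in enumerate(shm_ranges(["fry", "fry", "jazz", "jazz"])):
    print(f"rank {rank}: MPICH_SHM_LOW={low} MPICH_SHM_HIGH={high}")
```

The two-machine example earlier on this page leaves both variables unset when a host runs a single process, so a launcher could skip them whenever low equals high.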

These are the exact commands for the first example, entered at a command prompt on each machine:

On Fry

C:\Temp>set MPICH_JOBID=fry.123
C:\Temp>set MPICH_IPROC=0
C:\Temp>set MPICH_NPROC=2
C:\Temp>set MPICH_ROOT=fry:12345
C:\Temp>netpipe.exe

On Jazz

C:\Temp>set MPICH_JOBID=fry.123
C:\Temp>set MPICH_IPROC=1
C:\Temp>set MPICH_NPROC=2
C:\Temp>set MPICH_ROOT=fry:12345
C:\Temp>netpipe.exe
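
The manual steps above can also be scripted. The following Python sketch is a hypothetical stand-in for a custom launcher, not part of MPICH.NT. To stay self-contained it starts every process on the local machine, which corresponds to the single-machine shared memory example rather than the two-machine one; starting processes on remote hosts is left out. The function name `launch_local` and its parameters are assumptions made for illustration.

```python
import os
import socket
import subprocess


def launch_local(command, nproc, job_id, root_port, **popen_kwargs):
    """Start nproc copies of command on this machine, one per rank.

    Hypothetical helper: ranks are started in order, root first, each with
    the environment variables described on this page.
    """
    host = socket.gethostname()
    procs = []
    for rank in range(nproc):
        env = dict(os.environ)  # inherit PATH and other system variables
        env.update({
            "MPICH_JOBID": job_id,
            "MPICH_IPROC": str(rank),
            "MPICH_NPROC": str(nproc),
            # The port is fixed in advance, so MPICH_EXTRA is not set.
            "MPICH_ROOT": f"{host}:{root_port}",
            # All ranks share this host, so the range covers every rank.
            "MPICH_SHM_LOW": "0",
            "MPICH_SHM_HIGH": str(nproc - 1),
        })
        procs.append(subprocess.Popen(command, env=env, **popen_kwargs))
    return procs
```

Calling `launch_local(["netpipe.exe"], 2, "fry.2000", 12345)` and then waiting on each returned process would mirror the single-machine example, assuming netpipe.exe is in the current directory or on the path.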
