Intel® MPI Library Reference Manual for Linux* OS
Select the particular network fabrics to be used.
I_MPI_FABRICS=<fabric>|<intra-node fabric>:<inter-nodes fabric>
Where <fabric> := {shm, dapl, tcp, tmi, ofa, ofi}
<intra-node fabric> := {shm, dapl, tcp, tmi, ofa, ofi}
<inter-nodes fabric> := {dapl, tcp, tmi, ofa, ofi}
I_MPI_DEVICE=<device>[:<provider>] (deprecated syntax)
<fabric> | Define a network fabric
shm | Shared-memory
dapl | DAPL-capable network fabrics, such as InfiniBand*, iWarp*, Dolphin*, and XPMEM* (through DAPL*)
tcp | TCP/IP-capable network fabrics, such as Ethernet and InfiniBand* (through IPoIB*)
tmi | TMI-capable network fabrics including Intel® True Scale Fabric and Myrinet* (through Tag Matching Interface)
ofa | OFA-capable network fabrics including InfiniBand* (through OFED* verbs)
ofi | OFI (OpenFabrics Interfaces*)-capable network fabrics including Intel® True Scale Fabric and TCP (through OFI* API)
Use the <provider> specification only for the {rdma,rdssm} devices.
For example, to select the OFED* InfiniBand* device, use the following command:
$ mpiexec -n <# of processes> \
-env I_MPI_DEVICE rdssm:OpenIB-cma <executable>
For these devices, if <provider> is not specified, the first DAPL* provider in the /etc/dat.conf file is used.
Set this environment variable to select a specific fabric combination. If the requested fabric(s) is not available, Intel® MPI Library can fall back to other fabric(s). See I_MPI_FALLBACK for details. If the I_MPI_FABRICS environment variable is not defined, Intel® MPI Library selects the most appropriate fabric combination automatically.
The exact combination of fabrics depends on the number of processes started per node.
If all processes start on one node, the library uses shm intra-node communication.
If the number of started processes is less than or equal to the number of available nodes, the library uses the first available fabric from the fabrics list for inter-node communication.
For other cases, the library uses shm for intra-node communication, and the first available fabric from the fabrics list for inter-node communication. See I_MPI_FABRICS_LIST for details.
The shm fabric is available for both Intel® and non-Intel microprocessors, but it may perform additional optimizations for Intel microprocessors beyond those it performs for non-Intel microprocessors.
The combination of selected fabrics ensures that the job runs, but this combination may not provide the highest possible performance for the given cluster configuration.
For example, to select the shared-memory fabric, use the following command:
$ mpirun -n <# of processes> -env I_MPI_FABRICS shm <executable>
To select shared memory for intra-node communication and a DAPL-capable network fabric for inter-node communication, use the following command:
$ mpirun -n <# of processes> -env I_MPI_FABRICS shm:dapl <executable>
To enable Intel® MPI Library to select the most appropriate fabric combination automatically, use the following command:
$ mpirun -n <# of processes> -perhost <# of processes per host> <executable>
Set the level of debug information to 2 or higher to check which fabrics have been initialized. See I_MPI_DEBUG for details. For example:
[0] MPI startup(): shm and dapl data transfer modes
or
[0] MPI startup(): tcp data transfer mode
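To produce this output, set the debug level on the command line or in the environment. For example, a command such as the following requests debug level 2 for the run:
$ mpirun -n <# of processes> -env I_MPI_DEBUG 2 <executable>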
Define a fabrics list.
I_MPI_FABRICS_LIST=<fabrics list>
Where <fabrics list> := <fabric>,...,<fabric>
<fabric> := {dapl, tcp, tmi, ofa, ofi}
Set this environment variable to define a list of fabrics. The library uses the fabrics list to choose the most appropriate fabrics combination automatically. For more information on fabric combination, see I_MPI_FABRICS.
For example, if I_MPI_FABRICS_LIST=dapl,tcp, I_MPI_FABRICS is not defined, and the initialization of the DAPL-capable network fabric fails, the library falls back to the TCP-capable network fabric. For more information on fallback, see I_MPI_FALLBACK.
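For example, to set this fabric list explicitly (the particular list shown is illustrative), use a command such as the following:
$ mpirun -n <# of processes> -env I_MPI_FABRICS_LIST dapl,tcp <executable>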
I_MPI_FALLBACK
Enable fallback to the first available fabric.
I_MPI_FALLBACK=<arg>
Set this environment variable to control fallback to the first available fabric.
If you set I_MPI_FALLBACK to enable and an attempt to initialize a specified fabric fails, the library uses the first available fabric from the list of fabrics. See I_MPI_FABRICS_LIST for details.
If you set I_MPI_FALLBACK to disable and an attempt to initialize a specified fabric fails, the library terminates the MPI job.
If you set I_MPI_FABRICS and I_MPI_FALLBACK=enable, the library falls back to fabrics with higher numbers in the fabrics list. For example, if I_MPI_FABRICS=dapl, I_MPI_FABRICS_LIST=ofa,tmi,dapl,tcp, I_MPI_FALLBACK=enable and the initialization of DAPL-capable network fabrics fails, the library falls back to TCP-capable network fabric.
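For example, to require the dapl fabric and terminate the job rather than fall back if it cannot be initialized (the fabric choice is illustrative), use a command such as the following:
$ mpirun -n <# of processes> -env I_MPI_FABRICS shm:dapl -env I_MPI_FALLBACK disable <executable>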
Change the threshold for enabling scalable optimizations.
I_MPI_LARGE_SCALE_THRESHOLD=<nprocs>
<nprocs> | Define the scale threshold
> 0 | The default value is 4096
This environment variable defines the number of processes at which the DAPL UD IB extension is turned on automatically.
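For example, to raise the threshold so that the DAPL UD IB extension is enabled only for larger jobs (the value shown is illustrative), use a command such as the following:
$ mpirun -n <# of processes> -env I_MPI_LARGE_SCALE_THRESHOLD 8192 <executable>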
Change the eager/rendezvous message size threshold for all devices.
I_MPI_EAGER_THRESHOLD=<nbytes>
<nbytes> | Set the eager/rendezvous message size threshold
> 0 | The default <nbytes> value is equal to 262144 bytes
Set this environment variable to control the protocol used for point-to-point communication:
Messages shorter than or equal in size to <nbytes> are sent using the eager protocol.
Messages larger than <nbytes> are sent using the rendezvous protocol. The rendezvous protocol uses memory more efficiently.
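For example, to send messages of up to 512 KB with the eager protocol (the value shown is illustrative), use a command such as the following:
$ mpirun -n <# of processes> -env I_MPI_EAGER_THRESHOLD 524288 <executable>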
I_MPI_INTRANODE_EAGER_THRESHOLD
Change the eager/rendezvous message size threshold for intra-node communication mode.
I_MPI_INTRANODE_EAGER_THRESHOLD=<nbytes>
Set this environment variable to change the protocol used for communication within the node:
Messages shorter than or equal in size to <nbytes> are sent using the eager protocol.
Messages larger than <nbytes> are sent using the rendezvous protocol. The rendezvous protocol uses memory more efficiently.
If you do not set I_MPI_INTRANODE_EAGER_THRESHOLD, the value of I_MPI_EAGER_THRESHOLD is used.
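For example, to use a larger eager threshold for intra-node communication than for inter-node communication (the values shown are illustrative), use a command such as the following:
$ mpirun -n <# of processes> -env I_MPI_EAGER_THRESHOLD 262144 -env I_MPI_INTRANODE_EAGER_THRESHOLD 524288 <executable>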
I_MPI_SPIN_COUNT
Control the spin count value.
I_MPI_SPIN_COUNT=<scount>
<scount> | Define the loop spin count when polling fabric(s)
> 0 | The default <scount> value is equal to 1 when more than one process runs per processor/core. Otherwise the value equals 250. The maximum value is equal to 2147483647
Set the spin count limit. The loop for polling the fabric(s) spins <scount> times before the library releases the processes if no incoming messages are received for processing. Within every spin loop, the shm fabric (if enabled) is polled an extra I_MPI_SHM_SPIN_COUNT times. Smaller values for <scount> cause the Intel® MPI Library to release the processor more frequently.
Use the I_MPI_SPIN_COUNT environment variable for tuning application performance. The best value for <scount> can be chosen on an experimental basis. It depends on the particular computational environment and the application.
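For example, to release the processor more frequently (the value shown is illustrative), use a command such as the following:
$ mpirun -n <# of processes> -env I_MPI_SPIN_COUNT 100 <executable>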
Turn on/off scalable optimization of the network fabric communication.
I_MPI_SCALABLE_OPTIMIZATION=<arg>
<arg> | Binary indicator
enable | yes | on | 1 | Turn on scalable optimization of the network fabric communication. This is the default value for 16 or more processes
disable | no | off | 0 | Turn off scalable optimization of the network fabric communication. This is the default value for fewer than 16 processes
Set this environment variable to enable scalable optimization of the network fabric communication. In most cases, using optimization decreases latency and increases bandwidth for a large number of processes.
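For example, to turn off scalable optimization regardless of the number of processes, use a command such as the following:
$ mpirun -n <# of processes> -env I_MPI_SCALABLE_OPTIMIZATION disable <executable>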
Turn on/off wait mode.
I_MPI_WAIT_MODE=<arg>
<arg> | Binary indicator
enable | yes | on | 1 | Turn on the wait mode
disable | no | off | 0 | Turn off the wait mode. This is the default value
Set this environment variable to control the wait mode. If you enable this mode, the processes wait for receiving messages without polling the fabric(s). This mode can save CPU time for other tasks.
Use the Native POSIX Thread Library* with the wait mode for shm communications.
To check which version of the thread library is installed, use the following command:
$ getconf GNU_LIBPTHREAD_VERSION
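For example, to run the job in wait mode, use a command such as the following:
$ mpirun -n <# of processes> -env I_MPI_WAIT_MODE enable <executable>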
I_MPI_DYNAMIC_CONNECTION
(I_MPI_USE_DYNAMIC_CONNECTIONS)
Control the dynamic connection establishment.
I_MPI_DYNAMIC_CONNECTION=<arg>
I_MPI_USE_DYNAMIC_CONNECTIONS=<arg>
<arg> | Binary indicator
enable | yes | on | 1 | Turn on the dynamic connection establishment. This is the default value for 64 or more processes
disable | no | off | 0 | Turn off the dynamic connection establishment. This is the default value for fewer than 64 processes
Set this environment variable to control dynamic connection establishment.
If this mode is enabled, all connections are established at the time of the first communication between each pair of processes.
If this mode is disabled, all connections are established upfront.
The default value depends on the number of processes in the MPI job. The dynamic connection establishment is off if the total number of processes is less than 64.
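For example, to establish all connections upfront regardless of the number of processes, use a command such as the following:
$ mpirun -n <# of processes> -env I_MPI_DYNAMIC_CONNECTION disable <executable>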