Intel® MPI Library Reference Manual for Linux* OS
I_MPI_DAPL_PROVIDER
Define the DAPL provider to load.
I_MPI_DAPL_PROVIDER=<name>
<name> | Define the name of the DAPL provider to load
Set this environment variable to define the name of the DAPL provider to load. This name is also defined in the dat.conf configuration file.
I_MPI_DAT_LIBRARY
Select the DAT library to be used for the DAPL* provider.
I_MPI_DAT_LIBRARY=<library>
<library> | Specify the DAT library to be used for the DAPL provider. Default values are libdat.so or libdat.so.1 for DAPL* 1.2 providers and libdat2.so or libdat2.so.2 for DAPL* 2.0 providers
Set this environment variable to select a specific DAT library to be used for the DAPL provider. If the library is not located in the dynamic loader search path, specify the full path to the DAT library. This environment variable affects only DAPL and DAPL UD capable fabrics.
I_MPI_DAPL_TRANSLATION_CACHE
(I_MPI_RDMA_TRANSLATION_CACHE)
Turn on/off the memory registration cache in the DAPL path.
I_MPI_DAPL_TRANSLATION_CACHE=<arg>
I_MPI_RDMA_TRANSLATION_CACHE=<arg>
<arg> | Binary indicator
enable | yes | on | 1 | Turn on the memory registration cache. This is the default value
disable | no | off | 0 | Turn off the memory registration cache
Set this environment variable to turn on/off the memory registration cache in the DAPL path.
The cache substantially increases performance, but may lead to correctness issues in certain situations. See product Release Notes for further details.
I_MPI_DAPL_TRANSLATION_CACHE_AVL_TREE
Enable/disable the AVL tree* based implementation of the RDMA translation cache in the DAPL path.
I_MPI_DAPL_TRANSLATION_CACHE_AVL_TREE=<arg>
<arg> | Binary indicator
enable | yes | on | 1 | Turn on the AVL tree based RDMA translation cache
disable | no | off | 0 | Turn off the AVL tree based RDMA translation cache. This is the default value
Set this environment variable to enable the AVL tree based implementation of the RDMA translation cache in the DAPL path. When the RDMA translation cache holds more than 10,000 elements, the AVL tree based implementation is faster than the default implementation.
I_MPI_DAPL_DIRECT_COPY_THRESHOLD
(I_MPI_RDMA_EAGER_THRESHOLD, RDMA_IBA_EAGER_THRESHOLD)
Change the threshold of the DAPL direct-copy protocol.
I_MPI_DAPL_DIRECT_COPY_THRESHOLD=<nbytes>
I_MPI_RDMA_EAGER_THRESHOLD=<nbytes>
RDMA_IBA_EAGER_THRESHOLD=<nbytes>
<nbytes> | Define the DAPL direct-copy protocol threshold
> 0 | The default <nbytes> value depends on the platform
Set this environment variable to control the DAPL direct-copy protocol threshold. Data transfer algorithms for the DAPL-capable network fabrics are selected based on the following scheme:
Messages shorter than or equal to <nbytes> are sent using the eager protocol through the internal pre-registered buffers. This approach is faster for short messages.
Messages larger than <nbytes> are sent using the direct-copy protocol. It does not use any buffering but involves registration of memory on sender and receiver sides. This approach is faster for large messages.
This environment variable is available for both Intel® and non-Intel microprocessors, but it may perform additional optimizations for Intel microprocessors beyond those it performs for non-Intel microprocessors.
The equivalent of this variable for the Intel® Xeon Phi™ Coprocessor is I_MIC_MPI_DAPL_DIRECT_COPY_THRESHOLD.
I_MPI_DAPL_EAGER_MESSAGE_AGGREGATION
Control the use of concatenation for adjourned MPI send requests. Adjourned MPI send requests are those that cannot be sent immediately.
I_MPI_DAPL_EAGER_MESSAGE_AGGREGATION=<arg>
<arg> | Binary indicator
enable | yes | on | 1 | Enable the concatenation for adjourned MPI send requests
disable | no | off | 0 | Disable the concatenation for adjourned MPI send requests. This is the default value
Set this environment variable to control the use of concatenation for adjourned MPI send requests intended for the same MPI rank. In some cases, this mode can improve the performance of applications, especially when MPI_Isend() is used with short message sizes and the same destination rank, such as:
for (i = 0; i < NMSG; i++)
{
    ret = MPI_Isend(sbuf[i], MSG_SIZE, datatype, dest, tag,
                    comm, &req_send[i]);
}
I_MPI_DAPL_DYNAMIC_CONNECTION_MODE
(I_MPI_DYNAMIC_CONNECTION_MODE, I_MPI_DYNAMIC_CONNECTIONS_MODE)
Choose the algorithm for establishing the DAPL* connections.
I_MPI_DAPL_DYNAMIC_CONNECTION_MODE=<arg>
I_MPI_DYNAMIC_CONNECTION_MODE=<arg>
I_MPI_DYNAMIC_CONNECTIONS_MODE=<arg>
<arg> | Mode selector
reject | Deny one of the two simultaneous connection requests. This is the default value
disconnect | Deny one of the two simultaneous connection requests after both connections have been established
Set this environment variable to choose the algorithm for handling dynamically established connections for DAPL-capable fabrics according to the following scheme:
In the reject mode, if two processes initiate the connection simultaneously, one of the requests is rejected.
In the disconnect mode, both connections are established, but then one is disconnected. The disconnect mode is provided to avoid a bug in certain DAPL* providers.
I_MPI_DAPL_SCALABLE_PROGRESS
(I_MPI_RDMA_SCALABLE_PROGRESS)
Turn on/off the scalable algorithm for DAPL read progress.
I_MPI_DAPL_SCALABLE_PROGRESS=<arg>
I_MPI_RDMA_SCALABLE_PROGRESS=<arg>
<arg> | Binary indicator
enable | yes | on | 1 | Turn on the scalable algorithm. This is the default value when the number of processes is larger than 128
disable | no | off | 0 | Turn off the scalable algorithm. This is the default value when the number of processes is less than or equal to 128
Set this environment variable to enable the scalable algorithm for the DAPL read progress. In some cases, this provides performance advantages for systems with many processes.
I_MPI_DAPL_BUFFER_NUM
(I_MPI_RDMA_BUFFER_NUM, NUM_RDMA_BUFFER)
Change the number of internal pre-registered buffers for each process pair in the DAPL path.
I_MPI_DAPL_BUFFER_NUM=<nbuf>
I_MPI_RDMA_BUFFER_NUM=<nbuf>
NUM_RDMA_BUFFER=<nbuf>
<nbuf> | Define the number of buffers for each pair in a process group
> 0 | The default value depends on the platform
Set this environment variable to change the number of the internal pre-registered buffers for each process pair in the DAPL path.
The more pre-registered buffers are available, the more memory is used for every established connection.
I_MPI_DAPL_BUFFER_SIZE
(I_MPI_RDMA_BUFFER_SIZE, I_MPI_RDMA_VBUF_TOTAL_SIZE)
Change the size of internal pre-registered buffers for each process pair in the DAPL path.
I_MPI_DAPL_BUFFER_SIZE=<nbytes>
I_MPI_RDMA_BUFFER_SIZE=<nbytes>
I_MPI_RDMA_VBUF_TOTAL_SIZE=<nbytes>
<nbytes> | Define the size of pre-registered buffers
> 0 | The default value depends on the platform
Set this environment variable to define the size of the internal pre-registered buffer for each process pair in the DAPL path. The actual size is calculated by adjusting the <nbytes> to align the buffer to an optimal value.
I_MPI_DAPL_RNDV_BUFFER_ALIGNMENT
(I_MPI_RDMA_RNDV_BUFFER_ALIGNMENT, I_MPI_RDMA_RNDV_BUF_ALIGN)
Define the alignment of the sending buffer for the DAPL direct-copy transfers.
I_MPI_DAPL_RNDV_BUFFER_ALIGNMENT=<arg>
I_MPI_RDMA_RNDV_BUFFER_ALIGNMENT=<arg>
I_MPI_RDMA_RNDV_BUF_ALIGN=<arg>
<arg> | Define the alignment for the sending buffer
> 0 and a power of 2 | The default value is 64
Set this environment variable to define the alignment of the sending buffer for DAPL direct-copy transfers. When a buffer specified in a DAPL operation is aligned to an optimal value, the data transfer bandwidth may be increased.
I_MPI_DAPL_RDMA_RNDV_WRITE
(I_MPI_RDMA_RNDV_WRITE, I_MPI_USE_RENDEZVOUS_RDMA_WRITE)
Turn on/off the RDMA Write-based rendezvous direct-copy protocol in the DAPL path.
I_MPI_DAPL_RDMA_RNDV_WRITE=<arg>
I_MPI_RDMA_RNDV_WRITE=<arg>
I_MPI_USE_RENDEZVOUS_RDMA_WRITE=<arg>
<arg> | Binary indicator
enable | yes | on | 1 | Turn on the RDMA Write rendezvous direct-copy protocol
disable | no | off | 0 | Turn off the RDMA Write rendezvous direct-copy protocol
Set this environment variable to select the RDMA Write-based rendezvous direct-copy protocol in the DAPL path. Certain DAPL* providers have a slow RDMA Read implementation on certain platforms. Switching on the rendezvous direct-copy protocol based on the RDMA Write operation can increase performance in these cases. The default value depends on the DAPL provider attributes.
I_MPI_DAPL_CHECK_MAX_RDMA_SIZE
(I_MPI_RDMA_CHECK_MAX_RDMA_SIZE)
Check the value of the DAPL attribute, max_rdma_size.
I_MPI_DAPL_CHECK_MAX_RDMA_SIZE=<arg>
I_MPI_RDMA_CHECK_MAX_RDMA_SIZE=<arg>
<arg> | Binary indicator
enable | yes | on | 1 | Check the value of the DAPL* attribute max_rdma_size
disable | no | off | 0 | Do not check the value of the DAPL* attribute max_rdma_size. This is the default value
Set this environment variable to control message fragmentation according to the following scheme:
If this mode is enabled, the Intel® MPI Library fragments messages larger than the value of the DAPL attribute max_rdma_size.
If this mode is disabled, the Intel® MPI Library does not take the value of the DAPL attribute max_rdma_size into account for message fragmentation.
I_MPI_DAPL_MAX_MSG_SIZE
(I_MPI_RDMA_MAX_MSG_SIZE)
Control the message fragmentation threshold.
I_MPI_DAPL_MAX_MSG_SIZE=<nbytes>
I_MPI_RDMA_MAX_MSG_SIZE=<nbytes>
Set this environment variable to control message fragmentation size according to the following scheme:
If the I_MPI_DAPL_CHECK_MAX_RDMA_SIZE environment variable is set to disable, the Intel® MPI Library fragments messages whose sizes are greater than <nbytes>.
If the I_MPI_DAPL_CHECK_MAX_RDMA_SIZE environment variable is set to enable, the Intel® MPI Library fragments messages whose sizes are greater than the minimum of <nbytes> and the max_rdma_size DAPL* attribute value.
I_MPI_DAPL_CONN_EVD_SIZE
(I_MPI_RDMA_CONN_EVD_SIZE, I_MPI_CONN_EVD_QLEN)
Define the event queue size of the DAPL event dispatcher for connections.
I_MPI_DAPL_CONN_EVD_SIZE=<size>
I_MPI_RDMA_CONN_EVD_SIZE=<size>
I_MPI_CONN_EVD_QLEN=<size>
<size> | Define the length of the event queue
> 0 | The default value is 2 * (number of processes in the MPI job) + 32
Set this environment variable to define the event queue size of the DAPL event dispatcher that handles connection related events. If this environment variable is set, the minimum of <size> and the value obtained from the provider is used as the size of the event queue. The provider is required to supply a queue size that is equal to or larger than the calculated value.
I_MPI_DAPL_SR_THRESHOLD
Change the threshold for switching from send/recv to the RDMA path in DAPL wait mode.
I_MPI_DAPL_SR_THRESHOLD=<nbytes>
<nbytes> | Define the message size threshold for switching from send/recv to RDMA
>= 0 | The default <nbytes> value is 256 bytes
Set this environment variable to control the protocol used for point-to-point communication in DAPL wait mode:
Messages shorter than or equal in size to <nbytes> are sent using DAPL send/recv data transfer operations.
Messages greater in size than <nbytes> are sent using DAPL RDMA WRITE or RDMA WRITE immediate data transfer operations.
I_MPI_DAPL_SR_BUF_NUM
Change the number of internal pre-registered buffers for each process pair used in DAPL wait mode for the send/recv path.
I_MPI_DAPL_SR_BUF_NUM=<nbuf>
<nbuf> | Define the number of send/recv buffers for each pair in a process group
> 0 | The default value is 32
Set this environment variable to change the number of the internal send/recv pre-registered buffers for each process pair.
I_MPI_DAPL_RDMA_WRITE_IMM
(I_MPI_RDMA_WRITE_IMM)
Enable/disable the RDMA Write with immediate data InfiniBand* (IB) extension in DAPL wait mode.
I_MPI_DAPL_RDMA_WRITE_IMM=<arg>
I_MPI_RDMA_WRITE_IMM=<arg>
<arg> | Binary indicator
enable | yes | on | 1 | Turn on the RDMA Write with immediate data IB extension
disable | no | off | 0 | Turn off the RDMA Write with immediate data IB extension
Set this environment variable to utilize RDMA Write with immediate data IB extension. The algorithm is enabled if this environment variable is set and a certain DAPL provider attribute indicates that RDMA Write with immediate data IB extension is supported.
I_MPI_DAPL_DESIRED_STATIC_CONNECTIONS_NUM
Define the number of processes that establish DAPL static connections at the same time.
I_MPI_DAPL_DESIRED_STATIC_CONNECTIONS_NUM=<num_processes>
<num_processes> | Define the number of processes that establish DAPL static connections at the same time
> 0 | The default <num_processes> value is equal to 256
Set this environment variable to control the algorithm of DAPL static connection establishment.
If the number of processes in the MPI job is less than or equal to <num_processes>, all MPI processes establish the static connections simultaneously. Otherwise, the processes are distributed into several groups. The number of processes in each group is calculated to be close to <num_processes>. Then static connections are established in several iterations, including intergroup connection setup.
I_MPI_CHECK_DAPL_PROVIDER_COMPATIBILITY
Enable/disable the check that the same DAPL provider is selected by all ranks.
I_MPI_CHECK_DAPL_PROVIDER_COMPATIBILITY=<arg>
<arg> | Binary indicator
enable | yes | on | 1 | Turn on the check that the DAPL provider is the same on all ranks. This is the default value
disable | no | off | 0 | Turn off the check that the DAPL provider is the same on all ranks
Set this environment variable to check whether the same DAPL provider is selected by all MPI ranks. If this check is enabled, the Intel® MPI Library verifies the name of the DAPL provider and the version of DAPL. If these parameters are not the same on all ranks, the Intel MPI Library does not select the RDMA path and may fall back to sockets. Turning off the check reduces the execution time of MPI_Init(), which may be significant for MPI jobs with a large number of processes.