I_MPI_SHM_CACHE_BYPASS
(I_MPI_CACHE_BYPASS)
Control the message transfer algorithm for the shared memory.
I_MPI_SHM_CACHE_BYPASS=<arg>
I_MPI_CACHE_BYPASS=<arg>
<arg>                      Binary indicator
enable | yes | on | 1      Enable message transfer bypass cache. This is the default value
disable | no | off | 0     Disable message transfer bypass cache
Set this environment variable to enable/disable the message transfer bypass cache for shared memory. When you enable this feature, the Intel® MPI Library sends messages greater than or equal in size to the value specified by the I_MPI_SHM_CACHE_BYPASS_THRESHOLDS environment variable through the bypass cache. This feature is enabled by default.
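For example, to disable the bypass cache for a single run (the application name and process count below are placeholders):

$ export I_MPI_SHM_CACHE_BYPASS=disable
$ mpirun -n 4 ./myprog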
I_MPI_SHM_CACHE_BYPASS_THRESHOLDS
(I_MPI_CACHE_BYPASS_THRESHOLDS)
Set the message copying algorithm threshold.
I_MPI_SHM_CACHE_BYPASS_THRESHOLDS=<nb_send>,<nb_recv>[,<nb_send_pk>,<nb_recv_pk>]
I_MPI_CACHE_BYPASS_THRESHOLDS=<nb_send>,<nb_recv>[,<nb_send_pk>,<nb_recv_pk>]
<nb_send>       Set the threshold for sent messages when processes are pinned on cores located in different physical processor packages, or when processes are not pinned
<nb_recv>       Set the threshold for received messages when processes are pinned on cores located in different physical processor packages, or when processes are not pinned
<nb_send_pk>    Set the threshold for sent messages when processes are pinned on cores located in the same physical processor package
<nb_recv_pk>    Set the threshold for received messages when processes are pinned on cores located in the same physical processor package
Set this environment variable to control the thresholds for the message copying algorithm. The Intel® MPI Library uses different message copying implementations, each optimized for a different level of the memory hierarchy. The library copies messages greater than or equal in size to the defined threshold value using a copying algorithm optimized for far memory access. A value of -1 disables these algorithms. The default values depend on the architecture and may vary among Intel® MPI Library versions. This environment variable is valid only when I_MPI_SHM_CACHE_BYPASS is enabled.
This environment variable is available for both Intel and non-Intel microprocessors, but it may perform additional optimizations for Intel microprocessors that it does not perform for non-Intel microprocessors.
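As an illustration, the following settings apply the far-memory-optimized copy to messages of 16384 bytes or larger for processes in different packages and disable it (-1) for processes pinned within the same package. The threshold values and application name are placeholders, not recommended defaults:

$ export I_MPI_SHM_CACHE_BYPASS=enable
$ export I_MPI_SHM_CACHE_BYPASS_THRESHOLDS=16384,16384,-1,-1
$ mpirun -n 8 ./myprog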
I_MPI_SHM_FBOX
Control the usage of the shared memory fast-boxes.
I_MPI_SHM_FBOX=<arg>
<arg>                      Binary indicator
enable | yes | on | 1      Turn on fast box usage. This is the default value.
disable | no | off | 0     Turn off fast box usage.
Set this environment variable to control the usage of fast-boxes. Each pair of MPI processes on the same computing node has two shared memory fast-boxes, for sending and receiving eager messages.
Turn off the usage of fast-boxes to avoid the overhead of message synchronization when the application uses mass transfer of short non-blocking messages.
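For example, to turn off the fast-boxes for an application that mostly posts many short non-blocking sends (the process count and application name are placeholders):

$ mpirun -genv I_MPI_SHM_FBOX disable -n 16 ./myprog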
I_MPI_SHM_FBOX_SIZE
Set the size of the shared memory fast-boxes.
I_MPI_SHM_FBOX_SIZE=<nbytes>
<nbytes>    The size of shared memory fast-boxes in bytes
> 0         The default <nbytes> value depends on the specific platform you use. The value typically ranges from 8 KB to 64 KB.
Set this environment variable to define the size of shared memory fast-boxes.
I_MPI_SHM_CELL_NUM
Change the number of cells in the shared memory receiving queue.
I_MPI_SHM_CELL_NUM=<num>
<num>    The number of shared memory cells
> 0      The default value is 128
Set this environment variable to define the number of cells in the shared memory receive queue. Each MPI process has its own shared memory receive queue, where other processes place eager messages. The queue is used when the shared memory fast-boxes are blocked by another MPI request.
I_MPI_SHM_CELL_SIZE
Change the size of a shared memory cell.
I_MPI_SHM_CELL_SIZE=<nbytes>
<nbytes>    The size of a shared memory cell in bytes
> 0         The default <nbytes> value depends on the specific platform you use. The value typically ranges from 8 KB to 64 KB.
Set this environment variable to define the size of shared memory cells.
If you set this environment variable, I_MPI_INTRANODE_EAGER_THRESHOLD is also changed and becomes equal to the given value.
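The following sketch combines the fast-box and cell settings described above; the values are illustrative only, and the application name is a placeholder. Note that this I_MPI_SHM_CELL_SIZE setting also raises I_MPI_INTRANODE_EAGER_THRESHOLD to 65536 bytes:

$ export I_MPI_SHM_FBOX_SIZE=65536
$ export I_MPI_SHM_CELL_NUM=256
$ export I_MPI_SHM_CELL_SIZE=65536
$ mpirun -n 32 ./myprog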
I_MPI_SHM_LMT
Control the usage of the large message transfer (LMT) mechanism for the shared memory.
I_MPI_SHM_LMT=<arg>
<arg>                      Mechanism selector
shm                        Turn on the shared memory copy LMT mechanism.
direct                     Turn on the direct copy LMT mechanism. This is the default value.
disable | no | off | 0     Turn off the LMT mechanism.
Set this environment variable to control the usage of the large message transfer (LMT) mechanism. To transfer rendezvous messages, you can use the LMT mechanism by employing either of the following implementations:
Use intermediate shared memory queues to send messages.
Use the direct copy mechanism, which transfers messages without an intermediate buffer if the Linux* kernel version is higher than 3.2 and supports the cross memory attach (CMA) feature. If you set the I_MPI_SHM_LMT environment variable to direct but the operating system does not support CMA, the shm LMT mechanism runs.
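For example, to force the shared memory copy implementation even on systems where CMA is available (the process count and application name are placeholders):

$ export I_MPI_SHM_LMT=shm
$ mpirun -n 8 ./myprog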
I_MPI_SHM_LMT_BUFFER_NUM
(I_MPI_SHM_NUM_BUFFERS)
Change the number of shared memory buffers for the large message transfer (LMT) mechanism.
I_MPI_SHM_LMT_BUFFER_NUM=<num>
I_MPI_SHM_NUM_BUFFERS=<num>
<num>    The number of shared memory buffers for each process pair
> 0      The default value is 8
Set this environment variable to define the number of shared memory buffers between each process pair.
I_MPI_SHM_LMT_BUFFER_SIZE
(I_MPI_SHM_BUFFER_SIZE)
Change the size of shared memory buffers for the LMT mechanism.
I_MPI_SHM_LMT_BUFFER_SIZE=<nbytes>
I_MPI_SHM_BUFFER_SIZE=<nbytes>
<nbytes>    The size of shared memory buffers in bytes
> 0         The default <nbytes> value is equal to 32768 bytes
Set this environment variable to define the size of shared memory buffers for each pair of processes.
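The following sketch tunes both LMT buffer settings together; the values are illustrative, not recommended defaults, and the application name is a placeholder:

$ export I_MPI_SHM_LMT_BUFFER_NUM=16
$ export I_MPI_SHM_LMT_BUFFER_SIZE=65536
$ mpirun -n 8 ./myprog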
I_MPI_SSHM
Control the usage of the scalable shared memory mechanism.
I_MPI_SSHM=<arg>
<arg>                      Binary indicator
enable | yes | on | 1      Turn on the usage of this mechanism
disable | no | off | 0     Turn off the usage of this mechanism. This is the default value
Set this environment variable to control the usage of an alternative shared memory mechanism. This mechanism replaces the shared memory fast-boxes, receive queues and LMT mechanism.
If you set this environment variable, the I_MPI_INTRANODE_EAGER_THRESHOLD environment variable is changed and becomes equal to 262,144 bytes.
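For example, to switch to the scalable shared memory mechanism (the process count and application name are placeholders); remember that this also sets I_MPI_INTRANODE_EAGER_THRESHOLD to 262,144 bytes:

$ export I_MPI_SSHM=enable
$ mpirun -n 64 ./myprog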
I_MPI_SSHM_BUFFER_NUM
Change the number of shared memory buffers for the alternative shared memory mechanism.
I_MPI_SSHM_BUFFER_NUM=<num>
<num>    The number of shared memory buffers for each process pair
> 0      The default value is 4
Set this environment variable to define the number of shared memory buffers between each process pair.
I_MPI_SSHM_BUFFER_SIZE
Change the size of shared memory buffers for the alternative shared memory mechanism.
I_MPI_SSHM_BUFFER_SIZE=<nbytes>
<nbytes>    The size of shared memory buffers in bytes
> 0         The default <nbytes> value depends on the specific platform you use. The value typically ranges from 8 KB to 64 KB.
Set this environment variable to define the size of shared memory buffers for each pair of processes.
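A combined sketch for the alternative mechanism and its buffer settings; the buffer values are illustrative and the application name is a placeholder:

$ export I_MPI_SSHM=enable
$ export I_MPI_SSHM_BUFFER_NUM=8
$ export I_MPI_SSHM_BUFFER_SIZE=32768
$ mpirun -n 16 ./myprog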
I_MPI_SSHM_DYNAMIC_CONNECTION
Control the dynamic connection establishment for the alternative shared memory mechanism.
I_MPI_SSHM_DYNAMIC_CONNECTION=<arg>
<arg>                      Binary indicator
enable | yes | on | 1      Turn on the dynamic connection establishment
disable | no | off | 0     Turn off the dynamic connection establishment. This is the default value
Set this environment variable to control the dynamic connection establishment.
If this mode is enabled, all connections are established at the time of the first communication between each pair of processes.
If this mode is disabled, all connections are established upfront.
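For example, to defer connection setup until the first communication between each pair of processes (the process count and application name are placeholders):

$ export I_MPI_SSHM=enable
$ export I_MPI_SSHM_DYNAMIC_CONNECTION=enable
$ mpirun -n 128 ./myprog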
I_MPI_SHM_BYPASS
(I_MPI_INTRANODE_SHMEM_BYPASS, I_MPI_USE_DAPL_INTRANODE)
Turn on/off the intra-node communication mode through network fabric along with shm.
I_MPI_SHM_BYPASS=<arg>
I_MPI_INTRANODE_SHMEM_BYPASS=<arg>
I_MPI_USE_DAPL_INTRANODE=<arg>
<arg>                      Binary indicator
enable | yes | on | 1      Turn on the intra-node communication through network fabric
disable | no | off | 0     Turn off the intra-node communication through network fabric. This is the default
Set this environment variable to specify the communication mode within the node. If the intra-node communication mode through network fabric is enabled, data transfer algorithms are selected according to the following scheme:
Messages shorter than or equal in size to the threshold value of the I_MPI_INTRANODE_EAGER_THRESHOLD environment variable are transferred using shared memory.
Messages larger than the threshold value of the I_MPI_INTRANODE_EAGER_THRESHOLD environment variable are transferred through the network fabric layer.
This environment variable is applicable only when you turn on shared memory and a network fabric either by default or by setting the I_MPI_FABRICS environment variable to shm:<fabric> or an equivalent I_MPI_DEVICE setting. This mode is available only for dapl and tcp fabrics.
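As an illustration, the following settings route intra-node messages larger than 32768 bytes through the DAPL fabric while smaller messages stay in shared memory; the threshold value, fabric choice, and application name are placeholders:

$ export I_MPI_FABRICS=shm:dapl
$ export I_MPI_SHM_BYPASS=enable
$ export I_MPI_INTRANODE_EAGER_THRESHOLD=32768
$ mpirun -n 16 ./myprog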
I_MPI_SHM_SPIN_COUNT
Control the spin count value for the shared memory fabric.
I_MPI_SHM_SPIN_COUNT=<shm_scount>
<shm_scount>    Define the spin count of the loop when polling the shm fabric
> 0             When internode communication uses the tcp fabric, the default <shm_scount> value is equal to 100 spins. When internode communication uses the ofa, tmi, ofi, or dapl fabric, the default <shm_scount> value is equal to 10 spins. The maximum value is equal to 2147483647.
Set the spin count limit of the shared memory fabric to increase the frequency of polling. This configuration allows polling of the shm fabric <shm_scount> times before control is passed to the overall network fabric polling mechanism.
To tune application performance, use the I_MPI_SHM_SPIN_COUNT environment variable. The best value for <shm_scount> can be chosen on an experimental basis. It depends largely on the application and the particular computation environment. An increase in the <shm_scount> value benefits multi-core platforms when the application uses topological algorithms for message passing.
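For example, to try a larger spin count on a multi-core node (the value and application name are placeholders to be tuned experimentally):

$ export I_MPI_SHM_SPIN_COUNT=1000
$ mpirun -n 32 ./myprog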