Intel® MPI Library Reference Manual for Linux* OS
The Intel® MPI Library has a built-in statistics gathering facility that collects essential performance data without disturbing the application execution. The collected information is sent to a text file. This section describes the environment variables used to control the built-in statistics gathering facility, and provides example output files.
In addition to using the environment variables, you can collect native statistics with MPI Performance Snapshot by using the -mps option. For example:
$ mpirun -mps -n 2 ./myApp
See the description of -mps for more details.
Control statistics collection. The values described below extend the set of values accepted by the I_MPI_STATS environment variable.
n, m | Possible stats levels of the output information
1 | Output the amount of data sent by each process
2 | Output the number of calls and amount of transferred data
3 | Output statistics combined according to the actual arguments
4 | Output statistics defined by a buckets list
10 | Output collective operation statistics for all communication contexts
20 | Output additional time information for all MPI functions
Set this environment variable to control the amount of statistics information collected and the output to the log file. No statistics are output by default.
n and m are positive integers that define the range of output information. Statistics from level n through level m inclusive are output. If n is not provided, the default value is 1.
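For example, assuming the [n-]m range form implied by the description above, the following sketch collects statistics from level 1 through level 4 for the ./myApp application used earlier:
$ export I_MPI_STATS=4
$ mpirun -n 2 ./myApp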
Select the subsystem(s) to collect statistics for.
I_MPI_STATS_SCOPE="<subsystem>[:<ops>][;<subsystem>[:<ops>][...]]"
<subsystem> | Define the target subsystem(s)
all | Collect statistics data for all operations. This is the default value
coll | Collect statistics data for all collective operations
p2p | Collect statistics data for all point-to-point operations
<ops> | Define the target operations as a comma separated list
Allgather | MPI_Allgather
Iallgather | MPI_Iallgather
Allgatherv | MPI_Allgatherv
Iallgatherv | MPI_Iallgatherv
Allreduce | MPI_Allreduce
Iallreduce | MPI_Iallreduce
Alltoall | MPI_Alltoall
Ialltoall | MPI_Ialltoall
Alltoallv | MPI_Alltoallv
Ialltoallv | MPI_Ialltoallv
Alltoallw | MPI_Alltoallw
Ialltoallw | MPI_Ialltoallw
Barrier | MPI_Barrier
Ibarrier | MPI_Ibarrier
Bcast | MPI_Bcast
Ibcast | MPI_Ibcast
Exscan | MPI_Exscan
Iexscan | MPI_Iexscan
Gather | MPI_Gather
Igather | MPI_Igather
Gatherv | MPI_Gatherv
Igatherv | MPI_Igatherv
Reduce_scatter | MPI_Reduce_scatter
Ireduce_scatter | MPI_Ireduce_scatter
Reduce | MPI_Reduce
Ireduce | MPI_Ireduce
Scan | MPI_Scan
Iscan | MPI_Iscan
Scatter | MPI_Scatter
Iscatter | MPI_Iscatter
Scatterv | MPI_Scatterv
Iscatterv | MPI_Iscatterv
Send | Standard transfers (MPI_Send, MPI_Isend, MPI_Send_init)
Sendrecv | Send-receive transfers (MPI_Sendrecv, MPI_Sendrecv_replace)
Bsend | Buffered transfers (MPI_Bsend, MPI_Ibsend, MPI_Bsend_init)
Csend | Point-to-point operations inside the collectives. This internal operation serves all collectives
Csendrecv | Point-to-point send-receive operations inside the collectives. This internal operation serves all collectives
Rsend | Ready transfers (MPI_Rsend, MPI_Irsend, MPI_Rsend_init)
Ssend | Synchronous transfers (MPI_Ssend, MPI_Issend, MPI_Ssend_init)
Set this environment variable to select the target subsystem in which to collect statistics. All collective and point-to-point operations, including the point-to-point operations performed inside the collectives, are covered by default.
The default settings are equivalent to:
I_MPI_STATS_SCOPE="coll;p2p"
Use the following settings to collect statistics for the MPI_Bcast, MPI_Reduce, and all point-to-point operations:
I_MPI_STATS_SCOPE="p2p;coll:bcast,reduce"
Use the following settings to collect statistics for the point-to-point operations inside the collectives:
I_MPI_STATS_SCOPE=p2p:csend
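For example, to apply one of these scope settings on the command line (a sketch reusing the ./myApp application from the earlier example), you could run:
$ export I_MPI_STATS=10
$ export I_MPI_STATS_SCOPE="p2p;coll:bcast,reduce"
$ mpirun -n 2 ./myApp
Level 10 here also enables the per-context collective statistics described in the level table above.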
Identify a list of ranges for message sizes and communicator sizes that are used for collecting statistics.
I_MPI_STATS_BUCKETS=<msg>[@<proc>][,<msg>[@<proc>]]...
<msg> | Specify range of message sizes in bytes
<l> | Single value of message size
<l>-<m> | Range from <l> to <m>
<proc> | Specify range of processes (ranks) for collective operations
<p> | Single value of communicator size
<p>-<q> | Range from <p> to <q>
Set the I_MPI_STATS_BUCKETS environment variable to define a set of ranges for message sizes and communicator sizes.
Level 4 of the statistics provides profile information for these ranges.
If the I_MPI_STATS_BUCKETS environment variable is not set, level 4 statistics are not gathered.
If a range is not specified, the maximum possible range is assumed.
To specify short messages (from 0 to 1000 bytes) and long messages (from 50000 to 100000 bytes), use the following setting:
-env I_MPI_STATS_BUCKETS 0-1000,50000-100000
To specify messages that have 16 bytes in size and circulate within four process communicators, use the following setting:
-env I_MPI_STATS_BUCKETS "16@4"
When the @ symbol is present, the environment variable value must be enclosed in quotes.
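For example, to combine both kinds of ranges in one setting (a sketch that reuses the hypothetical ./myApp application and follows the <msg>[@<proc>] syntax above; the value is quoted because it contains the @ symbol):
$ mpirun -env I_MPI_STATS 4 -env I_MPI_STATS_BUCKETS "0-1000@2-4,65536" -n 4 ./myApp
This collects level 4 statistics for messages of 0 to 1000 bytes inside communicators of 2 to 4 processes, and for 65536-byte messages in communicators of any size.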
Define the statistics output file name.
I_MPI_STATS_FILE=<name>
<name> | Define the statistics output file name
Set this environment variable to define the statistics output file. By default, the stats.txt file is created in the current directory.
If this variable is not set and the statistics output file already exists, an index is appended to its name. For example, if stats.txt exists, the created statistics output file is named as stats(2).txt; if stats(2).txt exists, the created file is named as stats(3).txt, and so on.
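For example, to direct the statistics to a custom file (myApp_stats.txt is a hypothetical name chosen for this sketch):
$ export I_MPI_STATS=4
$ export I_MPI_STATS_FILE=myApp_stats.txt
$ mpirun -n 2 ./myApp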
The statistics data is blocked and ordered according to the process ranks in the MPI_COMM_WORLD communicator. The timing data is presented in microseconds. For example, with the following settings:
I_MPI_STATS_SCOPE="p2p;coll:allreduce"
The statistics output for a simple program that performs only one MPI_Allreduce operation may look as follows:
Intel(R) MPI Library Version 5.1
____ MPI Communication Statistics ____
Stats level: 4
P2P scope:< FULL >
Collectives scope:< Allreduce >

~~~~ Process 0 of 2 on node svlmpihead01 lifetime = 414.13

Data Transfers
Src         Dst          Amount(MB)  Transfers
-----------------------------------------
000 --> 000       0.000000e+00          0
000 --> 001       7.629395e-06          2
=========================================
Totals            7.629395e-06          2

Communication Activity
Operation         Volume(MB)        Calls
-----------------------------------------
P2P
Csend             7.629395e-06          2
Csendrecv         0.000000e+00          0
Send              0.000000e+00          0
Sendrecv          0.000000e+00          0
Bsend             0.000000e+00          0
Rsend             0.000000e+00          0
Ssend             0.000000e+00          0
Collectives
Allreduce         7.629395e-06          2
=========================================

Communication Activity by actual args
P2P
Operation         Dst   Message size   Calls
---------------------------------------------
Csend         1     1              4       2
Collectives
Operation     Context  Algo  Comm size  Message size  Calls  Cost(%)
-------------------------------------------------------------------------------------
Allreduce     1        0     1          2             4      2      44.96
============================================================================

~~~~ Process 1 of 2 on node svlmpihead01 lifetime = 306.13

Data Transfers
Src         Dst          Amount(MB)  Transfers
-----------------------------------------
001 --> 000       7.629395e-06          2
001 --> 001       0.000000e+00          0
=========================================
Totals            7.629395e-06          2

Communication Activity
Operation         Volume(MB)        Calls
-----------------------------------------
P2P
Csend             7.629395e-06          2
Csendrecv         0.000000e+00          0
Send              0.000000e+00          0
Sendrecv          0.000000e+00          0
Bsend             0.000000e+00          0
Rsend             0.000000e+00          0
Ssend             0.000000e+00          0
Collectives
Allreduce         7.629395e-06          2
=========================================

Communication Activity by actual args
P2P
Operation         Dst   Message size   Calls
---------------------------------------------
Csend         1     0              4       2
Collectives
Operation     Context  Comm size  Message size  Calls  Cost(%)
------------------------------------------------------------------------
Allreduce     1        0          2             4      2      37.93
========================================================================

____ End of stats.txt file ____
In the example above: