Intel® MPI Library Reference Manual for Linux* OS
The Intel® MPI Library supports the integrated performance monitoring (IPM) summary format as part of the built-in statistics gathering mechanism described above. Collecting this information requires no source code modification and no re-linking of your application.
The I_MPI_STATS_BUCKETS environment variable is not applicable to the IPM format. The I_MPI_STATS_ACCURACY environment variable is available to control additional functionality, as described below.
Control the statistics data output format.
I_MPI_STATS=<level>
<level>       Level of statistics data
ipm           Summary data throughout all regions
ipm:terse     Basic summary data
Set this environment variable to ipm to get statistics output that contains region summaries. Set this environment variable to ipm:terse to get brief statistics output.
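For example, assuming a four-process run of an executable named ./a.out (both values are illustrative), the variable can be exported before launching the application:
export I_MPI_STATS=ipm
mpiexec -n 4 ./a.out
The commands at the end of this section show the equivalent setting through the -env option of mpiexec.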
Define the output file name.
I_MPI_STATS_FILE=<name>
<name>        File name for statistics data gathering
Set this environment variable to change the statistics output file name from the default name of stats.ipm.
If this variable is not set and the statistics output file already exists, an index is appended to its name. For example, if stats.ipm exists, the created statistics output file is named as stats(2).ipm; if stats(2).ipm exists, the created file is named as stats(3).ipm, and so on.
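For example, a run that writes the full IPM summary to mystats.ipm instead of the default stats.ipm might look as follows (the file and executable names are illustrative):
mpiexec -n 4 -env I_MPI_STATS ipm -env I_MPI_STATS_FILE mystats.ipm ./a.out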
Define a semicolon-separated list of subsets of MPI functions for statistics gathering.
I_MPI_STATS_SCOPE="<subset>[;<subset>[;…]]"
<subset>      Target subset
all2all       Collect statistics data for all-to-all function types
all2one       Collect statistics data for all-to-one function types
attr          Collect statistics data for attribute control functions
comm          Collect statistics data for communicator control functions
err           Collect statistics data for error handling functions
group         Collect statistics data for group support functions
init          Collect statistics data for initialize/finalize functions
io            Collect statistics data for input/output support functions
one2all       Collect statistics data for one-to-all function types
recv          Collect statistics data for receive functions
req           Collect statistics data for request support functions
rma           Collect statistics data for one-sided communication functions
scan          Collect statistics data for scan collective functions
send          Collect statistics data for send functions
sendrecv      Collect statistics data for send/receive functions
serv          Collect statistics data for additional service functions
spawn         Collect statistics data for dynamic process functions
status        Collect statistics data for status control functions
sync          Collect statistics data for barrier synchronization functions
time          Collect statistics data for timing support functions
topo          Collect statistics data for topology support functions
type          Collect statistics data for data type support functions
Use this environment variable to select one or more subsets of MPI functions for statistics gathering, as specified in the following table. The union of all subsets is used by default. A usage example follows the table.
Table 3.5-1 Stats Subsets of MPI Functions
all2all
    MPI_Allgather MPI_Allgatherv MPI_Allreduce MPI_Alltoall MPI_Alltoallv MPI_Alltoallw
    MPI_Reduce_scatter MPI_Iallgather MPI_Iallgatherv MPI_Iallreduce MPI_Ialltoall
    MPI_Ialltoallv MPI_Ialltoallw MPI_Ireduce_scatter MPI_Ireduce_scatter_block

all2one
    MPI_Gather MPI_Gatherv MPI_Reduce MPI_Igather MPI_Igatherv MPI_Ireduce

attr
    MPI_Comm_create_keyval MPI_Comm_delete_attr MPI_Comm_free_keyval MPI_Comm_get_attr
    MPI_Comm_set_attr MPI_Comm_get_name MPI_Comm_set_name MPI_Type_create_keyval
    MPI_Type_delete_attr MPI_Type_free_keyval MPI_Type_get_attr MPI_Type_get_name
    MPI_Type_set_attr MPI_Type_set_name MPI_Win_create_keyval MPI_Win_delete_attr
    MPI_Win_free_keyval MPI_Win_get_attr MPI_Win_get_name MPI_Win_set_attr
    MPI_Win_set_name MPI_Get_processor_name

comm
    MPI_Comm_compare MPI_Comm_create MPI_Comm_dup MPI_Comm_free MPI_Comm_get_name
    MPI_Comm_group MPI_Comm_rank MPI_Comm_remote_group MPI_Comm_remote_size
    MPI_Comm_set_name MPI_Comm_size MPI_Comm_split MPI_Comm_test_inter
    MPI_Intercomm_create MPI_Intercomm_merge

err
    MPI_Add_error_class MPI_Add_error_code MPI_Add_error_string MPI_Comm_call_errhandler
    MPI_Comm_create_errhandler MPI_Comm_get_errhandler MPI_Comm_set_errhandler
    MPI_Errhandler_free MPI_Error_class MPI_Error_string MPI_File_call_errhandler
    MPI_File_create_errhandler MPI_File_get_errhandler MPI_File_set_errhandler
    MPI_Win_call_errhandler MPI_Win_create_errhandler MPI_Win_get_errhandler
    MPI_Win_set_errhandler

group
    MPI_Group_compare MPI_Group_difference MPI_Group_excl MPI_Group_free MPI_Group_incl
    MPI_Group_intersection MPI_Group_range_excl MPI_Group_range_incl MPI_Group_rank
    MPI_Group_size MPI_Group_translate_ranks MPI_Group_union

init
    MPI_Init MPI_Init_thread MPI_Finalize

io
    MPI_File_close MPI_File_delete MPI_File_get_amode MPI_File_get_atomicity
    MPI_File_get_byte_offset MPI_File_get_group MPI_File_get_info MPI_File_get_position
    MPI_File_get_position_shared MPI_File_get_size MPI_File_get_type_extent
    MPI_File_get_view MPI_File_iread_at MPI_File_iread MPI_File_iread_shared
    MPI_File_iwrite_at MPI_File_iwrite MPI_File_iwrite_shared MPI_File_open
    MPI_File_preallocate MPI_File_read_all_begin MPI_File_read_all_end MPI_File_read_all
    MPI_File_read_at_all_begin MPI_File_read_at_all_end MPI_File_read_at_all
    MPI_File_read_at MPI_File_read MPI_File_read_ordered_begin MPI_File_read_ordered_end
    MPI_File_read_ordered MPI_File_read_shared MPI_File_seek MPI_File_seek_shared
    MPI_File_set_atomicity MPI_File_set_info MPI_File_set_size MPI_File_set_view
    MPI_File_sync MPI_File_write_all_begin MPI_File_write_all_end MPI_File_write_all
    MPI_File_write_at_all_begin MPI_File_write_at_all_end MPI_File_write_at_all
    MPI_File_write_at MPI_File_write MPI_File_write_ordered_begin
    MPI_File_write_ordered_end MPI_File_write_ordered MPI_File_write_shared
    MPI_Register_datarep

one2all
    MPI_Bcast MPI_Scatter MPI_Scatterv MPI_Ibcast MPI_Iscatter MPI_Iscatterv

recv
    MPI_Recv MPI_Irecv MPI_Recv_init MPI_Probe MPI_Iprobe

req
    MPI_Start MPI_Startall MPI_Wait MPI_Waitall MPI_Waitany MPI_Waitsome MPI_Test
    MPI_Testall MPI_Testany MPI_Testsome MPI_Cancel MPI_Grequest_start
    MPI_Grequest_complete MPI_Request_get_status MPI_Request_free

rma
    MPI_Accumulate MPI_Get MPI_Put MPI_Win_complete MPI_Win_create MPI_Win_fence
    MPI_Win_free MPI_Win_get_group MPI_Win_lock MPI_Win_post MPI_Win_start MPI_Win_test
    MPI_Win_unlock MPI_Win_wait MPI_Win_allocate MPI_Win_allocate_shared
    MPI_Win_create_dynamic MPI_Win_shared_query MPI_Win_attach MPI_Win_detach
    MPI_Win_set_info MPI_Win_get_info MPI_Get_accumulate MPI_Fetch_and_op
    MPI_Compare_and_swap MPI_Rput MPI_Rget MPI_Raccumulate MPI_Rget_accumulate
    MPI_Win_lock_all MPI_Win_unlock_all MPI_Win_flush MPI_Win_flush_all
    MPI_Win_flush_local MPI_Win_flush_local_all MPI_Win_sync

scan
    MPI_Exscan MPI_Scan MPI_Iexscan MPI_Iscan

send
    MPI_Send MPI_Bsend MPI_Rsend MPI_Ssend MPI_Isend MPI_Ibsend MPI_Irsend MPI_Issend
    MPI_Send_init MPI_Bsend_init MPI_Rsend_init MPI_Ssend_init

sendrecv
    MPI_Sendrecv MPI_Sendrecv_replace

serv
    MPI_Alloc_mem MPI_Free_mem MPI_Buffer_attach MPI_Buffer_detach MPI_Op_create
    MPI_Op_free

spawn
    MPI_Close_port MPI_Comm_accept MPI_Comm_connect MPI_Comm_disconnect
    MPI_Comm_get_parent MPI_Comm_join MPI_Comm_spawn MPI_Comm_spawn_multiple
    MPI_Lookup_name MPI_Open_port MPI_Publish_name MPI_Unpublish_name

status
    MPI_Get_count MPI_Status_set_elements MPI_Status_set_cancelled MPI_Test_cancelled

sync
    MPI_Barrier MPI_Ibarrier

time
    MPI_Wtick MPI_Wtime

topo
    MPI_Cart_coords MPI_Cart_create MPI_Cart_get MPI_Cart_map MPI_Cart_rank
    MPI_Cart_shift MPI_Cart_sub MPI_Cartdim_get MPI_Dims_create MPI_Graph_create
    MPI_Graph_get MPI_Graph_map MPI_Graph_neighbors MPI_Graphdims_get
    MPI_Graph_neighbors_count MPI_Topo_test

type
    MPI_Get_address MPI_Get_elements MPI_Pack MPI_Pack_external MPI_Pack_external_size
    MPI_Pack_size MPI_Type_commit MPI_Type_contiguous MPI_Type_create_darray
    MPI_Type_create_hindexed MPI_Type_create_hvector MPI_Type_create_indexed_block
    MPI_Type_create_resized MPI_Type_create_struct MPI_Type_create_subarray MPI_Type_dup
    MPI_Type_free MPI_Type_get_contents MPI_Type_get_envelope MPI_Type_get_extent
    MPI_Type_get_true_extent MPI_Type_indexed MPI_Type_size MPI_Type_vector
    MPI_Unpack_external MPI_Unpack
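For example, to gather IPM statistics only for the send and recv subsets listed above, a run might look as follows (the executable name is illustrative):
mpiexec -n 4 -env I_MPI_STATS ipm -env I_MPI_STATS_SCOPE "send;recv" ./a.out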
Use the I_MPI_STATS_ACCURACY environment variable to reduce the amount of statistics output.
I_MPI_STATS_ACCURACY=<percentage>
<percentage>      Float threshold value
Set this environment variable to collect data only for those MPI functions whose share of the total time spent inside all MPI calls exceeds the given percentage.
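For example, to limit the report to MPI functions that account for more than 10 percent of the total MPI time (the threshold value here is illustrative):
mpiexec -n 4 -env I_MPI_STATS ipm -env I_MPI_STATS_ACCURACY 10 ./a.out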
The following example shows a simple application and its statistics output in the IPM summary format:
#include "mpi.h"

int main (int argc, char *argv[])
{
    int i, rank, size, nsend, nrecv;
    MPI_Init (&argc, &argv);
    MPI_Comm_rank (MPI_COMM_WORLD, &rank);
    nsend = rank;
    MPI_Wtime();
    for (i = 0; i < 200; i++)
    {
        MPI_Barrier(MPI_COMM_WORLD);
    }
    /* open "reduce" region for all processes */
    MPI_Pcontrol(1, "reduce");
    for (i = 0; i < 1000; i++)
        MPI_Reduce(&nsend, &nrecv, 1, MPI_INT, MPI_MAX, 0, MPI_COMM_WORLD);
    /* close "reduce" region */
    MPI_Pcontrol(-1, "reduce");
    if (rank == 0)
    {
        /* "send" region for 0-th process only */
        MPI_Pcontrol(1, "send");
        MPI_Send(&nsend, 1, MPI_INT, 1, 1, MPI_COMM_WORLD);
        MPI_Pcontrol(-1, "send");
    }
    if (rank == 1)
    {
        MPI_Recv(&nrecv, 1, MPI_INT, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    /* reopen "reduce" region */
    MPI_Pcontrol(1, "reduce");
    for (i = 0; i < 1000; i++)
        MPI_Reduce(&nsend, &nrecv, 1, MPI_INT, MPI_MAX, 0, MPI_COMM_WORLD);
    MPI_Wtime();
    MPI_Finalize ();
    return 0;
}
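Assuming the code above is saved as stats_example.c (the file name is illustrative), it can be built with the Intel MPI Library compiler wrapper for C:
mpiicc stats_example.c -o a.out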
Command:
mpiexec -n 4 -env I_MPI_STATS ipm:terse ./a.out
Stats output:
################################################################################
#
# command   : ./a.out (completed)
# host      : svlmpihead01/x86_64_Linux        mpi_tasks : 4 on 1 nodes
# start     : 05/25/11/05:44:13                wallclock : 0.092012 sec
# stop      : 05/25/11/05:44:13                %comm     : 98.94
# gbytes    : 0.00000e+00 total                gflop/sec : NA
#
################################################################################
Command:
mpiexec -n 4 -env I_MPI_STATS ipm ./a.out
Stats output:
################################################################################
#
# command   : ./a.out (completed)
# host      : svlmpihead01/x86_64_Linux        mpi_tasks : 4 on 1 nodes
# start     : 05/25/11/05:44:13                wallclock : 0.092012 sec
# stop      : 05/25/11/05:44:13                %comm     : 98.94
# gbytes    : 0.00000e+00 total                gflop/sec : NA
#
################################################################################
# region : *   [ntasks] = 4
#
#                   [total]       <avg>         min           max
# entries           4             1             1             1
# wallclock         0.332877      0.0832192     0.0732641     0.0920119
# user              0.047992      0.011998      0.006999      0.019996
# system            0.013997      0.00349925    0.002999      0.004
# mpi               0.329348      0.082337      0.0723064     0.0912335
# %comm                           98.9398       98.6928       99.154
# gflop/sec         NA            NA            NA            NA
# gbytes            0             0             0             0
#
#
#                   [time]        [calls]       <%mpi>        <%wall>
# MPI_Init          0.236192      4             71.71         70.95
# MPI_Reduce        0.0608737     8000          18.48         18.29
# MPI_Barrier       0.027415      800           8.32          8.24
# MPI_Recv          0.00483489    1             1.47          1.45
# MPI_Send          1.50204e-05   1             0.00          0.00
# MPI_Wtime         1.21593e-05   8             0.00          0.00
# MPI_Finalize      3.33786e-06   4             0.00          0.00
# MPI_Comm_rank     1.90735e-06   4             0.00          0.00
# MPI_TOTAL         0.329348      8822          100.00        98.94
################################################################################
# region : reduce   [ntasks] = 4
#
#                   [total]       <avg>         min           max
# entries           8             2             2             2
# wallclock         0.0638561     0.015964      0.00714302    0.0238571
# user              0.034994      0.0087485     0.003999      0.015997
# system            0.003999      0.00099975    0             0.002999
# mpi               0.0608799     0.01522       0.00633883    0.0231845
# %comm                           95.3392       88.7417       97.1808
# gflop/sec         NA            NA            NA            NA
# gbytes            0             0             0             0
#
#
#                   [time]        [calls]       <%mpi>        <%wall>
# MPI_Reduce        0.0608737     8000          99.99         95.33
# MPI_Finalize      3.33786e-06   4             0.01          0.01
# MPI_Wtime         2.86102e-06   4             0.00          0.00
# MPI_TOTAL         0.0608799     8008          100.00        95.34
################################################################################
# region : send   [ntasks] = 4
#
#                   [total]       <avg>         min           max
# entries           1             0             0             1
# wallclock         2.89876e-05   7.24691e-06   1e-06         2.59876e-05
# user              0             0             0             0
# system            0             0             0             0
# mpi               1.50204e-05   3.75509e-06   0             1.50204e-05
# %comm                           51.8165       0             57.7982
# gflop/sec         NA            NA            NA            NA
# gbytes            0             0             0             0
#
#
#                   [time]        [calls]       <%mpi>        <%wall>
# MPI_Send          1.50204e-05   1             100.00        51.82
################################################################################
# region : ipm_noregion   [ntasks] = 4
#
#                   [total]       <avg>         min           max
# entries           13            3             3             4
# wallclock         0.26898       0.0672451     0.0661182     0.068152
# user              0.012998      0.0032495     0.001         0.004999
# system            0.009998      0.0024995     0             0.004
# mpi               0.268453      0.0671132     0.0659676     0.068049
# %comm                           99.8039       99.7721       99.8489
# gflop/sec         NA            NA            NA            NA
# gbytes            0             0             0             0
#
#
#                   [time]        [calls]       <%mpi>        <%wall>
# MPI_Init          0.236192      4             87.98         87.81
# MPI_Barrier       0.027415      800           10.21         10.19
# MPI_Recv          0.00483489    1             1.80          1.80
# MPI_Wtime         9.29832e-06   4             0.00          0.00
# MPI_Comm_rank     1.90735e-06   4             0.00          0.00
# MPI_TOTAL         0.268453      813           100.00        99.80
################################################################################