Subject: [OMPI users] Valgrind writev() errors with 1.3.2.
From: tom fogal (tfogal_at_[hidden])
Date: 2009-06-08 15:09:10
Hi all,
I've configured a source build of OpenMPI 1.3.2 with valgrind support
enabled [1], and I'm seeing a lot of writev() errors when I run programs
under valgrind. For example, with the following `hello, world' program:
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
  MPI_Init(&argc, &argv);
  puts("Hello, world!");
  MPI_Finalize();
  return 0;
}
I see errors like the following:
==12342== Syscall param writev(vector[...]) points to uninitialised byte(s)
==12342== at 0x61DF733: writev (in /lib/libc-2.7.so)
==12342== by 0x7889AB9: mca_oob_tcp_msg_send_handler (oob_tcp_msg.c:265)
==12342== by 0x788B1A0: mca_oob_tcp_peer_send (oob_tcp_peer.c:197)
==12342== by 0x788FF2A: mca_oob_tcp_send_nb (oob_tcp_send.c:167)
==12342== by 0x767C7EC: orte_rml_oob_send (rml_oob_send.c:137)
==12342== by 0x767D19A: orte_rml_oob_send_buffer (rml_oob_send.c:269)
==12342== by 0x7C9F3DF: allgather (grpcomm_bad_module.c:369)
==12342== by 0x7C9FD9E: modex (grpcomm_bad_module.c:497)
==12342== by 0x4E6DCAF: ompi_mpi_init (ompi_mpi_init.c:626)
The full valgrind log is appended [2]. Of course, I could just suppress
this error, but I seem to get it for many (perhaps all) MPI calls that
communicate: broadcasts, sends, receives, allgathers, and so on.
I'm worried that a suppression would be too broad and would hide an
error I've caused myself.
Have others seen this? Could I safely suppress everything from
orte_rml_oob_send_buffer down?
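For concreteness, the truncated suppression I have in mind would look
roughly like the sketch below. It is the suppression valgrind generated
for me (included in [2]), cut off after orte_rml_oob_send_buffer. The
name on the first line is a placeholder I made up. My understanding,
which is an assumption I would like confirmed, is that a suppression is
matched against the innermost frames only, so this would apply no
matter which MPI call sits above orte_rml_oob_send_buffer.

```
{
   ompi-oob-writev-uninit-placeholder
   Memcheck:Param
   writev(vector[...])
   fun:writev
   fun:mca_oob_tcp_msg_send_handler
   fun:mca_oob_tcp_peer_send
   fun:mca_oob_tcp_send_nb
   fun:orte_rml_oob_send
   fun:orte_rml_oob_send_buffer
}
```

Such a file would be passed to valgrind with --suppressions=<file>. It
would not hide writev() reports that reach the kernel through a
different call path, which is the part I care about for my own code.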
-tom
[1] configured via: gnu_pkg \
--enable-debug \
--enable-memchecker \
--disable-mpi-f77 \
--enable-pretty-print-stacktrace \
--enable-cxx-exceptions \
--enable-mpi-threads \
--with-valgrind=${PREFIX} \
--without-gm \
--without-mx \
--without-openib \
--without-psm \
--with-pic \
--with-gnu-ld
where gnu_pkg is basically a function which calls configure with
--prefix=${PREFIX}.
[2]
==12342== Memcheck, a memory error detector.
==12342== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al.
==12342== Using LibVEX rev 1884, a library for dynamic binary translation.
==12342== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP.
==12342== Using valgrind-3.4.1, a dynamic binary instrumentation framework.
==12342== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al.
==12342== For more details, rerun with: -v
==12342==
==12342== My PID = 12342, parent PID = 12341. Prog and args are:
==12342== ./a.out
==12342==
==12342== Warning: client syscall munmap tried to modify addresses 0xffffffffffffffff-0xffe
==12342== Syscall param writev(vector[...]) points to uninitialised byte(s)
==12342== at 0x61DF733: writev (in /lib/libc-2.7.so)
==12342== by 0x7889AB9: mca_oob_tcp_msg_send_handler (oob_tcp_msg.c:265)
==12342== by 0x788B1A0: mca_oob_tcp_peer_send (oob_tcp_peer.c:197)
==12342== by 0x788FF2A: mca_oob_tcp_send_nb (oob_tcp_send.c:167)
==12342== by 0x767C7EC: orte_rml_oob_send (rml_oob_send.c:137)
==12342== by 0x767D19A: orte_rml_oob_send_buffer (rml_oob_send.c:269)
==12342== by 0x7C9F3DF: allgather (grpcomm_bad_module.c:369)
==12342== by 0x7C9FD9E: modex (grpcomm_bad_module.c:497)
==12342== by 0x4E6DCAF: ompi_mpi_init (ompi_mpi_init.c:626)
==12342== by 0x4EAAC88: PMPI_Init (pinit.c:80)
==12342== by 0x400857: main (hello.c:5)
==12342== Address 0x677697b is 107 bytes inside a block of size 256 alloc'd
==12342== at 0x4C22A51: realloc (vg_replace_malloc.c:429)
==12342== by 0x53DCBE0: opal_dss_buffer_extend (dss_internal_functions.c:63)
==12342== by 0x53DE4BA: opal_dss_copy_payload (dss_load_unload.c:164)
==12342== by 0x7C9F314: allgather (grpcomm_bad_module.c:363)
==12342== by 0x7C9FD9E: modex (grpcomm_bad_module.c:497)
==12342== by 0x4E6DCAF: ompi_mpi_init (ompi_mpi_init.c:626)
==12342== by 0x4EAAC88: PMPI_Init (pinit.c:80)
==12342== by 0x400857: main (hello.c:5)
==12342== Uninitialised value was created by a stack allocation
==12342== at 0x53FFA60: opal_ifinit (if.c:147)
{
<insert a suppression name here>
Memcheck:Param
writev(vector[...])
fun:writev
fun:mca_oob_tcp_msg_send_handler
fun:mca_oob_tcp_peer_send
fun:mca_oob_tcp_send_nb
fun:orte_rml_oob_send
fun:orte_rml_oob_send_buffer
fun:allgather
fun:modex
fun:ompi_mpi_init
fun:PMPI_Init
fun:main
}
==12342==
==12342== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 307 from 3)
==12342== malloc/free: in use at exit: 204,012 bytes in 2,022 blocks.
==12342== malloc/free: 10,382 allocs, 8,360 frees, 14,603,162 bytes allocated.
==12342== For a detailed leak analysis, rerun with: --leak-check=yes
==12342== For counts of detected errors, rerun with: -v
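A note on the "Uninitialised value was created by a stack allocation"
line in the log above. One common way to get this kind of report is to
fill in only part of a stack object and then send the whole object.
The C sketch below illustrates that pattern and the usual fix. It is an
illustration only: struct iface_info, write_record, send_partial and
send_zeroed are hypothetical names, and I have not checked that this is
what opal_ifinit actually does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical record; NOT an Open MPI type. */
struct iface_info {
    char name[16];
    int  index;
};

/* Hand every byte of the record to writev(), padding and all. */
ssize_t write_record(int fd, const struct iface_info *info) {
    struct iovec iov;
    assert(info != NULL);
    iov.iov_base = (void *)info;
    iov.iov_len  = sizeof *info;
    return writev(fd, &iov, 1);
}

/* Leaky variant: only the bytes up to the terminating NUL of name are
 * written, so the tail of name[] still holds old stack contents.
 * Memcheck is expected to flag the writev() for those bytes. */
ssize_t send_partial(int fd, const char *name, int index) {
    struct iface_info info;
    snprintf(info.name, sizeof info.name, "%s", name);
    info.index = index;
    return write_record(fd, &info);
}

/* Fixed variant: zero the whole record first, so every byte that
 * reaches writev() is defined. */
ssize_t send_zeroed(int fd, const char *name, int index) {
    struct iface_info info;
    memset(&info, 0, sizeof info);
    snprintf(info.name, sizeof info.name, "%s", name);
    info.index = index;
    return write_record(fd, &info);
}
```

Both variants send the same meaningful data. The difference is only in
the unused bytes, which is why such reports tend to be harmless but
noisy, and why zeroing the object at its point of creation is the
usual way to silence them at the source.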