Memory leak with openib component when using MPI_AllToAllW #8290

Open
@patrick-legi

Background information

Using MPI_type_Create_SubArray together with MPI_AllToAllW shows a memory leak with the openib component.

What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)

The leak appears with Open MPI 3.1 and Open MPI 4.0.5.
Open MPI 1.10 works fine.

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

Open MPI was built from source with GCC 7.3 (itself built from source) on a CentOS 6 (Rocks Cluster) distribution.
Configure command line: '--prefix=/share/apps/GCC73/openmpi/31-patch' '--enable-mpirun-prefix-by-default' '--disable-dlopen' '--enable-mpi-cxx' '--without-slurm' '--enable-mpi-thread-multiple'

test_layout_array.tar.gz
alltoallw.diff.gz

If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.

Please describe the system on which you are running

  • Operating system/version:
    CentOS release 6.7 (Final) as an update of Rocks release 6.2 (SideWinder)

  • Computer hardware:
    DELL R720, DELL C6200

  • Network type:
    InfiniBand QDR (QLogic), 10 Gb Ethernet


Details of the problem

Using MPI_type_Create_SubArray and MPI_AllToAllW shows a memory leak (the memory used by the processes increases continuously) when using the openib component in Open MPI 3 and Open MPI 4. Disabling this component with "mpirun --mca pml ob1 --mca btl tcp,self.... " avoids the memory leak, but is not usable in production (10 Gb Ethernet interconnect instead of IB QDR).
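For anyone who wants to check whether their own run shows the same continuous growth, here is a minimal sketch of one way to sample a process's resident memory on Linux. It is not the method used to produce the attached plot, and the helper names (parse_vmrss_kb, read_rss_kb, rss_growth_kb) are hypothetical. It assumes a Linux /proc filesystem where /proc/<pid>/status contains a VmRSS line in kB.

```python
import re
from pathlib import Path


def parse_vmrss_kb(status_text):
    """Return the VmRSS value (in kB) found in /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid="self"):
    """Read the resident set size of a process. Linux-only: relies on /proc."""
    return parse_vmrss_kb(Path(f"/proc/{pid}/status").read_text())


def rss_growth_kb(samples):
    """Growth between the first and last sample in kB (0 with fewer than two samples)."""
    if len(samples) < 2:
        return 0
    return samples[-1] - samples[0]
```

Calling read_rss_kb(pid) at regular intervals for each MPI rank and passing the collected values to rss_growth_kb gives a rough indication: a value that keeps rising across many iterations of the exchange loop is consistent with a leak, while a value that levels off after start-up is not.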

Gilles Gouaillardet has provided a patch (workaround) to solve the problem and we are back in production.

Attached files:

  • Small piece of code to reproduce the problem (test_layout_array.tar.gz)
  • The patch of alltoallw provided by Gilles Gouaillardet (alltoallw.diff.gz)
  • The memory consumption (for the test case) before and after applying the patch to OpenMPI 3.1 (png file)
