This is also a solution to:

  • “mpicc” or “mpif90” command not found
  • Sample C program for Open MPI under Fedora
  • “libmpi.so.1: cannot open shared object file” problem, etc.

Open MPI is an open-source “Message Passing Interface” implementation for High Performance Computing (supercomputing), developed and maintained by a consortium of academic, research and industry partners.

After a successful Fedora 15 installation, I found

  • blacs-openmpi (BLACS – Basic Linear Algebra Communication Subprograms),
  • openmpi, and
  • scalapack-openmpi (SCALAPACK – Scalable LAPACK (Linear Algebra PACKage))

were already installed on my system.

The “openmpi” package provides only the execution tools such as mpiexec and mpirun, along with utilities like ompi-clean, ompi-iof, ompi-ps, etc.
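
To confirm what the package actually ships on your system, you can list its files with rpm (the grep pattern is just one convenient way to filter for the executables):

$ rpm -ql openmpi | grep /bin/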

To have the compiler wrappers (for C, C++ and Fortran, i.e. mpicc, mpic++, mpif77, mpif90, etc.) and their respective VampirTrace counterparts, you need to install “openmpi-devel” (no need for mpich/mpich2). So issue:

$ sudo yum install openmpi-devel

Fedora does not install these compiler executables under the default /usr/bin, /usr/sbin or /usr/local/bin, nor the libraries under the default /usr/lib or /usr/lib64; they go under /usr/lib64/openmpi/bin and /usr/lib64/openmpi/lib instead, and the $PATH variable is not updated to include these paths. This results in the “mpicc: Command not found” error when you try to issue such commands.
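
You can see the problem directly: the wrapper exists on disk but is not on the search path (your exact output from which may differ):

$ which mpicc
/usr/bin/which: no mpicc in (/usr/local/bin:/usr/bin:/bin)
$ ls /usr/lib64/openmpi/bin/mpicc
/usr/lib64/openmpi/bin/mpicc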

Sample Master/Slave C program:

// master_slave.c

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
        int num_procs, rank, namelength;
        char processor_name[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &num_procs);           // total number of processes
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);                // this process's rank: 0 .. num_procs-1
        MPI_Get_processor_name(processor_name, &namelength); // host this rank runs on

        if (rank == 0) {
                printf("[%02d/%02d %s]: I am the master of all known universe\n", rank+1, num_procs, processor_name);
                // Tell the slaves to bring some wine or a Cat o' nine tails.
        }
        else {
                printf("[%02d/%02d %s]: I am a humble poor slave\n", rank+1, num_procs, processor_name);
                // Wait for some orders or 1001 lashes...
        }

        MPI_Finalize();
        return 0;
}

Compile it with:

$ /usr/lib64/openmpi/bin/mpicc master_slave.c
or
$ /usr/lib64/openmpi/bin/mpicc master_slave.c -o master_slave

Run the executable with (mpirun's -np option sets how many processes to launch; four are used here to match the output below):

$ /usr/lib64/openmpi/bin/mpirun -np 4 a.out
or
$ /usr/lib64/openmpi/bin/mpirun -np 4 master_slave

mpicc is a wrapper around gcc (or whatever the underlying default compiler is) that adds the necessary compiler/linker flags. The Open MPI team “strongly” suggests using the wrappers instead of calling the underlying compilers directly. Without humoring their suggestion, let's go for direct gcc compilation.

$ gcc -I/usr/include/openmpi-x86_64 -pthread -m64 master_slave.c -o master_slave -L/usr/lib64/openmpi/lib -lmpi -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl

The above gcc command does both compilation and linking in one step. If you want to compile and link separately:

$ gcc -I/usr/include/openmpi-x86_64 -pthread -m64 -c master_slave.c
$ gcc master_slave.o -o master_slave -pthread -m64 -L/usr/lib64/openmpi/lib -lmpi -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl
$ /usr/lib64/openmpi/bin/mpirun master_slave

You will find the necessary compiler/linker flags by issuing:

$ /usr/lib64/openmpi/bin/mpicc --showme:compile
-I/usr/include/openmpi-x86_64 -pthread -m64

$ /usr/lib64/openmpi/bin/mpicc --showme:link
-pthread -m64 -L/usr/lib64/openmpi/lib -lmpi -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl

Under a simulated slow network environment with four systems connected, the output would be:

[01/04 Fedora15.MacBookPro]: I am the master of all known universe
[02/04 Majnun]: I am a humble poor slave
[03/04 Devdas]: I am a humble poor slave
[04/04 Aesop]: I am a humble poor slave
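
The comments in the sample only hint at real master/slave communication. As a minimal sketch of how the master could actually send its “orders” to the slaves, here is a variant using MPI_Send/MPI_Recv; the message text and the ORDER_TAG value are made up for illustration, and it compiles and runs exactly like master_slave.c:

// master_orders.c (illustrative sketch)

#include <stdio.h>
#include <string.h>
#include <mpi.h>

#define ORDER_TAG 1 /* arbitrary message tag, picked for this example */

int main(int argc, char *argv[]) {
        int num_procs, rank, dest;
        char order[64];

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &num_procs);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
                // Master: send one "order" to every slave rank.
                strcpy(order, "Bring some wine");
                for (dest = 1; dest < num_procs; dest++)
                        MPI_Send(order, (int)strlen(order) + 1, MPI_CHAR, dest, ORDER_TAG, MPI_COMM_WORLD);
        }
        else {
                // Slave: block until the master's order arrives.
                MPI_Recv(order, sizeof(order), MPI_CHAR, 0, ORDER_TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                printf("[%02d/%02d]: received order: %s\n", rank + 1, num_procs, order);
        }

        MPI_Finalize();
        return 0;
}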

Tips:

  • Add “export PATH=$PATH:/usr/lib64/openmpi/bin” without quotes to your .bash_profile so that you don't need to type the entire “/usr/lib64/openmpi/bin” path repeatedly. Or, as a temporary fix, issue:

$ export PATH=$PATH:/usr/lib64/openmpi/bin

  • 1) When you try to run ./a.out or ./master_slave directly, it will give the following error:

./master_slave: error while loading shared libraries: libmpi.so.1: cannot open shared object file: No such file or directory

  • 2) When you try to run mpirun a.out or mpirun master_slave, it will give the following error:

master_slave: error while loading shared libraries: libmpi.so.1: cannot open shared object file: No such file or directory
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process that caused that situation.
--------------------------------------------------------------------------

  • 1) and 2) happen because the dynamic loader cannot find libmpi.so: /usr/lib64/openmpi/lib is not in its library search path (LD_LIBRARY_PATH). To fix it, add “export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib64/openmpi/lib” without quotes to your .bash_profile. Or, as a temporary fix, issue:

$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib64/openmpi/lib

  • To make this change system-wide (i.e. effective for all users), as root add “/usr/lib64/openmpi/lib” without quotes to the /etc/ld.so.conf file and run ldconfig, or reboot the system (see the commands just after this list).
  • Now you can issue ./a.out or ./master_slave directly, or mpirun a.out or mpirun master_slave; all will give the same result. Still, stick to mpirun, which has more controls and options. See its man page for details.
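
For the system-wide fix mentioned above, one possible pair of commands as root is shown here; the drop-in file name openmpi-x86_64.conf is my own choice (Fedora's /etc/ld.so.conf includes every *.conf file under /etc/ld.so.conf.d/):

# echo "/usr/lib64/openmpi/lib" > /etc/ld.so.conf.d/openmpi-x86_64.conf
# ldconfig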

Thanks for the read and please leave comments 🙂