OU Supercomputing Center for Education & Research

Quick & Dirty Introduction to Compiling and Running on Sooner


For help, please contact us.



Compiling

I. Determine What Kind of Parallelism Your Program Uses, If Any

A. Determine Whether Your Program Uses MPI

If you aren't sure whether your program uses the Message Passing Interface (MPI), then it probably doesn't.

But, here's how you can find out for sure:

  1. Log in to sooner.oscer.ou.edu.
     
  2. At the Unix prompt, change directory (cd) to the subdirectory where your source code lives.
     
  3. At the Unix prompt, type this command:
     
    grep "mpi.h" *.[cChH]*
     
  4. At the Unix prompt, type this command:
     
    grep -i "mpif.h" *.[fF]*
     
  5. At the Unix prompt, type this command:
     
    grep -i "use mpi" *.[fF]*

If, in response to any of these grep commands, you get anything back other than the Unix prompt (and especially anything that has the word "include" or the word "use" in it), then very probably your code uses MPI.
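
For example, output like the following (the file name here is hypothetical) would indicate that your code uses MPI:

    mysolver.c:#include <mpi.h>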

If not, then very probably your code doesn't use MPI.

If you're still unsure, please contact us and we'll help.

B. Determine Whether Your Program Uses OpenMP

If you aren't sure whether your program uses OpenMP, then it probably doesn't.

But, here's how you can find out for sure:

  1. Log in to sooner.oscer.ou.edu.
     
  2. At the Unix prompt, change directory (cd) to the subdirectory where your source code lives.
     
  3. At the Unix prompt, type this command:
     
    grep "pragma" *.[cChH]* | grep "omp"
     
  4. At the Unix prompt, type this command:
     
    egrep -e '\!\$[oO][mM][pP]' *.[fF]*

If, in response to any of these grep or egrep commands, you get anything back other than the Unix prompt (and especially anything that has the word "include" or the word "parallel" in it), then very probably your code uses OpenMP.
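
For example, output like the following (the file name here is hypothetical) would indicate that your code uses OpenMP:

    mysolver.c:#pragma omp parallel for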

If not, then very probably your code doesn't use OpenMP.

If you're still unsure, please contact us and we'll help.

C. Determine Whether Your Program Uses POSIX Threads (pthreads)

If you aren't sure whether your program uses POSIX threads (pthreads), then it probably doesn't.

But, here's how you can find out for sure:

  1. Log in to sooner.oscer.ou.edu.
     
  2. At the Unix prompt, change directory (cd) to the subdirectory where your source code lives.
     
  3. At the Unix prompt, type this command:
     
    grep -i "pthread" *.[cChHfF]*

If, in response to this grep command, you get anything back other than the Unix prompt (and especially anything that has the word "pthread" in it), then very probably your code uses POSIX threads (pthreads).

Another possibility is that your code uses the Basic Linear Algebra Subprograms (BLAS). In that case, very probably the implementation of the BLAS that you use will have POSIX threads (pthreads) enabled.

Note that many numerical libraries (for example, LAPACK, among many others) are built on top of the BLAS, so if your code uses such a library, then it probably uses the BLAS or otherwise needs POSIX threads (pthreads).

If not, then very probably your code doesn't use POSIX threads (pthreads).

If you're still unsure, please contact us and we'll help.


II. Compilers and Options

A. Compilers Available on Sooner

On sooner.oscer.ou.edu, we support the following compilers:

  • Intel compiler family (all version 10.1)
    • icc for C
    • icpc for C++
    • ifort (formerly known as ifc) for Fortran 77/90/95
  • Portland Group compiler family (all version 7.0-7)
    • pgcc for C
    • pgCC for C++
    • pgf77 for Fortran 77
    • pgf90 for Fortran 90/95
    • pghpf for High Performance Fortran
  • GNU compiler family (all version 4.1.2)
    • gcc for C
    • g++ for C++
    • gfortran for Fortran 77/90/95
    • g77 for Fortran 77
       
      NOTE: In Red Hat Enterprise Linux 5 (and presumably in future releases), g77 has been supplanted by gfortran, but for completeness we've provided a soft link (like a shortcut in Microsoft Windows) named g77 that points to gfortran, so that makefiles and other build scripts that expect to use g77 will seem to find g77 (but will actually use gfortran).
  • Numerical Algorithms Group (NAG) compiler family
    • nagf95 for Fortran 77/90/95 (version 5.2)
       
      NOTE: The nagf95 compiler is actually a Fortran-to-C translator that then calls gcc (see above).
       
      NOTE: The nagf95 compiler hasn't yet been installed on Sooner (as of August 18, 2008).

B. Recommended Compiler Performance Optimization Options

Each family of compilers has its own unique compiler options; you can find complete lists both in the man pages (online manuals) and in the vendors' compiler manuals.

So, the following are merely the recommended options for reasonably vanilla source codes.
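
For example, the recommended Intel options, which are used in all of the compile examples below, are:

    -O -march=core2 -mtune=core2

These options tell the Intel compilers to optimize (-O), targeting the Core 2 CPU architecture of Sooner's compute nodes (-march=core2 -mtune=core2).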


III. How to Compile

A. How to Compile a Non-Parallel Program

If you've determined that your program uses no parallelism — that it's a "serial" (non-parallel) program — then here's how to compile.

  1. Choose the compiler that you want.
     
  2. Try the recommended compiler options, using the EXACT SAME COMPILER OPTIONS for EVERY SINGLE COMPILE COMMAND in your compilation procedure (including the link command).
     
    For example:
     
    ifort -O -march=core2 -mtune=core2 -c mysourcefile1.f90
    ifort -O -march=core2 -mtune=core2 -c mysourcefile2.f90
    ...
    ifort -O -march=core2 -mtune=core2 -c mysourcefileN.f90
    ifort -O -march=core2 -mtune=core2 -c mymainprogram.f90
    ifort -O -march=core2 -mtune=core2 -o myexecutable \
        mymainprogram.o mysourcefile1.o ... mysourcefileN.o
     
    NOTE: The backslash \ at the end of the first line of the final compile (link) command is the Unix/Linux continuation character: it means that the command continues onto the next line. You don't absolutely need it — you can just type the entire link command on a single line — but it makes the link command more readable, which is good.
     
  3. You're also welcome to try other compiler options, as described in the compiler manuals and man pages (online manuals).
     
  4. If you have several source files, we STRONGLY RECOMMEND creating a library archive file.
     
    For example:
     
    ifort -O -march=core2 -mtune=core2 -c mysourcefile1.f90
    ifort -O -march=core2 -mtune=core2 -c mysourcefile2.f90
    ...
    ifort -O -march=core2 -mtune=core2 -c mysourcefileN.f90
    ifort -O -march=core2 -mtune=core2 -c mymainprogram.f90
    ar ru libmysource.a mysourcefile1.o ... mysourcefileN.o
    ranlib libmysource.a
    ifort -O -march=core2 -mtune=core2 -o myexecutable \
        mymainprogram.o -L. -lmysource
     
    NOTE: The backslash \ at the end of the first line of the final compile (link) command is the Unix/Linux continuation character: it means that the command continues onto the next line. You don't absolutely need it — you can just type the entire link command on a single line — but it makes the link command more readable, which is good.
     
    NOTE: The ar command creates the library archive file libmysource.a, containing the binary versions of all of the routines in all of the object files (mysourcefileK.o for K from 1 to N).
     
    NOTE: The ranlib command makes the library archive file libmysource.a smarter about its own contents, so we STRONGLY RECOMMEND using it.
     
    NOTE: In the final compile (link) command, the -L. option means: "Look for library archive files named lib*.a or lib*.so* in the current working directory."
     
    NOTE: In the final compile (link) command, the -lmysource option means: "Link to the library archive file named libmysource.a (if linking statically) or libmysource.so* (if linking dynamically)."
     
    Note that the second character is a lower case L, NOT a one.
     
    For example, if you write C or C++ code, you may have used the following:
     
    -lm
     
    which means "link to the C/C++ standard math library," which includes functions such as sqrt.
     
  5. NOTE: When compiling (actually linking) with the Intel compiler family, a common warning is:
     
    /usr/intel10/lib/libimf.so: warning: warning: feupdateenv is not implemented and will always fail
     
    You may IGNORE this warning; to our knowledge, no one has ever been harmed by it.
     
    Rumor has it that adding the following compiler option to ALL compile/link commands will make this warning go away:
     
    -shared-intel
     
    You can instead use the following:
     
    -i-dynamic
     
    However, Intel now deprecates (severely frowns on) the second option, so we recommend the first.
     
  6. You may find it valuable to try compiling and running your code with each of the compiler families, and with various combinations of compiler options, to find the best combination for your code.
     
  7. Please bear in mind that some codes can only be compiled by a subset of the compilers that are available on Sooner.

B. How to Compile an OpenMP Program

OpenMP is supported by the Intel, Portland Group and GNU compiler families, using the associated compiler option (-openmp for Intel, -mp for Portland Group, -fopenmp for GNU) for EVERY SINGLE COMPILE COMMAND in your compilation procedure (including the link command).

Compiling an OpenMP program is otherwise exactly the same as compiling a non-parallel (serial) program, except that you add the appropriate OpenMP compiler option to EVERY compile/link command.
 
For example:
 
ifort -O -march=core2 -mtune=core2 -openmp -c mysourcefile1.f90
ifort -O -march=core2 -mtune=core2 -openmp -c mysourcefile2.f90
...
ifort -O -march=core2 -mtune=core2 -openmp -c mysourcefileN.f90
ifort -O -march=core2 -mtune=core2 -openmp -c mymainprogram.f90
ar ru libmysource.a mysourcefile1.o ... mysourcefileN.o
ranlib libmysource.a
ifort -O -march=core2 -mtune=core2 -openmp -o myexecutable \
    mymainprogram.o -L. -lmysource

C. How to Compile a POSIX Threads (pthreads) Program

POSIX threads (pthreads) are supported by all of the compilers installed on Sooner.

To compile with POSIX threads (pthreads), all you need to do is, in the final link step of compiling, append the following at the very end of the link command:

-lpthread

This means, "link to the POSIX threads (pthreads) library."


 
Note that the second character is a lower case L, NOT a one.
 
For example, if you write C or C++ code, you may have used the following:
 
-lm
 
which means "link to the C/C++ standard math library," which includes functions such as sqrt.
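
For example, a final link command might look like this (the compiler choice and file names here are just illustrative):

    icc -O -march=core2 -mtune=core2 -o myexecutable \
        mymainprogram.o -L. -lmysource -lpthread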

D. How to Compile an MPI Program

1. Before Compiling an MPI Program

Before compiling an MPI program, you need to set some environment variables:

  • MPI_COMPILER
     
    This environment variable indicates the family of compilers that you're using.
     
    NOTE: MPIENV, an equivalent environment variable, is deprecated (frowned upon for future use). Please transition your builds and batch scripts to MPI_COMPILER instead of MPIENV.
     
  • MPI_INTERCONNECT
     
    This environment variable indicates the communication hardware that you're using to send MPI messages between MPI processes.
     
    NOTE: MPIDEV, an equivalent environment variable, is deprecated (frowned upon for future use). Please transition your builds and batch scripts to MPI_INTERCONNECT instead of MPIDEV.
     
  • MPI_VENDOR
     
    This environment variable indicates a subcategory of communication that you're using; you need to set this environment variable only in a few specific circumstances.

A quick discussion of how to set environment variables is below.

MPI_COMPILER
 
This environment variable indicates the family of compilers that you're using.
 
OSCER supports multiple compiler families on Sooner (see above).
Possible values that you can set MPI_COMPILER to are:

  • intel or intel10
     
    Either of these values indicates that you want to compile using the Intel compiler family, version 10.1 (the default version).
     
  • intel9
     
    This value indicates that you want to compile using the Intel compiler family, version 9.1 (an older version).
     
    NOTE: We STRONGLY RECOMMEND AGAINST using older compiler versions such as intel9, which is included for backward compatibility only and which we'll shut down at the earliest opportunity (so it may not even exist by the time you read this).
     
  • pgi
     
    This value indicates that you want to compile using the Portland Group compiler family (version 7.1-7).
     
  • gnu or gcc
     
    Either of these values indicates that you want to compile using the GNU compiler family (version 4.1.2).
     
  • nag
     
    This value indicates that you want to compile using the NAG compiler (version 5.1) for Fortran 77/90/95 and the GNU compiler family (version 4.1.2) for C and C++.
     
    NOTE: The NAG compiler family includes only a Fortran 90/95 compiler, which can also be used to compile Fortran 77 code, but for C and C++ the GNU compilers are used. (The nagf95 compiler is a Fortran-to-C translator that then invokes gcc, so these compilers should be fully compatible.)

MPI_INTERCONNECT
 
This environment variable indicates the communication hardware that you're using to send MPI messages between MPI processes.
 
Possible values you can set MPI_INTERCONNECT to are:

  • ib
     
    This value means that MPI messages will be sent over the high performance Infiniband interconnect, using the native Infiniband software drivers, known as MVAPICH. This is the fastest way to communicate among multiple nodes.
     
    This is the default, and in general you should use this unless you have a VERY GOOD REASON to do otherwise (for example, if Infiniband is incompatible with the compiler you're using, as shown in the compatibility matrix below).
     
  • gige
     
    This value means that MPI messages will be sent over Gigabit Ethernet, using the slower TCP/IP software driver that is used for standard Ethernet (e.g., for sending data on the Internet). This is a substantially slower way to communicate among multiple nodes, and should be used only if you have a very good reason (such as needing the PGI or NAG compilers).
     
  • shmem
     
    This value means that MPI messages will be sent entirely inside RAM.
     
    This value should only be used if your MPI executable is guaranteed to be run inside a single individual node (i.e., on at most 8 MPI processes).

MPI_VENDOR
 
This environment variable indicates, more specifically, how MPI messages are to be sent between MPI processes via Infiniband. The only possible value you can set MPI_VENDOR to is:

  • openmpi
     
    This value means that MPI messages will be sent using OpenMPI, rather than MVAPICH.

If the MPI_VENDOR environment variable is not defined, but MPI_INTERCONNECT is set to ib, then MVAPICH will be used as the Infiniband communication mechanism.
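
For example, to use OpenMPI over Infiniband with the Intel compilers, if your shell is bash (see "How to Set Environment Variables" below for other shells), you would type:

export MPI_COMPILER=intel
export MPI_INTERCONNECT=ib
export MPI_VENDOR=openmpi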
 

2. Compiler/Interconnect Compatibility Matrix
 

In the table below, "X" indicates that the compiler/interconnect combination is supported, and "NOT YET" indicates that the combination isn't yet available.

The interconnect columns are:

  • ib         : Infiniband hardware via MVAPICH software
  • ib/openmpi : Infiniband hardware via OpenMPI software
  • gige       : Gigabit Ethernet hardware via TCP/IP software
  • shmem      : RAM inside a single node

COMPILER                        ib    ib/openmpi   gige      shmem
gcc                             X     X            X         X
g++                             X     X            X         X
g77 (soft linked to gfortran)   X     X            X         X
gfortran                        X     X            X         X
icc                             X     X            X         X
icpc                            X     X            X         X
ifort                           X     X            X         X
pgcc                            X     X            X         X
pgCC                            X     X            X         X
pgf77                           X     X            X         X
pgf90                           X     X            X         X
pghpf
nagf95                                             NOT YET   NOT YET

E. How to Compile a CUDA Program

1. Before Compiling a CUDA Program

Before compiling a CUDA program, you need to set some environment variables:

  • CUDA_PATH
     
    This environment variable indicates the location of the version of CUDA you're using.
     
  • PATH
     
    This environment variable indicates the location(s) of the executables that you're using.
     
  • LD_LIBRARY_PATH
     
    This environment variable indicates the location(s) of the libraries that you're using.

A quick discussion of how to set these environment variables is below.

CUDA_PATH
 
This environment variable indicates the location of the version of CUDA you're using.
 
OSCER supports multiple CUDA versions on Sooner.
Possible values that you can set CUDA_PATH to are:

  • /home/software/CUDA/2.3/install/toolkit/cuda
     
  • /home/software/CUDA/3.0/install/cuda
     
  • /home/software/CUDA/3.2/install/cuda
     
  • /home/software/CUDA/4.0/cuda
     
    NOTE: As newer versions of CUDA become available, older versions will be deprecated and eventually removed from the system.

PATH
 
This environment variable indicates the location(s) of the executables that you're using.
 
Simply edit the existing PATH for CUDA compilation by setting it to:

  • ${CUDA_PATH}/bin:${CUDA_PATH}/open64/bin:${PATH}
     
    NOTE: The ${CUDA_PATH} notation means: substitute the value of the CUDA_PATH environment variable at that point.

LD_LIBRARY_PATH
 
This environment variable indicates the location(s) of the libraries that you're using. The only value you need to set LD_LIBRARY_PATH to is:

  • ${CUDA_PATH}/lib:${CUDA_PATH}/open64/lib:${CUDA_PATH}/lib64:${LD_LIBRARY_PATH}
     
    This value means that the Nvidia CUDA Compiler will look for the libraries needed for successful compilation in the indicated locations.
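
Putting these together: for example, if your shell is bash (see "How to Set Environment Variables" below for other shells), you might set up for CUDA 4.0 and then compile like this (the source file name is just illustrative):

export CUDA_PATH=/home/software/CUDA/4.0/cuda
export PATH=${CUDA_PATH}/bin:${CUDA_PATH}/open64/bin:${PATH}
export LD_LIBRARY_PATH=${CUDA_PATH}/lib:${CUDA_PATH}/open64/lib:${CUDA_PATH}/lib64:${LD_LIBRARY_PATH}
nvcc -o myexecutable mycudaprogram.cu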

3. How to Set Environment Variables
 

How to set these environment variables depends on which Unix shell you are using. If you're not sure which Unix shell you're using (or if you're not sure what a "Unix shell" is), then type this at the Unix prompt and then press the "Enter" key:

ps

The response will look something like this:

  PID TTY          TIME CMD
28826 pts/19   00:00:00 tcsh
30567 pts/19   00:00:00 ps

If your shell is tcsh, then to set the environment variables, type this (perhaps with different values for the environment variables) at the Unix prompt:

setenv MPI_COMPILER intel
setenv MPI_INTERCONNECT ib

If your shell is bash or ksh or zsh, then to set the environment variables, type this at the Unix prompt:

export MPI_COMPILER=intel
export MPI_INTERCONNECT=ib

If your shell is sh, then to set the environment variables, type this at the Unix prompt:

MPI_COMPILER=intel
export MPI_COMPILER
MPI_INTERCONNECT=ib
export MPI_INTERCONNECT

If your shell is something else, then please contact us and we'll try to help.

4. How to Compile an MPI Program After You've Set the Environment Variables

After you've set the environment variables, compile using one of the following compile commands:

  • mpicc for C code
     
    NOTE: Use this INSTEAD OF icc, gcc or pgcc.
     
  • mpiCC for C++ code
     
    NOTE: Use this INSTEAD OF icpc, g++ or pgCC.
     
  • mpif77 for Fortran 77 code
     
    NOTE: Use this INSTEAD OF ifort, g77, gfortran, nagf95, or pgf77.
     
  • mpif90 for Fortran 90 code
     
    NOTE: Use this INSTEAD OF ifort, gfortran, nagf95, or pgf90.

In all other details, the usage is exactly the same as when compiling a non-parallel code. In particular, you will use the same compiler options as you would have used for the compiler family that you chose when setting MPI_COMPILER.

For example, to compile an MPI code written in Fortran 90 using the Intel compiler family, the compile command might start with:

mpif90 -O -march=core2 -mtune=core2 ...

WARNING! Be sure that EVERY SINGLE COMPILE COMMAND in your compilation procedure, including the final compile command for linking the executable, uses the EXACT SAME COMPILER OPTIONS!

Running

I. DON'T RUN INTERACTIVELY!

Except for very small, very brief non-MPI runs on a few CPU cores within a single node, ALL runs should be performed using the LSF batch queue system.

II. Run Using the LSF Batch System

To start, you'll want to make a copy of one of the following batch script files:

  • ~hneeman/example_nonparallel.bsub
     
    Use this batch script file to run any batch job that uses EXACTLY ONE CPU CORE inside a single compute node (that is, a non-parallel job).
     
  • ~hneeman/example_parallel_sharedmem.bsub
     
    Use this batch script file to run any batch job that uses shared memory parallelism via either OpenMP or POSIX threads (pthreads) on up to 8 CPU cores inside a single compute node.
     
  • ~hneeman/example_parallel_mpi.bsub
     
    Use this batch script file to run any batch job that uses purely distributed parallelism via MPI on any number of nodes and any number of CPU cores per node (up to 8).
     
  • ~hneeman/example_parallel_hybrid.bsub
     
    Use this batch script file to run any batch job that uses a hybrid of two kinds of parallelism: (a) distributed parallelism via MPI on any number of nodes and any number of CPU cores per node (up to 8, though we recommend an upper limit of 4), and (b) shared memory parallelism via either OpenMP or POSIX threads (pthreads). We recommend that the total number of threads per node be at most 8, because you almost certainly don't want more threads than cores per node.
     
  • ~hneeman/example_fat.bsub
     
    Use this batch script file to run any batch job that uses 1 to 16 CPU cores inside a single fat node. The batch job can be either (a) non-parallel (serial) or (b) shared memory parallel via either OpenMP or POSIX threads (pthreads).
     
  • ~hneeman/example_nonparallel_cuda.bsub
     
    Use this batch script file to run any non-parallel CUDA batch job that uses 1 CPU core inside a single node but can use up to two CUDA devices.
     
  • ~hneeman/example_parallel_mpi_cuda.bsub
     
    Use this batch script file to run any parallel CUDA batch job that uses multiple CPU cores across multiple nodes AND uses multiple CUDA devices.
     

Once you've identified which batch script file to use, do this:

cp  ~hneeman/example_nonparallel.bsub  whatever.bsub

where whatever.bsub is the name of the batch script file that you want to create; typically, you'll replace whatever with the name of your executable or the experiment or something.

Note that you could use any of the other example_*.bsub files instead of example_nonparallel.bsub.

You should be able to modify your copy of the batch script file to suit your needs. It contains detailed information about how to set up and run a batch job.

IMPORTANT NOTE!!!
In your batch script, you MUST use the absolute FULL PATH for your executable! DON'T use a relative path, and DON'T leave out the path!
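
For reference, here is a minimal sketch of what a non-parallel batch script might contain. The queue name and other values here are hypothetical; the example_*.bsub files above are the authoritative templates, so copy and modify one of those rather than starting from scratch.

#!/bin/bash
#
#BSUB -q normal                 # queue name (hypothetical)
#BSUB -n 1                      # number of CPU cores
#BSUB -J myjob                  # job name
#BSUB -o myjob_%J_stdout.txt    # standard output file (%J becomes the job ID)
#BSUB -e myjob_%J_stderr.txt    # standard error file
#
# Use the absolute FULL PATH to your executable!
/home/yourusername/yourdirectory/myexecutable

To submit the batch job, type this at the Unix prompt:

bsub < whatever.bsub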

III. Running a MATLAB Application

Please see the separate OSCER webpage on running MATLAB applications.

IV. Debugging Using the TotalView Debugger

Please see the separate OSCER webpage on this topic.

V. Running X Windows applications

Please see the separate OSCER webpage on this topic.


For help, please contact us.



Copyright (C) 2002-2013 University of Oklahoma