OSCER: Sooner's Quick 'n' Dirty Guide to Compiling and Running, OU Supercomputing Center for Education & Research, University of Oklahoma, Norman

OU Supercomputing Center for Education & Research


Quick & Dirty Introduction to Compiling and Running on Sooner

For help, please contact us.

Table of Contents

Compiling

Determine What Kind of Parallelism Your Program Uses, If Any

Determine Whether Your Program Uses MPI

Determine Whether Your Program Uses OpenMP

Determine Whether Your Program Uses POSIX Threads (pthreads)

Compilers and Options

Compilers Available on Sooner

Recommended Compiler Performance Optimization Options

How to Compile

How to Compile a Non-Parallel Program

How to Compile an OpenMP Program

How to Compile a POSIX Threads (pthreads) Program

How to Compile an MPI Program

Before Compiling an MPI Program

Compiler/Interconnect Compatibility Matrix

How to Set Environment Variables

Compiling an MPI Program

Running

DON'T RUN INTERACTIVELY!

Run Using the LSF Batch System

Running a MATLAB Application

Debugging using Totalview debugger

Running X Windows applications

Compiling

I. Determine What Kind of Parallelism Your Program Uses, If Any

A. Determine Whether Your Program Uses MPI

If you aren't sure whether your program uses the Message Passing Interface (MPI), then it probably doesn't.

But, here's how you can find out for sure:

Log in to sooner.oscer.ou.edu.
At the Unix prompt, change directory (cd) to the subdirectory where your source code lives.
At the Unix prompt, type this command:

grep "mpi.h" *.[cChH]*
At the Unix prompt, type this command:

grep -i "mpif.h" *.[fF]*
At the Unix prompt, type this command:

grep -i "use mpi" *.[fF]*

If, in response to any of these grep commands, you get anything back other than the Unix prompt (and especially anything that has the word "include" or the word "use" in it), then very probably your code uses MPI.

If not, then very probably your code doesn't use MPI.

If you're still unsure, please contact us and we'll help.

B. Determine Whether Your Program Uses OpenMP

If you aren't sure whether your program uses OpenMP, then it probably doesn't.

But, here's how you can find out for sure:

Log in to sooner.oscer.ou.edu.
At the Unix prompt, change directory (cd) to the subdirectory where your source code lives.
At the Unix prompt, type this command:

grep "pragma" *.[cChH]* | grep "omp"
At the Unix prompt, type this command:

egrep -e '\!\$[oO][mM][pP]' *.[fF]*

If, in response to any of these grep or egrep commands, you get anything back other than the Unix prompt (and especially anything that has the word "pragma" or the word "parallel" in it), then very probably your code uses OpenMP.

If not, then very probably your code doesn't use OpenMP.

If you're still unsure, please contact us and we'll help.

C. Determine Whether Your Program Uses POSIX Threads (pthreads)

If you aren't sure whether your program uses POSIX threads (pthreads), then it probably doesn't.

But, here's how you can find out for sure:

Log in to sooner.oscer.ou.edu.
At the Unix prompt, change directory (cd) to the subdirectory where your source code lives.
At the Unix prompt, type this command:

grep -i "pthread" *.[cChHfF]*

If, in response to this grep command, you get anything back other than the Unix prompt (and especially anything that has the word "pthread" in it), then very probably your code uses POSIX threads (pthreads).

Another possibility is that your code uses the Basic Linear Algebra Subprograms (BLAS). In that case, very probably the implementation of the BLAS that you use will have POSIX threads (pthreads) enabled.

Note that if your code uses any of the following (and this list isn't exhaustive), then it probably uses the BLAS or otherwise needs POSIX threads (pthreads):

If not, then very probably your code doesn't use POSIX threads (pthreads).

If you're still unsure, please contact us and we'll help.
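The three checks above can be bundled into one helper. What follows is a minimal sketch, assuming a POSIX shell; the function name detect_parallelism is made up for illustration and is not an OSCER tool. Like the individual grep commands, it only reports textual matches, so a match inside a comment can fool it.

```shell
# Hypothetical helper (not an OSCER tool): report which parallel APIs
# the source files in a directory appear to use, based on the greps above.
detect_parallelism() {
    dir="${1:-.}"
    found=""
    # MPI: C/C++ header, or Fortran include file / module
    if grep -s -q "mpi.h" "$dir"/*.[cChH]* ||
       grep -s -q -i -e "mpif.h" -e "use mpi" "$dir"/*.[fF]*; then
        found="$found MPI"
    fi
    # OpenMP: C/C++ pragmas, or Fortran !$OMP directives
    if grep -s "pragma" "$dir"/*.[cChH]* | grep -q "omp" ||
       grep -s -q -E '!\$[oO][mM][pP]' "$dir"/*.[fF]*; then
        found="$found OpenMP"
    fi
    # POSIX threads: any mention of pthread
    if grep -s -q -i "pthread" "$dir"/*.[cChHfF]*; then
        found="$found pthreads"
    fi
    # Print the list without its leading space, or "none" if it is empty
    echo "${found:- none}" | sed 's/^ //'
}

# Check the current working directory
detect_parallelism .
```

The -s option keeps grep quiet when a directory has no files of a given kind, so a pure Fortran code doesn't produce "No such file" noise from the C/C++ checks.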

II. Compilers and Options

A. Compilers Available on Sooner

On sooner.oscer.ou.edu, we support the following compilers:

Intel compiler family (all version 10.1)
- icc for C
- icpc for C++
- ifort (formerly known as ifc) for Fortran 77/90/95
Portland Group compiler family (all version 7.0-7)
- pgcc for C
- pgCC for C++
- pgf77 for Fortran 77
- pgf90 for Fortran 90/95
- pghpf for High Performance Fortran
GNU compiler family (all version 4.1.2)
- gcc for C
- g++ for C++
- gfortran for Fortran 77/90/95
- g77 for Fortran 77
  
  NOTE: In Red Hat Enterprise Linux 5 (and presumably in future releases), g77 has been supplanted by gfortran, but for completeness we've provided a soft link (like a shortcut in Microsoft Windows) named g77 that points to gfortran, so that makefiles and other build scripts that expect to use g77 will seem to find g77 (but will actually use gfortran).
Numerical Algorithms Group (NAG) compiler family
- nagf95 for Fortran 77/90/95 (version 5.2)
  
  NOTE: The nagf95 compiler is actually a Fortran-to-C translator that then calls gcc (see above).
  
  NOTE: The nagf95 compiler hasn't yet been installed on Sooner (as of August 18 2008).

B. Recommended Compiler Performance Optimization Options

Each family of compilers has its own unique compiler options; you can find complete lists both in their man pages (online manuals) and in their manuals.

So, the following are merely the recommended options for reasonably vanilla source codes.

Intel compiler family (all version 10.1)
-O -march=core2 -mtune=core2
Portland Group compiler family (all version 7.0-7)
-fastsse -tp core2-64
GNU compiler family (all version 4.1.2)
-O
Numerical Algorithms Group (NAG) Fortran 90/95 compiler (version 5.1)
-O
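
One way to keep these options handy is to look them up by compiler family and store them in a single variable. This is only a sketch: the function name recommended_flags and the family labels (intel, pgi, gnu, nag) are assumptions made for illustration, and the example command is printed rather than run.

```shell
# Hypothetical helper (not an OSCER tool): print the recommended
# optimization options listed above for a given compiler family.
recommended_flags() {
    case "$1" in
        intel)   echo "-O -march=core2 -mtune=core2" ;;
        pgi)     echo "-fastsse -tp core2-64" ;;
        gnu|nag) echo "-O" ;;
        *)       echo "unknown compiler family: $1" >&2; return 1 ;;
    esac
}

# Store the options once so that every compile and link command can reuse them.
FFLAGS=$(recommended_flags intel)
echo "ifort $FFLAGS -c mysourcefile1.f90"
```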

III. How to Compile

A. How to Compile a Non-Parallel Program

If you've determined that your program uses no parallelism — that it's a "serial" (non-parallel) program — then here's how to compile.

Choose the compiler that you want.
Try the recommended compiler options, using the EXACT SAME COMPILER OPTIONS for EVERY SINGLE COMPILE COMMAND in your compilation procedure (including the link command).

For example:

ifort -O -march=core2 -mtune=core2 -c mysourcefile1.f90
ifort -O -march=core2 -mtune=core2 -c mysourcefile2.f90
...
ifort -O -march=core2 -mtune=core2 -c mysourcefileN.f90
ifort -O -march=core2 -mtune=core2 -c mymainprogram.f90
ifort -O -march=core2 -mtune=core2 -o myexecutable \
mymainprogram.o mysourcefile1.o ... mysourcefileN.o

NOTE: The backslash \ at the end of the first line of the final compile (link) command is the Unix/Linux continuation character: it means that the command continues onto the next line. You don't absolutely need it — you can just type the entire link command on a single line — but it makes the link command more readable, which is good.
You're also welcome to try other compiler options, as described in the compiler manuals and man pages (online manuals).
If you have several source files, we STRONGLY RECOMMEND creating a library archive file.

For example:

ifort -O -march=core2 -mtune=core2 -c mysourcefile1.f90
ifort -O -march=core2 -mtune=core2 -c mysourcefile2.f90
...
ifort -O -march=core2 -mtune=core2 -c mysourcefileN.f90
ifort -O -march=core2 -mtune=core2 -c mymainprogram.f90
ar ru libmysource.a mysourcefile1.o ... mysourcefileN.o
ranlib libmysource.a
ifort -O -march=core2 -mtune=core2 -o myexecutable \
mymainprogram.o -L. -lmysource

NOTE: The backslash \ at the end of the first line of the final compile (link) command is the Unix/Linux continuation character: it means that the command continues onto the next line. You don't absolutely need it — you can just type the entire link command on a single line — but it makes the link command more readable, which is good.

NOTE: The ar command creates the library archive file libmysource.a, containing the binary versions of all of the routines in all of the object files (mysourcefileK.o for K from 1 to N).

NOTE: The ranlib command makes the library archive file libmysource.a smarter about its own contents, so we STRONGLY RECOMMEND using it.

NOTE: In the final compile (link) command, the -L. option means: "Look for library archive files named lib*.a or lib*.so* in the current working directory."

NOTE: In the final compile (link) command, the -lmysource option means: "Link to the library archive file named libmysource.a (if linking statically) or libmysource.so* (if linking dynamically)."

Note that the second character is a lower case L, NOT a one.

For example, if you write C or C++ code, you may have used the following:

-lm

which means "link to the C/C++ standard math library," which includes functions such as sqrt.
NOTE: When compiling (actually linking) with the Intel compiler family, a common warning is:
```
/usr/intel10/lib/libimf.so: warning: warning: feupdateenv is not implemented and will always fail
```
You may IGNORE this warning; to our knowledge, no one has ever been harmed by it.

Rumor has it that adding the following compiler option to ALL compile/link commands will make this warning go away:

-shared-intel

You can instead use the following:

-i-dynamic

However, Intel now deprecates (severely frowns on) the second option, so we recommend the first.
You may find it valuable to try compiling and running your code with each of the compiler families, and with various combinations of compiler options, to find the best combination for your code.
Please bear in mind that some codes can only be compiled by a subset of the compilers that are available on Sooner.
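
The compile, archive and link steps described in this section can be collected into a small build script that defines the compiler and its options exactly once, which makes it hard to accidentally use different options on different commands. The sketch below is an illustration, not an OSCER-provided script: the names run, build and DRY_RUN are hypothetical, and by default it only prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of a build script that defines the compiler and options ONCE.
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0
# on a system where the compiler is installed to actually run them.
FC="ifort"
FFLAGS="-O -march=core2 -mtune=core2"
DRY_RUN="${DRY_RUN:-1}"

# Print a command, and execute it too unless this is a dry run.
run() {
    echo "$*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# Usage: build MAIN_SOURCE LIBRARY_SOURCE...
build() {
    main="$1"; shift
    objs=""
    for src in "$@"; do
        run $FC $FFLAGS -c "$src" || return 1
        objs="$objs ${src%.*}.o"      # mysourcefile1.f90 -> mysourcefile1.o
    done
    run $FC $FFLAGS -c "$main" || return 1
    run ar ru libmysource.a $objs || return 1
    run ranlib libmysource.a || return 1
    run $FC $FFLAGS -o myexecutable "${main%.*}.o" -L. -lmysource
}

build mymainprogram.f90 mysourcefile1.f90 mysourcefile2.f90
```

To try a different compiler family, only the FC and FFLAGS lines need to change.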

B. How to Compile an OpenMP Program

OpenMP is supported by the following compilers, using the associated compiler option for EVERY SINGLE COMPILE COMMAND in your compilation procedure (including the link command):

GNU compiler family (gfortran only):
-fopenmp
Intel compiler family (icc, icpc, ifort):
-openmp
Numerical Algorithms Group (NAG) Fortran 90/95 compiler (version 5.1)
NOT SUPPORTED AT ALL
Portland Group compiler family (pgcc, pgCC, pgf77, pgf90, pghpf):
-mp

Compiling an OpenMP program is exactly the same as compiling a non-parallel (serial) program, except that you add the appropriate OpenMP compiler option to EVERY compile/link command.

For example:

ifort -O -march=core2 -mtune=core2 -openmp -c mysourcefile1.f90
ifort -O -march=core2 -mtune=core2 -openmp -c mysourcefile2.f90
...
ifort -O -march=core2 -mtune=core2 -openmp -c mysourcefileN.f90
ifort -O -march=core2 -mtune=core2 -openmp -c mymainprogram.f90
ar ru libmysource.a mysourcefile1.o ... mysourcefileN.o
ranlib libmysource.a
ifort -O -march=core2 -mtune=core2 -openmp -o myexecutable \
mymainprogram.o -L. -lmysource
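
Because the OpenMP option differs from one compiler family to the next, it can help to look it up from the compiler name and fold it into the shared options variable. This is a sketch under the assumption that the list of options above is current; the function name openmp_flag is hypothetical, and the compile command is printed rather than run.

```shell
# Hypothetical helper (not an OSCER tool): print the OpenMP option
# for a compiler command, following the list above.
openmp_flag() {
    case "$1" in
        gfortran)                    echo "-fopenmp" ;;
        icc|icpc|ifort)              echo "-openmp" ;;
        pgcc|pgCC|pgf77|pgf90|pghpf) echo "-mp" ;;
        *) echo "no OpenMP option listed for: $1" >&2; return 1 ;;
    esac
}

# Add the OpenMP option to the options used for EVERY compile/link command.
FC="ifort"
FFLAGS="-O -march=core2 -mtune=core2 $(openmp_flag "$FC")"
echo "$FC $FFLAGS -c mysourcefile1.f90"
```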

C. How to Compile a POSIX Threads (pthreads) Program

POSIX threads (pthreads) are supported by all of the compilers installed on Sooner.

To compile with POSIX threads (pthreads), all you need to do is, in the final link step of compiling, append the following at the very end of the link command:

-lpthread

This means, "link to the POSIX threads (pthreads) library."

Note that the second character is a lower case L, NOT a one.

For example, if you write C or C++ code, you may have used the following:

-lm

which means "link to the C/C++ standard math library," which includes functions such as sqrt.

D. How to Compile an MPI Program

1. Before Compiling an MPI Program

Before compiling an MPI program, you need to set some environment variables:

MPI_COMPILER

This environment variable indicates the family of compilers that you're using.

NOTE: MPIENV, an equivalent environment variable, is deprecated (frowned upon for future use). Please transition your builds and batch scripts to MPI_COMPILER instead of MPIENV.
MPI_INTERCONNECT

This environment variable indicates the communication hardware that you're using to send MPI messages between MPI processes.

NOTE: MPIDEV, an equivalent environment variable, is deprecated (frowned upon for future use). Please transition your builds and batch scripts to MPI_INTERCONNECT instead of MPIDEV.
MPI_VENDOR

This environment variable indicates a subcategory of communication that you're using; you only need to set this environment variable under a few certain specific circumstances.

A quick discussion of how to set environment variables is below.

MPI_COMPILER

This environment variable indicates the family of compilers that you're using.

OSCER supports multiple compiler families on Sooner (see above).
Possible values that you can set MPI_COMPILER to are:

intel or intel10

Either of these values indicates that you want to compile using the Intel compiler family, version 10.1 (the default version).
intel9

This value indicates that you want to compile using the Intel compiler family, version 9.1 (an older version).

NOTE: We STRONGLY RECOMMEND AGAINST using older compiler versions such as intel9, which is included for backward compatibility only and which we'll shut down at the earliest opportunity (so it may not even exist by the time you read this).
pgi

This value indicates that you want to compile using the Portland Group compiler family (version 7.0-7).
gnu or gcc

Either of these values indicates that you want to compile using the GNU compiler family (version 4.1.2).
nag
This value indicates that you want to compile using the NAG compiler (version 5.1) for Fortran 77/90/95 and the GNU compiler family (version 4.1.2) for C and C++.

NOTE: The NAG compiler family includes only a Fortran 90/95 compiler, which can also be used to compile Fortran 77 code, but for C and C++ the GNU compilers are used. (The nagf95 compiler is a Fortran-to-C translator that then invokes gcc, so these compilers should be fully compatible.)

MPI_INTERCONNECT

This environment variable indicates the communication hardware that you're using to send MPI messages between MPI processes.

Possible values you can set MPI_INTERCONNECT to are:

ib

This value means that MPI messages will be sent over the high performance Infiniband interconnect, using the native Infiniband software drivers, known as MVAPICH. This is the fastest way to communicate among multiple nodes.

This is the default, and in general you should use this unless you have a VERY GOOD REASON to do otherwise (for example, if Infiniband is incompatible with the compiler you're using, as shown in the compatibility matrix below).
gige

This value means that MPI messages will be sent over Gigabit Ethernet, using the slower TCP/IP software driver that is used for standard Ethernet (e.g., for sending data on the Internet). This is a substantially slower way to communicate among multiple nodes, and should be used only if you have a very good reason (such as needing the PGI or NAG compilers).
shmem

This value means that MPI messages will be sent entirely inside RAM.

This value should only be used if your MPI executable is guaranteed to be run inside a single individual node (i.e., on at most 8 MPI processes).

MPI_VENDOR

This environment variable gives more specifics about how to send MPI messages between MPI processes via Infiniband. The only possible value you can set MPI_VENDOR to is:

openmpi

This value means that MPI messages will be sent using OpenMPI, rather than MVAPICH.

If the MPI_VENDOR environment variable is not defined, but MPI_INTERCONNECT is set to ib, then MVAPICH will be used as the Infiniband communication mechanism.
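
Before compiling, a quick sanity check of these three variables can catch typos such as "infiniband" instead of "ib". The following is a minimal sketch, not an OSCER tool: the function name check_mpi_env is hypothetical, it accepts only the values listed above, and it insists that MPI_COMPILER and MPI_INTERCONNECT both be set explicitly, which is a stricter choice than the system itself may require.

```shell
# Hypothetical sanity check (not an OSCER tool) for the MPI environment
# variables, accepting only the values listed above.
check_mpi_env() {
    case "${MPI_COMPILER:-}" in
        intel|intel10|intel9|pgi|gnu|gcc|nag) ;;
        *) echo "unexpected MPI_COMPILER: '${MPI_COMPILER:-}'" >&2; return 1 ;;
    esac
    case "${MPI_INTERCONNECT:-}" in
        ib|gige|shmem) ;;
        *) echo "unexpected MPI_INTERCONNECT: '${MPI_INTERCONNECT:-}'" >&2; return 1 ;;
    esac
    case "${MPI_VENDOR:-}" in
        "") ;;                         # not set: MVAPICH is used with ib
        openmpi)
            # OpenMPI is described above as an Infiniband option only
            if [ "$MPI_INTERCONNECT" != "ib" ]; then
                echo "MPI_VENDOR=openmpi expects MPI_INTERCONNECT=ib" >&2
                return 1
            fi ;;
        *) echo "unexpected MPI_VENDOR: '$MPI_VENDOR'" >&2; return 1 ;;
    esac
    echo "MPI environment variables look consistent"
}

MPI_COMPILER=intel
MPI_INTERCONNECT=ib
check_mpi_env
```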

2. Compiler/Interconnect Compatibility Matrix

COMPILER / INTERCONNECT

Interconnect columns, in order:
ib (Infiniband hardware via MVAPICH software)
ib/openmpi (Infiniband hardware via OpenMPI software)
gige (Gigabit Ethernet hardware via TCP/IP software)
shmem (using RAM inside a single node)

gcc X X X X
g++ X X X X
g77 (soft linked to gfortran) X X X X
gfortran X X X X
icc X X X X
icpc X X X X
ifort X X X X
pgcc X X X X
pgCC X X X X
pgf77 X X X X
pgf90 X X X X
pghpf
nagf95 NOT YET NOT YET

E. How to Compile a CUDA Program

1. Before Compiling a CUDA Program

Before compiling a CUDA program, you need to set some environment variables:

CUDA_PATH

This environment variable indicates the location of the version of CUDA you're using.
PATH

This environment variable indicates the location(s) of executables that you're using.
LD_LIBRARY_PATH

This environment variable indicates the location(s) of libraries that you're using.

A quick discussion of how to set the environment variables is below.

CUDA_PATH

This environment variable indicates the location of the version of CUDA you're using.

OSCER supports multiple CUDA versions on Sooner.
Possible values that you can set CUDA_PATH to are:

/home/software/CUDA/2.3/install/toolkit/cuda
/home/software/CUDA/3.0/install/cuda
/home/software/CUDA/3.2/install/cuda
/home/software/CUDA/4.0/cuda

NOTE: As newer versions of CUDA become available, older versions will be deprecated and eventually removed from the system.

PATH

This environment variable indicates the location(s) of executables that you're using.

For CUDA compilation, simply edit the existing PATH by putting the CUDA directories in front of it, so that PATH becomes:

${CUDA_PATH}/bin:${CUDA_PATH}/open64/bin:${PATH}

NOTE: "${CUDA_PATH}" means: take the value of CUDA_PATH and substitute it here.

LD_LIBRARY_PATH

This environment variable indicates the location(s) of libraries that you're using. For CUDA compilation, set LD_LIBRARY_PATH to:

${CUDA_PATH}/lib:${CUDA_PATH}/open64/lib:${CUDA_PATH}/lib64:${LD_LIBRARY_PATH}

This value means that the Nvidia CUDA Compiler will look in the indicated locations for the libraries needed for successful compilation.
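
Put together, the three CUDA settings might look like the following for bash, ksh or zsh. This is a sketch, assuming that you want the CUDA 4.0 install listed above; substitute whichever version you actually use. It departs from the literal LD_LIBRARY_PATH value in one small way: it appends the old value only when there was one, which avoids leaving a trailing colon when LD_LIBRARY_PATH was previously unset.

```shell
# bash/ksh/zsh syntax. The CUDA 4.0 path is one of the values listed above;
# substitute the version that you actually use.
export CUDA_PATH=/home/software/CUDA/4.0/cuda

# Put the CUDA executables in front of the existing PATH.
export PATH="${CUDA_PATH}/bin:${CUDA_PATH}/open64/bin:${PATH}"

# ${VAR:+...} expands only if VAR is already set and non-empty, so the old
# value (with its separating colon) is appended only when there is one.
export LD_LIBRARY_PATH="${CUDA_PATH}/lib:${CUDA_PATH}/open64/lib:${CUDA_PATH}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"

# Show the first two PATH entries to confirm the change.
echo "$PATH" | tr ':' '\n' | head -n 2
```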

3. How to Set Environment Variables

How to set these environment variables depends on which Unix shell you are using. If you're not sure which Unix shell you're using (or if you're not sure what a "Unix shell" is), then type this at the Unix prompt and then press the "Enter" key:

ps

The response will look something like this:

  PID TTY          TIME CMD
28826 pts/19   00:00:00 tcsh
30567 pts/19   00:00:00 ps

If your shell is tcsh, then to set the environment variables, type this (perhaps with different values for the environment variables) at the Unix prompt:

setenv MPI_COMPILER intel
setenv MPI_INTERCONNECT ib

If your shell is bash or ksh or zsh, then to set the environment variables, type this at the Unix prompt:

export MPI_COMPILER=intel
export MPI_INTERCONNECT=ib

If your shell is sh, then to set the environment variables, type this at the Unix prompt:

MPI_COMPILER=intel
export MPI_COMPILER
MPI_INTERCONNECT=ib
export MPI_INTERCONNECT

If your shell is something else, then please contact us and we'll try to help.
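
The per-shell differences above boil down to a small lookup. The sketch below prints the command(s) you would type for a given shell, variable and value; the function name env_set_command is made up for illustration, and csh is included alongside tcsh on the assumption that it uses the same setenv syntax.

```shell
# Hypothetical helper (not an OSCER tool): print the command(s) that set
# an environment variable in the given shell, following the text above.
# Usage: env_set_command SHELL NAME VALUE
env_set_command() {
    case "$1" in
        tcsh|csh)     echo "setenv $2 $3" ;;     # csh is an assumption; same syntax as tcsh
        bash|ksh|zsh) echo "export $2=$3" ;;
        sh)           printf '%s=%s\nexport %s\n' "$2" "$3" "$2" ;;
        *)            echo "unknown shell: $1" >&2; return 1 ;;
    esac
}

env_set_command tcsh MPI_COMPILER intel
env_set_command bash MPI_INTERCONNECT ib
```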

4. How to Compile an MPI Program After You've Set the Environment Variables

After you've set the environment variables, compile using one of the following compile commands:

mpicc for C code

NOTE: Use this INSTEAD OF icc, gcc or pgcc.
mpiCC for C++ code

NOTE: Use this INSTEAD OF icpc, g++ or pgCC.
mpif77 for Fortran 77 code

NOTE: Use this INSTEAD OF ifort, g77, gfortran, nagf95, or pgf77.
mpif90 for Fortran 90 code

NOTE: Use this INSTEAD OF ifort, gfortran, nagf95, or pgf90.

In all other details, the usage is exactly the same as when compiling a non-parallel code. In particular, you will use the same compiler options as you would have used for the compiler family that you chose when setting MPI_COMPILER.

For example, to compile an MPI code written in Fortran 90 using the Intel compiler family, the compile command might start with:

mpif90 -O -march=core2 -mtune=core2 ...

WARNING! Be sure that EVERY SINGLE COMPILE COMMAND in your compilation procedure, including the final compile command for linking the executable, uses the EXACT SAME COMPILER OPTIONS!
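
As a sketch of how this fits together, the snippet below picks the MPI compile command for a language and reuses one options variable for every compile and link command. The function name mpi_wrapper and the language labels (c, c++, f77, f90) are hypothetical, and the commands are printed rather than run, so the snippet works even where the MPI wrappers aren't installed.

```shell
# Hypothetical helper (not an OSCER tool): print the MPI compile command
# for a source language, following the list above.
mpi_wrapper() {
    case "$1" in
        c)   echo "mpicc" ;;
        c++) echo "mpiCC" ;;
        f77) echo "mpif77" ;;
        f90) echo "mpif90" ;;
        *)   echo "unknown language: $1" >&2; return 1 ;;
    esac
}

# Same options as for a non-parallel Intel build, defined once.
MPIFC=$(mpi_wrapper f90)
FFLAGS="-O -march=core2 -mtune=core2"

# Print the commands for a two-file example.
for f in mysourcefile1 mymainprogram; do
    echo "$MPIFC $FFLAGS -c $f.f90"
done
echo "$MPIFC $FFLAGS -o myexecutable mymainprogram.o mysourcefile1.o"
```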

Running

I. DON'T RUN INTERACTIVELY!

Except for very small, very brief non-MPI runs on a few CPU cores within a single node, ALL runs should be performed using the LSF batch queue system.

II. Run Using the LSF Batch System

To start, you'll want to make a copy of one of the following batch script files:

~hneeman/example_nonparallel.bsub

Use this batch script file to run any batch job that uses EXACTLY ONE CPU CORE inside a single compute node (that is, a non-parallel job).
~hneeman/example_parallel_sharedmem.bsub

Use this batch script file to run any batch job that uses shared memory parallelism via either OpenMP or POSIX threads (pthreads) on up to 8 CPU cores inside a single compute node.
~hneeman/example_parallel_mpi.bsub

Use this batch script file to run any batch job that uses purely distributed parallelism via MPI on any number of nodes and any number of CPU cores per node (up to 8).
~hneeman/example_parallel_hybrid.bsub

Use this batch script file to run any batch job that uses a hybrid of two kinds of parallelism: (a) distributed parallelism via MPI on any number of nodes and any number of CPU cores per node (up to 8, though we recommend an upper limit of 4), and (b) shared memory parallelism via either OpenMP or POSIX threads (pthreads). We recommend that the total number of threads per node be at most 8, because you almost certainly don't want more threads than cores per node.
~hneeman/example_fat.bsub

Use this batch script file to run any batch job that uses 1 to 16 CPU cores inside a single fat node. The batch job can be either (a) non-parallel (serial) or (b) shared memory parallel via either OpenMP or POSIX threads (pthreads).
~hneeman/example_nonparallel_cuda.bsub

Use this batch script file to run any non-parallel CUDA batch job that uses 1 CPU core inside a single node but could use up to two CUDA devices.
~hneeman/example_parallel_mpi_cuda.bsub

Use this batch script file to run any parallel CUDA batch job that uses multiple CPU cores across multiple nodes AND uses multiple CUDA devices.

Once you've identified which batch script file to use, do this:

cp  ~hneeman/example_nonparallel.bsub  whatever.bsub

where whatever.bsub is the name of the batch script file that you want to create; typically, you'll replace whatever with the name of your executable or the experiment or something.

Note that you could use any of the other example_*.bsub files instead of example_nonparallel.bsub.

You should be able to modify your copy of the batch script file to suit your needs. It contains detailed information about how to set up and run a batch job.

IMPORTANT NOTE!!!
In your batch script, you MUST use the absolute FULL PATH for your executable! DON'T use a relative path, and DON'T leave out the path!
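
If you aren't sure what the full path of your executable is, you can have the shell work it out and then paste the result into your batch script. A minimal sketch, assuming a POSIX shell; the function name abs_path is hypothetical.

```shell
# Hypothetical helper (not an OSCER tool): print the absolute full path
# of a file, given either a relative or an absolute path to it.
abs_path() {
    # Change into the file's directory inside a subshell and ask for its name;
    # this fails if the directory doesn't exist.
    dir=$(cd "$(dirname "$1")" >/dev/null 2>&1 && pwd) || return 1
    echo "$dir/$(basename "$1")"
}

# Example: print the full path of myexecutable in the current directory.
abs_path ./myexecutable
```

The printed path always starts with a slash, which is what distinguishes an absolute path from a relative one.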
