Beowulf Performance Suite V 1.3-1
March 22, 2005
Douglas Eadline Douglas@Eadline.org
www.basement-supercomputing.com

Purpose: 
========

This package is a collection of performance analysis programs for use 
with Beowulf clusters. The suite provides a graphical user 
interface for running the programs as well as HTML generation of the output.


Quick Start:
============


1) Install the rpm - "rpm -ivh <rpmfile>" 
   If the rpm fails dependencies due to missing packages
   add the packages and retry.
   (See below for more information.)
2) To run the NAS suite, set MPICH_HOME to your MPICH installation
   path and LAM_HOME to your LAM-MPI installation path. If you wish
   to use LAM-MPI, also make sure LAM's bin directory is in your PATH
   so that LAM can start on the nodes.
3) "man bps" and the README.bps file in the
    nas tar ball are your friends.
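For example, if MPICH and LAM-MPI were installed under /opt (the paths below are only assumptions; substitute your actual installation paths), step 2 might look like:

```shell
# Example paths only -- substitute your actual MPICH and LAM-MPI locations.
export MPICH_HOME=/opt/mpich
export LAM_HOME=/opt/lam

# Put LAM's bin directory in PATH so LAM can start on the nodes.
export PATH="$LAM_HOME/bin:$PATH"
```

Placing these lines in your shell startup file (e.g. ~/.bashrc) ensures they are also set in remote shells on the nodes.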

Important Notes:
================

The bps suite is best run as a regular user. Some of the tests (e.g. the
NAS parallel benchmarks) will not run as root.

Not all features of the command line interface are available in the GUI.

When using Netpipe/Netperf Benchmarks, rsh with no password must be 
permitted between the nodes upon which the benchmark is to be run.
This behavior is typical of most clusters. 

Under normal operation, bps will always overwrite the existing log 
directory. You can use the -w option to prevent this from happening. 
If you want bps-html to convert the results of earlier runs as well, 
copy the previous log files (from the older log directories) 
into the current log directory first.
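For example, to carry the results of an earlier run into the current log directory before HTML conversion (the directory names here are hypothetical; substitute your own):

```shell
# Hypothetical log directory names -- substitute your actual directories.
OLD_LOG=bps-log.old
CUR_LOG=bps-log

mkdir -p "$CUR_LOG"
# Copy results from the older run into the current log directory;
# ignore the copy if the old directory does not exist or is empty.
cp -r "$OLD_LOG"/* "$CUR_LOG"/ 2>/dev/null || true

# Then convert everything in the current log directory:
# bps-html "$CUR_LOG"
```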

Also, the tests have been designed so that the bps rpm only needs to be
installed on the head node. For this to work, the bps log directory must be
mounted on all nodes (e.g. under /home). 
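One common arrangement (a sketch of a typical site setup, not a bps requirement) is to keep the log directory under /home on the head node and export /home to the compute nodes via NFS. The /etc/exports line on the head node might then look like the following (the node names are hypothetical):

```
/home   node1(rw) node2(rw) node3(rw) node4(rw)
```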
 
The NAS tests were originally designed to work with the LAM, MPICH, and 
MPI/PRO MPIs and the GNU, PGI, and Intel compilers.  Current versions of 
the MPI/PRO, PGI, and Intel packages have not been tested and probably 
will not work.
                                                                                
If problems arise when using the NAS Parallel Benchmarks, please
see the NAS documentation for more information. Most issues involve 
MPI/compiler configuration or linking. To make this as easy 
as possible, the benchmark scripts have been written to rely on the
two environment variables LAM_HOME and MPICH_HOME for LAM-MPI and 
MPICH. These variables MUST point to the appropriate MPI installation. 
If you are having problems with the NAS benchmarks, extract the 
npb.tar.gz archive in the /opt/bps/src directory and try running the scripts
manually. (You may also use the -k option to preserve the directory from 
which the NAS tests were run. This directory will be under your bps-log 
directory. Consult the README.bps file for more information.)
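As a sketch of running the NAS scripts by hand (the script name and location follow the description above; check README.bps for the exact invocation and arguments):

```shell
# Extract the NAS benchmark sources shipped with bps.
cd /opt/bps/src
tar -xzf npb.tar.gz
cd npb

# Run the suite script directly; see README.bps for the arguments it expects.
./run_suite
```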


Install Procedure:
==================

Using the rpm file:  
(version numbers may vary)

  rpm -i bps-1.3-1.i386.rpm

Using the source rpm file:
(Do this only if the rpm does not install on your system)

  rpm -i bps-1.3-1.src.rpm  (install src rpm)

  rpmbuild -bb bps.spec  (build the rpm)

  rpm -i /usr/src/redhat/RPMS/i386/bps-1.3-1.i386.rpm  (install the rpm)


Using the tarball:

 tar -xvzf <bps tarball>.tar.gz
 cd <bps dir>
 sh build-all

This will put all important files in ~bps/bin and ~bps/src. 


Usage:
======

bps
	run benchmarks included in bps from command line 

  Options:
    -b                            bonnie++
    -s                            stream
    -f <send node>,<receive node> netperf to remote node
    -p <send node>,<receive node> netpipe to remote node
    -n <compiler>,<#processors>,  NAS parallel benchmarks
     <test size>,<MPI>,           compiler={gnu,pgi,intel}
     <machine1,machine2,...>      test size={A,B,C,dummy}
                   		  MPI={mpich,lam,mpipro}
    -k                            keep NAS directory when finished
    -u                            unixbench
    -m                            lmbench
    -l <log_dir>                  benchmark log directory
    -w                            preserve existing log directory
    -i <mboard manufacturer>,     machine information
       <mboard model>,<memory>
       <interconnect>,<linux ver>
    -v                            show version
    -h                            show this help

bps-html <log directory>

	generate html output files based on files in <log directory> 		
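As an illustration of the option list above (the node names and the log directory name are hypothetical, and the exact comma-separated -n syntax should be checked against "man bps"):

```shell
# Run the NAS class A tests on 4 processors with the GNU compilers and
# LAM-MPI, logging to the directory "mylog", then generate HTML output.
bps -n gnu,4,A,lam,node1,node2,node3,node4 -l mylog
bps-html mylog
```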


In Case of Problems:
====================

The BPS suite is a collection of many tests. You should have minimal or
no problems with the single machine tests. As more machines are involved
in the tests, there is more room for configuration errors to arise.

If a test does not run, the best thing to do is to check the "test_name.log"
file in the log directory. In the case of the NAS tests, the results are
in files of the form npb.COMPILER.MPI.CLASS.PROCESSORS.  In general, if you
are having problems with a test, it may be best to run it from the command
line. In the case of the NAS suite, the "-k" option will keep the npb
directory in the log directory so you can run the tests more directly by
using the "run_suite" script in the npb directory. The README.bps
file in the npb directory should also provide more information on how the
tests are run and how to resolve possible problems.


Background:
===========

General:
http://www.basement-supercomputing.com

bonnie++ - hard drive performance
Reference: http://www.coker.com.au/bonnie++/

stream - memory performance
Reference: http://www.cs.virginia.edu/stream/

netperf - general network performance
Reference: http://www.netperf.org/netperf/NetperfPage.html

netpipe - detailed network performance
Reference: http://www.scl.ameslab.gov/Projects/ClusterCookbook/nprun.html

unixbench - general Unix benchmarks
Reference: http://www.linuxdoc.org/HOWTO/Benchmarking-HOWTO.html#toc3

LMbench - low level benchmarks
Reference: http://www.bitmover.com/lmbench/

NAS - parallel tests
Reference: http://www.nas.nasa.gov/Software/NPB/

The following is a description of the NAS tests.

BT is a simulated CFD application that uses an implicit
  algorithm to solve the 3-dimensional (3D) compressible Navier-Stokes
  equations. The finite-difference solution to the problem
  is based on an Alternating Direction Implicit (ADI) approximate
  factorization that decouples the x, y, and z dimensions.
  The resulting systems are block-tridiagonal with 5x5 blocks
  and are solved sequentially along each dimension.

SP is a simulated CFD application that has a similar structure
  to BT. The finite-difference solution to the problem
  is based on a Beam-Warming approximate factorization that
  decouples the x, y, and z dimensions. The resulting system
  has scalar pentadiagonal bands of linear equations that
  are solved sequentially along each dimension.

LU is a simulated CFD application that uses the symmetric successive
  over-relaxation (SSOR) method to solve a seven-block-diagonal
  system resulting from finite-difference discretization
  of the Navier-Stokes equations in 3D by splitting it into
  block lower and upper triangular systems.

FT contains the computational kernel of a 3D fast Fourier
  Transform (FFT)-based spectral method. FT performs three
  one-dimensional (1D) FFTs, one for each dimension.

MG uses a V-cycle MultiGrid method to compute the solution
  of the 3D scalar Poisson equation. The algorithm works
  continuously on a set of grids that range between coarse
  and fine. It tests both short- and long-distance data movement.

CG uses a Conjugate Gradient method to compute an approximation
  to the smallest eigenvalue of a large, sparse, unstructured
  matrix. This kernel tests unstructured grid computations
  and communications by using a matrix with randomly generated
  locations of entries.

EP is an Embarrassingly Parallel benchmark. It generates
  pairs of Gaussian random deviates according to a specific
  scheme. The goal is to establish the reference point for
  peak performance of a given platform.

