|
Parallelizing FORTRAN 77 codes is a non-trivial process. As parallel
computers approach new levels of complexity and performance, the task
of program conversion becomes almost impossible without the aid of a
software tool like BERT.
BERT is designed to answer many of the difficult questions
concerning your program. All too often, users ask: "Can it
run in parallel?" Unfortunately, the answer to this question
does not provide any insight into an efficient parallelization. The
better question is: "Should it run in parallel?"
The following questions and answers provide an overview of BERT for
parallel computers.
Can my program run in parallel?
The concept of parallel computing offers exciting possibilities.
Accessing any new technology requires asking the right questions.
Parallel computing is no different.
As with any new technology, there are often incorrect "notions"
about what is possible and what is not possible. One such notion is
that almost unlimited parallel computing is possible by simply
connecting large numbers of processors. Ask anyone who has used a
parallel computer and you will find that this is just not true.
Another incorrect notion is "anything that can be run in
parallel should be run in parallel". More often than not, a
FORTRAN 77 program is converted for a message passing parallel
environment, either by hand or with some tool, and the user is
surprised to find the performance gain is not what is expected. Only
after careful investigation and experimentation can the program be
converted with good results.
So what is the problem? Finding concurrency is not the hard part:
some software tools and experienced people can do a good job of
identifying the concurrent portions of a program, that is, those
parts of a program that are independent of one another.
The real problem is whether a concurrent portion of the program should
be executed on multiple processors. The simple question "Can it run
in parallel?" now becomes "Should it run in parallel?"
Should my program run in parallel?
If communication time is negligible compared to computation time,
then all concurrent portions should execute in parallel. In reality,
communication time is often not negligible. Because communication time
plays an important role in parallel performance, the user is forced to
ask the question:
"Should this concurrent portion be executed on multiple nodes?"
The answer is not simple to determine. It requires knowledge of
communication time, computation time, and the number of processors.
Conversion tools that ignore this question have solved only half the
problem. Making all concurrent parts of a program parallel may
actually decrease performance. Of course, some FORTRAN codes are
obviously parallel, but others are not, and even with
"concurrency detectors" users must resort to trial and error to
find a good parallelization.
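The trade-off between communication time, computation time, and the
number of processors can be illustrated with a very small cost model.
The sketch below is only an illustration and is not BERT's actual
analysis: the function names are hypothetical, and it assumes perfect
load balance and a fixed communication cost that is not overlapped
with computation.

```python
def estimated_parallel_time(compute_time, comm_time, n_procs):
    """Estimate run time when the work is split evenly across n_procs.

    Assumes perfect load balance and a fixed communication cost that is
    not overlapped with computation (both are simplifying assumptions).
    """
    return compute_time / n_procs + comm_time


def should_run_in_parallel(compute_time, comm_time, n_procs):
    """Return True only if the parallel estimate beats serial execution."""
    return estimated_parallel_time(compute_time, comm_time, n_procs) < compute_time


# Coarse-grained section: 10 s of work, 0.5 s of messaging, 4 processors.
# Estimate: 10/4 + 0.5 = 3.0 s, which is less than 10 s.
print(should_run_in_parallel(10.0, 0.5, 4))

# Fine-grained section: 0.2 s of work, 0.5 s of messaging, 4 processors.
# Estimate: 0.2/4 + 0.5 = 0.55 s, which is more than 0.2 s.
print(should_run_in_parallel(0.2, 0.5, 4))
```

Both sections are concurrent, but under this model only the first one
is worth running on multiple processors.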
Finally, consider the problems of moving a parallel program from one
platform to another. We all know parallel codes may be run on
workstation clusters, but is a parallelization for workstations the
best one for a dedicated parallel machine? And will a parallelization
done for one parallel computer still be the best choice after nodes
are added or the machine is upgraded?
These can be difficult and expensive questions to answer.
I do not think automatic tools create efficient codes.
Why should I use such a tool?
There is no substitute for an experienced (and expensive)
programmer, but there are tools that can help reduce the effort
required for parallelizing FORTRAN programs. In fact, as you will
learn, a tool like BERT can make determinations about your program
that would be nearly impossible to do by hand.
If you have no experience with parallel programming, BERT can
examine your program and produce a parallelization quickly. Further
optimizations can be done using BERT as a guide.
BERT will leave your original source code unmodified (except for
comments). Optimizations are done within the context of your original
source code. You can make changes in your source program, analyze it
with BERT, and determine if your change will lead to better
parallelizations - without running on a parallel computer!
What does BERT do exactly?
BERT is designed to automatically convert FORTRAN 77 to message
passing FORTRAN programs. It is a "source to source"
conversion tool. How automatic the conversion is depends upon how the
original program was written. There is no software tool that can do a
fully automatic conversion for all FORTRAN programs.
Why is BERT different from other conversion tools?
The BERT analysis uses communication time and computation time to
determine the best method for parallelizing your program. Other tools
often do not consider the effect of these times. BERT will identify
parallelism and it will only schedule parallelism if it is expected to
produce a faster program. One thing to remember: "concurrent does
not mean parallel".
Because BERT includes communication time, it is possible to give a
performance estimate for the conversion. With this information, it is
possible to determine what parts of your program will give the best "parallel
payoff".
What do you mean by "concurrent does not mean parallel"?
Using our definition, concurrent means that two or more things can
be done independently, but not necessarily on different processors.
Multiple users on a workstation constitute a concurrent operation.
Parallel means that two or more things can be done independently at
the same time on different processors. Just because something is
concurrent does not mean that you can increase the program speed by
making it parallel. BERT will examine your program and schedule those
concurrent sections that can be economically executed in parallel.
Does communication time determine whether something that is concurrent is
scheduled as parallel?
Yes, but you must also consider the computation time. Because
communication time and computation time vary from machine to machine,
BERT will convert and schedule your program based on these parameters.
This information is contained in a special file (fun.std) which is
determined for each host parallel computer. Perhaps what is most
interesting is that, depending upon communication and computation
time, BERT will create different programs for different machines. "What
is parallel for some may not be parallel for all."
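To see how the same concurrent section can be worth parallelizing on
one machine and not on another, consider the following sketch. The
MachineProfile fields and all numbers are hypothetical and chosen only
for illustration; they do not describe the contents of fun.std or any
real machine, and the cost model is a simplified assumption rather
than BERT's own.

```python
from dataclasses import dataclass


@dataclass
class MachineProfile:
    """Hypothetical per-machine parameters (not BERT's actual file format)."""
    name: str
    seconds_per_flop: float   # cost of one floating-point operation
    message_latency: float    # fixed cost per message, in seconds
    seconds_per_byte: float   # transfer cost per byte


def parallel_is_faster(machine, flops, n_messages, bytes_sent, n_procs):
    """Compare a serial estimate with a simple parallel estimate."""
    serial = flops * machine.seconds_per_flop
    comm = (n_messages * machine.message_latency
            + bytes_sent * machine.seconds_per_byte)
    parallel = serial / n_procs + comm
    return parallel < serial


# Illustrative numbers only: slow network versus fast interconnect.
cluster = MachineProfile("workstation cluster", 1e-8, 1e-3, 1e-6)
dedicated = MachineProfile("dedicated parallel machine", 1e-8, 5e-5, 1e-8)

# The same loop on both: 2 million flops, 8 messages, 80,000 bytes, 4 processors.
for machine in (cluster, dedicated):
    print(machine.name, parallel_is_faster(machine, 2_000_000, 8, 80_000, 4))
```

With these assumed numbers the loop stays serial on the cluster, where
messaging costs more than the computation saved, and is scheduled as
parallel on the dedicated machine.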
Do you mean that if I spend the time to convert a program by hand for a
cluster of workstations using PVM and then move it to a parallel computer,
it may not be efficient?
Exactly! And the converse is also true: if you convert a program for a
parallel computer, that conversion may not be the best for another
environment. It is very difficult to consider all the parameters that
determine a good parallelization. Because parallel environments are so
different, it is an almost impossible task to do by hand. Furthermore,
there are various parallelization models that one could consider, and
the choice of model also has an effect on performance.
What do you mean by parallelization models?
There are several ways to solve a problem in parallel. For instance,
the most common "static" method is to spread a large array
across several processors and perform independent computations on each
processor. Depending on the program, a "dynamic" model may
produce better performance than a static model. Again, which model
is best depends upon communication time and computation time. More
information about parallelization models is available.
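The difference between the two models can be sketched with a toy
simulation. This is an illustration under stated assumptions, not a
description of how BERT implements either model: the static model is
taken to be a contiguous block split of the work, the dynamic model
hands each task to whichever processor becomes free first, and
dispatch_cost is a hypothetical per-task messaging overhead.

```python
import heapq


def static_makespan(task_times, n_procs):
    """Static model: contiguous blocks of tasks, one block per processor."""
    block = -(-len(task_times) // n_procs)  # ceiling division
    return max(sum(task_times[i:i + block])
               for i in range(0, len(task_times), block))


def dynamic_makespan(task_times, n_procs, dispatch_cost):
    """Dynamic model: each task goes to the processor that frees up first.

    dispatch_cost is a hypothetical per-task messaging overhead.
    """
    finish = [0.0] * n_procs  # min-heap of processor finish times
    for t in task_times:
        earliest = heapq.heappop(finish)
        heapq.heappush(finish, earliest + dispatch_cost + t)
    return max(finish)


uniform = [1.0] * 8                               # evenly sized tasks
skewed = [4.0, 4.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]  # two expensive tasks first

for name, tasks in (("uniform", uniform), ("skewed", skewed)):
    print(name, static_makespan(tasks, 4), dynamic_makespan(tasks, 4, 0.25))
```

With evenly sized tasks the static split finishes first because it
pays no per-task overhead; with uneven tasks the dynamic model wins
because no processor is left holding both expensive tasks. Raising
dispatch_cost shifts the balance back toward the static model.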
Can BERT help me determine the best model?
Yes. BERT supports both a static/data parallel model and a
dynamic/data flow model. Actually, version 1.1 of BERT will support
multiple models within the same program.
What about standards?
Good question. First, as a user of BERT, you
will continue to interact with your FORTRAN program as you have in the
past. Your original program remains intact. You or BERT may provide
parallelization hints about your program, but these are only in the
form of comments. By maintaining your original program as FORTRAN 77,
BERT can decide the best conversion for the current environment.
For message passing, BERT currently supports PVM and some host
specific libraries. Recall, however, that just because BERT has
produced a PVM version of your program for one environment, this is no
guarantee that this program will run optimally in a different
environment. This is why calling a parallel program "standard
PVM" can be misleading. It will run in several environments, but it
may not run efficiently in all of them.
It seems that BERT is a good tool because the communication times may be
quite different from machine to machine?
You are absolutely correct. BERT can provide insights into the best
parallelization for a specific machine that may be difficult to
determine otherwise. It is possible that a parallelization produced by
BERT on one machine may not be efficient for another machine.
How is BERT different from FORTRAN compilers?
There are several differences. BERT is a "source-to-source"
translation tool. Source code produced by BERT must be compiled by a
FORTRAN compiler. The most important difference between BERT and
FORTRAN compilers is that BERT does interprocedural analysis. This
analysis allows automatic parallelization to be performed - including
loops with function calls.
Another major difference is parallelization models. BERT currently
supports a message passing environment and two parallelization models
- dynamic and static. Depending on the application, one model may be
more beneficial than the other. Having both models at your disposal
allows the best parallelization to be determined for your application.
Is BERT available now?
Yes, please see the download page.
|