[ Objetivos ] [ Programa
] [ Organización ] [ Matrícula
] [ Evaluación ]
| Lunes, 13. | Alpha: A Functional Data-Parallel Language |
| Martes, 14. | Parallel Programming with Skeletons: Theory and Practice |
| Miércoles, 15. | The High Performance Fortran Compilation System ADAPTOR |
| Jueves, 16. | Parallel Programming: not as hard as you might think |
| Viernes, 17. | The PRAM programming Language Fork. |




The polyhedral model provides a unified framework for reasoning about massively parallel, regular computations. Originally developed in the context of the automatic synthesis of systolic arrays from systems of affine recurrence equations, it now also provides a quantitative foundation for automatic parallelization of loops.
Alpha is a functional data-parallel language that embodies the polyhedral
model. An Alpha program is described essentially by
equations involving variables defined on multi-dimensional domains.
By successive transformations, the program may be refined until it can
be interpreted either as an architecture or as an imperative program.
In this talk I will first present the foundations of the language (semantics,
normalization and transformations of Alpha programs). I will then
describe the analysis methods of the polyhedral model, and how they may
be used to compile Alpha to efficient imperative programs for sequential
and parallel machines.
The current trend in software industry is towards program components that can be composed to form large applications. In the parallel programming, the role of components can be taken by skeletons -- common patterns of parallelism. Skeletons come as carefully defined, reusable, parameterized programming forms with pre-packaged implementations on different parallel architectures. This tutorial will give an overview of the current research in the field, with the following topics covered:
High Performance Fortran (HPF) has been defined in
1992 to become a new standard for data parallel programming. It was defined
to overcome the difficulties of parallel programming by extending the Fortran
language by some features for data parallel programming. This talk gives
a short summary of the HPF language and compares this kind of parallel
programming
with the two main trends of parallel programming: OpenMP for shared-memory
architectures and the MPI message-passing model for distributed-memory
systems.
The public domain HPF compilation system ADAPTOR
has been developed at the SCAI institute of GMD during the last years.
The design of the system is outlined and the functionality is compared
with currently available commercial HPF compilers. For some typical examples
it will be shown how ADAPTOR compiles HPF programs to equivalent MPI and
OpenMP programs.
Our experiences in porting industrial applications
to HPF show the applicability and benefits of this language and the performance
results prove that a performance close to hand-coded message-passing programs
can be achieved also for irregular problems.
In principle, parallel computers can deliver performance way beyond the performance delivered by uniprocessor computers. Yet, they are not widely used, except maybe by some highly specialized institutes. One of the reasons is that parallel programming is thought to be difficult. In this talk I will dive into history in order to explain some of the difficulties associated with parallel programming. The focus will be on message-passing multicomputers. In the old days, each parallel program had to be carefully matched to the topology of the parallel computer it was supposed to run on. Only processors which were physically adjacent could communicate with each other. Researchers and industrial parties soon realized that this was not the right track and began to offer point-to-point communication between any pair of processors. Although much more convenient, it can still lead easily to malicious errors such as deadlock, i.e., the phenomenon that a processor P is waiting to receive a message from another processor Q, while Q is waiting to receive something from P. The conclusion is that both of them will wait forever.
One of the reasons that deadlocks are so easily introduced is that communication
is two-sided, i.e., for each send operation, a
matching receive has to be issued on the destination processor. If
a receive is forgotten or if they do not appear in exactly the
right order, it is likely the program will show ``unexpected''behavior.
The Bulk-Synchronous Parallel (BSP) programming model avoids these problems
by making all communication one-sided.
There are no (explicit) receive operations. However, there still has
to be a way to signal to a processor that data has arrived.
For this, BSP uses barrier synchronizations: points in the program
that all processors must reach before the computation may
continue. It will be shown by some examples that this makes parallel
programming much easier.
BSP is more than just a programming model. It also provides a cost model
which makes it relatively easy to determine the
cost of executing a parallel program. The question that will be addressed
in the last part of this talk will be: How good
is BSP at predicting performance?
The Parallel Random Access Machine (PRAM) is a well-known
parallel programming model that denotes a synchronous MIMD parallel computer
with a sequentially consistent shared memory. It is very popular in the
design and analysis of parallel
algorithms, because it completely abstracts from the cost of
interprocessor communication and hence allows to solely focus on
exploiting fine-grained parallelism and load balancing. Although the
PRAM was often claimed to be unrealizable, a very powerful PRAM variant
(Combining Priority CRCW PRAM) with 2048 processors (from the programmer's
point of view) is currently being built in hardware by the SB-PRAM group
of Prof. Wolfgang Paul at the University of Saarbrucken, Germany; a SB-PRAM
pre-prototype with 512 processors is already operational.
In this talk we present Fork, a parallel programming language
for the PRAM model. Fork has been implemented for the SB-PRAM. Thanks to
the availability of a fast software simulator, Fork programs can be run
on Solaris workstations as well.
Together with the simulator, Fork is an ideal tool for experimenting
with parallelism and a useful vehicle for teaching a first course in parallel
programming. Also, Fork is used by researchers in parallel algorithmics
as a tool for the verification and comparison of the practicality of PRAM
algorithms. Fork makes the powerful features of the PRAM programming
model,
namely the shared memory, globally synchronous execution, and deterministic
concurrent write conflict resolution, transparent
to the programmer at the language level. Hence, Fork allows for the
straightforward implementation of practically any parallel
algorithmic paradigm. We introduce the main design concepts of Fork
and illustrate them with several example programs.
We also shortly discuss the implementation of Fork on
the SB-PRAM and the main problems of porting it to asynchronous architectures.
Finally, we give an outlook to two successor languages, ForkLight (for
the Asynchronous PRAM model) and NestStep (for the BSP model of parallel
computation).