The SIMD version presented here has been implemented for the MasPar MP-2. The MasPar may be programmed in MPL (MasPar Programming Language), as presented in Chapter 2, and in MP-Fortran (MasPar Fortran, F77 with some F90 and HPF-like syntax).
The MasPar is organized as a frontend/backend architecture (Figure ), where the DEC 3000 workstation provides compiler support and network access and starts applications, while the MasPar backend is used for computation.
Figure: The general MasPar architecture
Since there are no C++ compilers for the MasPar, and the DEC C++ compiler does not cooperate well with the MPL compiler and linker, an AT&T version 3.1 cfront-based C++ compiler was configured to produce code that runs on the frontend and sends requests to the backend. This MasPar implementation therefore runs the C++ program on the frontend computer, from which the scope sends requests to the MasPar backend for computation and storage of matrices.
The numerical library used is the MasPar Mathematical Library [] (MPML), which is designed for the MasPar. The MPML interface differs somewhat from BLAS and LAPACK and offers neither their flexibility nor their functionality, but it provides both an MPL and an MP-Fortran interface for some, though not all, operations.
This frontend/backend arrangement was necessary because of difficulties encountered early in the project, when attempts were made to call MasPar Fortran subroutines from MPL and from the frontend in order to use the BLAS implementation for the MasPar [] developed by researchers at the department. These attempts showed that calling MasPar Fortran through its parallel interface was quite difficult, if not impossible. The alternative, calling through the Fortran 77 interface, would at the very least have required a lot of complicated memory administration, since that interface requires data to be located on the frontend. MPML was therefore selected, as it offered better and less complex means of handling the required operations.
The MasPar implementation, as shown in Figure , has five layers: the application, the SymPLA top-level classes, two MasPar implementation layers, and the MPML library. A full description of this implementation is available in the implementation document [].
Figure: Organization of SymPLA for the MasPar
The MasPar implementation layers handle the frontend data required to identify the matrix, provide the necessary public information about it, make the initial preparations for an operation, and initiate the functions that the operation requires on the backend. The backend then uses the arguments sent to it to exchange data with the frontend and to call the MPML operations.
Direct use of submatrices is somewhat less advanced in this implementation than in the other versions, as some of the MPML functions cannot operate on submatrices, and almost none can operate on a transposed matrix.
The Matrixscope and Vectorscope classes in this version use a local storage method similar to the one used in the sequential version. The data pointer is, however, replaced with an unsigned long integer that holds the pointer to the backend structure object describing the backend storage of the matrix. The type was changed to set this pointer apart from frontend pointers.
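class Matrixscope{
private:
    struct matstore{
        /* pointer to the backend structure, kept as an integer
           to set it apart from frontend pointers */
        unsigned long matrixdescriptor;
        int size_x,size_y;
        int refs;
    } *storage;
    int *xsel,xselcount;
    int *ysel,yselcount;
    int *solverpermutation,permsize;
};

class Vectorscope{
private:
    struct vecstore{
        /* pointer to the backend structure, kept as an integer */
        unsigned long vectordescriptor;
        int size;
        int refs;
    } *storage;
    int *sel,selcount;
};

/* Backend (MPL) structure describing the backend storage of a matrix */
struct dpumatrix{
    int size_x,size_y;
    int ldx;
    plural double *matrixdata;
};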
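The refs field in matstore and vecstore suggests that several scopes can share one storage record. The following is a minimal sketch of that idea, assuming that refs is a reference count. ScopeHandle, MatStore, and the g_backendFrees counter are hypothetical names that do not appear in SymPLA; the counter merely stands in for a request to the backend to destroy the matrix.

```cpp
// Hypothetical stand-in for the frontend storage record shown above.
struct MatStore {
    unsigned long matrixdescriptor; // opaque backend handle, never dereferenced on the frontend
    int size_x, size_y;
    int refs;                       // assumed: number of scopes sharing this record
};

// Hypothetical counter standing in for "ask the backend to free the matrix".
static int g_backendFrees = 0;

class ScopeHandle {
public:
    ScopeHandle(unsigned long descriptor, int sx, int sy)
        : storage_(new MatStore) {
        storage_->matrixdescriptor = descriptor;
        storage_->size_x = sx;
        storage_->size_y = sy;
        storage_->refs = 1;
    }
    // Copying a scope shares the record instead of copying backend data.
    ScopeHandle(const ScopeHandle& other) : storage_(other.storage_) {
        ++storage_->refs;
    }
    ScopeHandle& operator=(const ScopeHandle& other) {
        if (storage_ != other.storage_) {
            release();
            storage_ = other.storage_;
            ++storage_->refs;
        }
        return *this;
    }
    ~ScopeHandle() { release(); }

    int refs() const { return storage_->refs; }
    unsigned long descriptor() const { return storage_->matrixdescriptor; }

private:
    // Drops one reference; the last owner releases the record.
    void release() {
        if (--storage_->refs == 0) {
            ++g_backendFrees; // a real version would send a destroy request here
            delete storage_;
        }
    }
    MatStore* storage_;
};
```

With this arrangement the backend matrix would be destroyed only when the last scope referring to it goes away, which keeps frontend copies cheap.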
The MasPar implementation can call a number of MPL routines that are located and executed on the DPU backend of the MasPar system. These routines handle creation and destruction of the matrix data held by the DPU, assignment between matrices, setting and extracting data, and the arithmetic operations. The routines are implemented so that each operation uses the entire matrix it is given. An argument must therefore be copied to a temporary matrix if it is a submatrix or is transposed. The same applies if an operand may overlap the destination matrix, i.e., if it is part of the destination matrix.
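The copy rule can be sketched as a small predicate. This is an illustration only: OperandView and needsTemporary are hypothetical names, and overlap is approximated conservatively by comparing backend descriptors, which is an assumption rather than the library's actual test.

```cpp
// Hypothetical frontend view of an operand; not part of SymPLA.
struct OperandView {
    unsigned long descriptor; // identifies the backend storage of the matrix
    bool isSubmatrix;         // the scope selects only part of the matrix
    bool isTransposed;        // the scope presents the matrix transposed
};

// Returns true when the operand has to be copied to a temporary matrix
// before the backend routine, which works on whole matrices, may use it.
bool needsTemporary(const OperandView& operand, const OperandView& destination) {
    if (operand.isSubmatrix || operand.isTransposed) {
        return true;
    }
    // Assumption: two views sharing a descriptor may overlap.
    return operand.descriptor == destination.descriptor;
}
```

A whole, untransposed operand stored separately from the destination is passed through unchanged; every other case pays for one extra copy on the backend.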
The general algorithm of an operation is as follows: the frontend layers prepare the operation and copy operands to temporary matrices where necessary, as above, and the backend then executes the operation. If any step fails, the backend returns an error code to the frontend, which throws an exception.
Presently, exceptions can be handled only at a very low level, with a number of macros and inline functions that provide the functionality needed to report the exception and exit the application. This is necessary because the cfront compiler does not support exceptions.
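One way such macros and inline functions could look is sketched below. It is not the SymPLA code: the BackendStatus values, describeException, throwException, and the SYMPLA_THROW and SYMPLA_CHECK macros are hypothetical names chosen for the illustration, and the message format is likewise an assumption.

```cpp
#include <cstdio>
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical error codes returned by a backend request.
enum BackendStatus {
    BACKEND_OK = 0,
    BACKEND_OUT_OF_MEMORY = 1,
    BACKEND_BAD_ARGUMENT = 2
};

// Builds the text that is reported for an emulated exception.
inline std::string describeException(const char* what, int code,
                                     const char* file, int line) {
    std::ostringstream os;
    os << file << ':' << line << ": exception: " << what
       << " (backend code " << code << ')';
    return os.str();
}

// Emulated "throw": report the exception and exit the application.
inline void throwException(const char* what, int code,
                           const char* file, int line) {
    std::fprintf(stderr, "%s\n",
                 describeException(what, code, file, line).c_str());
    std::exit(EXIT_FAILURE);
}

// The macros record where the failure was detected.
#define SYMPLA_THROW(what, code) \
    throwException((what), (code), __FILE__, __LINE__)

// Checks a backend status and "throws" when it signals a failure.
#define SYMPLA_CHECK(status, what)            \
    do {                                      \
        int status_ = (status);               \
        if (status_ != BACKEND_OK) {          \
            SYMPLA_THROW((what), status_);    \
        }                                     \
    } while (0)
```

A frontend routine would wrap each backend request in SYMPLA_CHECK, so that a nonzero error code terminates the program with a message naming the source location.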
Element function evaluation is implemented on the MasPar so that a function can be called either on the frontend or on the backend. A function called on the backend is assumed to be written in MPL and to run in parallel over a number of different elements.
As the MasPar must use a special calling interface, the library needs to know whether the function resides on the frontend or the backend of the system. The array MatrixRemoteFunctions, which holds MatrixRemoteFunctionCount function addresses, determines where a function runs: if the function to be called is in the array, it must be called on the backend; otherwise the call is handled on the frontend.
/* In MPL */
visible plural double func(plural double, plural int, plural int);

/* In C++ */
extern "C" {
    double func(double, int, int);
}
double (*MatrixRemoteFunctions[])(double, int, int) = {func};
int MatrixRemoteFunctionCount = 1;
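The lookup described above amounts to a linear search for the function's address in MatrixRemoteFunctions. The sketch below shows one way to write it; runsOnBackend, ElementFunction, scaleByIndex, and addOne are hypothetical names, and both functions are defined on the frontend here only so that the example compiles on its own.

```cpp
// Signature shared by all element functions.
typedef double (*ElementFunction)(double, int, int);

// Hypothetical element functions. In the real setting the first one
// would be written in MPL and exist on the backend.
double scaleByIndex(double value, int i, int j) { return value * (i + j); }
double addOne(double value, int, int) { return value + 1.0; }

// Table of functions that must be called on the backend.
ElementFunction MatrixRemoteFunctions[] = { scaleByIndex };
int MatrixRemoteFunctionCount = 1;

// Returns true if f is registered as a backend function.
bool runsOnBackend(ElementFunction f) {
    for (int k = 0; k < MatrixRemoteFunctionCount; ++k) {
        if (MatrixRemoteFunctions[k] == f) {
            return true;
        }
    }
    return false;
}
```

A function that is not listed, such as addOne here, would be evaluated element by element on the frontend, which is simpler to write but requires the data to be moved there.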