next up previous contents
Next: Parallel versions Up: Implementations Previous: Implementations

Sequential version

The sequential version is intended for use on a stand-alone machine, used for testing and debugging smaller problems, or when parallelism is not available.

This implementation uses BLAS and LAPACK to handle calculations, and has safe treatment of the data by extracting the matrix data needed into allocated memory, if necessary, otherwise the (sub)matrix is used directly. Then the operation is executed and if temporary memory is used, the results are written back into the original matrix. This method of implementation ensures that the handling of the matrices is data-safe, that is, no matrix element is overwritten before it has been used for the actual operation. On the other hand, this can cause a large overhead in some operations, where the number of operations is very small compared to the amount of data.

The four levels of the sequential version as shown in Figure gif, are the Application, defined by the user, the SymPLA Class Library discussed in Chapter 3, the Sequential Scope Class presented here, and the LAPACK library used for computation, which is third party software. A full description of this implementation is available in the implementation document [].

   figure1301
Figure: Organization of SymPLA on a sequential architecture

The sequential scope class layer implements the storage and administration of data, and the operations used to perform calculations for the SymPLA library. Most of the calculations are in turn performed using the appropriate BLAS and LAPACK routines, in the bottom layer.

The submatrix functionality is handled by having the scope hold all information about which rows and columns are selected, and let all the information about size and location of the original matrix be stored in a separate, shared structure called matstorage, as was shown in Chapter 3. This structure is pointed to by the member storage. The matstorage object can be shared by many submatrices. The last Matrixscope object destroyed will destroy the matstorage object (the last one may not be the original matrix). This is handled by the reference counter in the matstorage object, which is updated by the constructors and destructors of the scopes.

The following extracts show how the data-structures are organized:

  struct constant_temp_matrix{
    double *data1;
    const double *data;
    MatrixTranspose transposed;
    int allocated;
    int m,n,lda;
  };

  class Matrixscope{
    private:
      friend Vectorscope;

      struct matstore{
        double *matrixdata;
        int size_x,size_y;
        int refs;
      } *storage;
      int *xsel,xselcount;
      int *ysel,yselcount;
      int *solverpermutation,permsize;

      struct mtempdata{
        double *data;
        MatrixTranspose transposed,paratrans;
        int allocated;
        int m,n,lda;
      }tempmat,originaltemp;
  };



  struct constant_temp_vector{
    double *data1;
    const double *data;
    int allocated;
    int n;
  };

  class  Vectorscope{
    public:
      struct vecstore{
        double *vectordata;
        int size;
        int refs;
      } *storage;
      int *sel,selcount;

      struct vtempdata {
        double *data;
        int allocated;
        int n;
      }tempvec,originaltemp;
  };

When an operation in one of the scopes is called, the arguments are checked for validity, and the data from the arguments are, if necessary, loaded into temporary storage. The scope classes are able to handle arguments that are continuous block submatrices, that is, they are composed of all the elements in an unbroken, unpermuted and rectangular portion of the submatrix. In some cases, this is not enough, because the submatrix is transposed, and the operation cannot handle a transposition for that particular argument, i.e., the matrix multiplication. Another example occurs when two of the arguments are submatrices of the same matrix, and one of them is updated by the operation. In both of these special cases, as well as when the submatrix is permuted in some manner, the matrix must be copied to temporary storage.

This administration is handled by the operations preparetemp, unloadtemp and freetemp, which use the members tempmat and originaltemp of the structure mtempdata, and the structure constant_temp_matrix. A similar arrangement exists for vectors.

The member originaltemp is used to set up the necessary values and the pointer if the matrix is a continuous submatrix. This work is done by the scope's constructor. The preparetemp functions are then used to decide whether or not to use the specified block in the matrix, if it is available, based on the operations sensitivities. Otherwise, memory is allocated, and the submatrix is extracted into the temporary array. If the submatrix is a target, the function unloadtemp must be used before calling the function freetemp which, if necessary, deallocates any allocated memory. The function unloadtemp will, if the operation has used temporary memory, copy the (sub)matrix stored in the temporary memory back into the original matrix. Otherwise, it will do nothing.

This approach has some difficulties when the submatrix is an argument declared const, and which cannot be changed. This problem is solved by having a separate structure constant_temp_matrix, which has a pointer data to const double instead of double as the scope member tempmat. If the submatrix can be used directly, the data is set to point directly into the matrix. Otherwise, memory is allocated and handled by the member data1. This way, both the direct use of the submatrix, and the temporary matrix can be used with the same syntax, while preserving the integrity of the datagif.

When constant_temp_matrix is initialized, it is sent into a second preparetemp function with the otherwise same interface, and similarly for the freetemp.

Only the operations set and extract directly manipulate the scope's submatrix or subvector. All other functions use these operations to transfer data between the submatrix and the temporary memory allocated for the operation.

The general algorithm of the operations in the sequential version is as follows:

If any of the steps fails, the operation throws an exception.

Presently, exceptions can be handled only on a low level on computers supporting exception. If the compiler does not support exceptions, a number of macros and inline functions provide the necessary functionality to report the exception and exit the application.


next up previous contents
Next: Parallel versions Up: Implementations Previous: Implementations