Synthesis of Irregular Codes
Irregular codes are codes that manipulate any data structure more complicated than dense matrices, such as sparse matrices or recursive data structures. Very little is known about how irregular codes can be analyzed and restructured because the integer linear programming tools that work so well for regular codes are of little use in this context.
One approach to tackling this problem is to separate the specification of the algorithm from the specification of the data structure. If the algorithm can be coded in terms of an abstract data structure, it may be possible to analyze and restructure this high-level code before lowering the level of the code so that it uses the concrete data structure.
This idea is similar to generic programming in the sense that generic programming also encourages the separation of high-level algorithms from data structure implementations. The key difference is that in generic programming, the algorithm is described using an API that is supported directly by the concrete data structure. In our approach, we have two API's - one for the abstract data structure and one for the concrete data structure - and compiler technology is used to translate from the high-level API to the low-level one.
We have used this idea to address some difficult problems in the area of sparse matrix computations. Sparse matrices can be represented in many different formats, and even simple algorithms like matrix-vector product must be coded differently for each format. We have built a system that can be viewed as a library generator for sparse Basic Linear Algebra Subroutines (BLAS). We use dense matrices as our high-level abstract data structure. The interface to the (low-level) sparse formats is defined by an API that must be supported by designers of sparse matrix formats. The BLAS algorithms are coded in the system once and for all as though the matrices were dense.
To generate the sparse BLAS for a new sparse format, the user writes the format-specific code for supporting the low-level API, and then uses our compiler to generate a BLAS optimized for that format. We have shown that this approach generates efficient code for most commonly used formats.
The problem of generating efficient sparse code for matrix factorizations such as Cholesky and LU factorization appears to be much more difficult. It is also not known if this approach can be used for codes that deal with recursive data-structures such as trees and graphs.
Talks
- Relational Query Processing Approach to Compiling Sparse Matrix Codes Job Talk, 04/11/1998