12.4 Serial Polyadic DP Formulations
Floyd's algorithm for determining the shortest paths between all pairs of nodes in a graph can be reformulated as a serial polyadic DP formulation.
12.4.1 Floyd's All-Pairs Shortest-Paths Algorithm
Consider a weighted graph G, which consists of a set of nodes V and a set of edges E. An edge from node i to node j in E has a weight ci, j. Floyd's algorithm determines the cost di, j of the shortest path between each pair of nodes (i, j) in V (Section 10.4.2). The cost of a path is the sum of the weights of the edges in the path.
Let be the minimum cost of a path from node i to node j, using only nodes v0, v1, ..., vk-1. The functional equation of the DP formulation for this problem is
Equation 12.6

Since is the shortest path from node i to node j using all n nodes, it is also the cost of the overall shortest path between nodes i and j . The sequential formulation of this algorithm requires n iterations, and each iteration requires time Q(n2). Thus, the overall run time of the sequential algorithm is Q(n3).
Equation 12.6 is a serial polyadic formulation. Nodes can be partitioned into n levels, one for each value of k. Elements at level k + 1 depend only on elements at level k. Hence, the formulation is serial. The formulation is polyadic since one of the solutions to requires a composition of solutions to two subproblems and from the previous level. Furthermore, the dependencies between levels are sparse because the computation of each element in requires only three results from the preceding level (out of n2).
A simple CREW PRAM formulation of this algorithm uses n2 processing elements. Processing elements are organized into a logical two-dimensional array in which processing element Pi,j computes the value of for k = 1, 2, ..., n. In each iteration k, processing element Pi,j requires the values , , and . Given these values, it computes the value of in constant time. Therefore, the PRAM formulation has a parallel run time of Q(n). This formulation is cost-optimal because its processor-time product is the same as the sequential run time of Q(n3). This algorithm can be adapted to various practical architectures to yield efficient parallel formulations (Section 10.4.2).
As with serial monadic formulations, data locality is of prime importance in serial polyadic formulations since many such formulations have sparse connectivity between levels.
|