11.1 Definitions and Examples

A discrete optimization problem can be expressed as a tuple (S, f). The set S is a finite or countably infinite set of all solutions that satisfy specified constraints. This set is called the set of feasible solutions. The function f is the cost function that maps each element in set S onto the set of real numbers R.

The objective of a DOP is to find a feasible solution x_opt, such that f (x_opt) f (x) for all x S.

Problems from various domains can be formulated as DOPs. Some examples are planning and scheduling, the optimal layout of VLSI chips, robot motion planning, test-pattern generation for digital circuits, and logistics and control.

Example 11.1 The 0/1 integer-linear-programming problem

In the 0/1 integer-linear-programming problem, we are given an m x n matrix A, an m x 1 vector b, and an n x 1 vector c. The objective is to determine an n x 1 vector x whose elements can take on only the value 0 or 1. The vector must satisfy the constraint

and the function

must be minimized. For this problem, the set S is the set of all values of the vector x that satisfy the equation Ax b.

Example 11.2 The 8-puzzle problem

The 8-puzzle problem consists of a 3 x 3 grid containing eight tiles, numbered one through eight. One of the grid segments (called the "blank") is empty. A tile can be moved into the blank position from a position adjacent to it, thus creating a blank in the tile's original position. Depending on the configuration of the grid, up to four moves are possible: up, down, left, and right. The initial and final configurations of the tiles are specified. The objective is to determine a shortest sequence of moves that transforms the initial configuration to the final configuration. Figure 11.1 illustrates sample initial and final configurations and a sequence of moves leading from the initial configuration to the final configuration.

Figure 11.1. An 8-puzzle problem instance: (a) initial configuration; (b) final configuration; and (c) a sequence of moves leading from the initial to the final configuration.

graphics/11fig04.gif

The set S for this problem is the set of all sequences of moves that lead from the initial to the final configurations. The cost function f of an element in S is defined as the number of moves in the sequence.

In most problems of practical interest, the solution set S is quite large. Consequently, it is not feasible to exhaustively enumerate the elements in S to determine the optimal element x_opt. Instead, a DOP can be reformulated as the problem of finding a minimum-cost path in a graph from a designated initial node to one of several possible goal nodes. Each element x in S can be viewed as a path from the initial node to one of the goal nodes. There is a cost associated with each edge of the graph, and a cost function f is defined in terms of these edge costs. For many problems, the cost of a path is the sum of the edge costs. Such a graph is called a state space, and the nodes of the graph are called states. A terminal node is one that has no successors. All other nodes are called nonterminal nodes. The 8-puzzle problem can be naturally formulated as a graph search problem. In particular, the initial configuration is the initial node, and the final configuration is the goal node. Example 11.3 illustrates the process of reformulating the 0/1 integer-linear-programming problem as a graph search problem.

Example 11.3 The 0/1 integer-linear-programming problem revisited

Consider an instance of the 0/1 integer-linear-programming problem defined in Example 11.1. Let the values of A, b, and c be given by

graphics/11fig05.gif

The constraints corresponding to A, b, and c are as follows:

graphics/11fig06.gif

and the function f (x) to be minimized is

Each of the four elements of vector x can take the value 0 or 1. There are 2⁴ = 16 possible values for x. However, many of these values do not satisfy the problem's constraints.

The problem can be reformulated as a graph-search problem. The initial node represents the state in which none of the elements of vector x have been assigned values. In this example, we assign values to vector elements in subscript order; that is, first x₁, then x₂, and so on. The initial node generates two nodes corresponding to x₁ = 0 and x₁ = 1. After a variable x_i has been assigned a value, it is called a fixed variable. All variables that are not fixed are called free variables.

After instantiating a variable to 0 or 1, it is possible to check whether an instantiation of the remaining free variables can lead to a feasible solution. We do this by using the following condition:

Equation 11.1

The left side of Equation 11.1 is the maximum value of that can be obtained by instantiating the free variables to either 0 or 1. If this value is greater than or equal to b_i , for i = 1, 2,..., m, then the node may lead to a feasible solution.

For each of the nodes corresponding to x₁ = 0 and x₁ = 1, the next variable (x₂) is selected and assigned a value. The nodes are then checked for feasibility. This process continues until all the variables have been assigned and the feasible set has been generated. Figure 11.2 illustrates this process.

Figure 11.2. The graph corresponding to the 0/1 integer-linear-programming problem.

graphics/11fig08.gif

Function f (x) is evaluated for each of the feasible solutions; the solution with the minimum value is the desired solution. Note that it is unnecessary to generate the entire feasible set to determine the solution. Several search algorithms can determine an optimal solution by searching only a portion of the graph.

For some problems, it is possible to estimate the cost to reach the goal state from an intermediate state. This cost is called a heuristic estimate. Let h(x) denote the heuristic estimate of reaching the goal state from state x and g(x) denote the cost of reaching state x from initial state s along the current path. The function h is called a heuristic function. If h(x) is a lower bound on the cost of reaching the goal state from state x for all x, then h is called admissible. We define function l(x) as the sum h(x) + g(x). If h is admissible, then l(x) is a lower bound on the cost of the path to a goal state that can be obtained by extending the current path between s and x. In subsequent examples we will see how an admissible heuristic can be used to determine the least-cost sequence of moves from the initial state to a goal state.

Example 11.4 An admissible heuristic function for the 8-puzzle

Assume that each position in the 8-puzzle grid is represented as a pair. The pair (1, 1) represents the top-left grid position and the pair (3, 3) represents the bottom-right position. The distance between positions (i, j) and (k, l) is defined as |i - k| + | j - l|. This distance is called the Manhattan distance. The sum of the Manhattan distances between the initial and final positions of all tiles is an estimate of the number of moves required to transform the current configuration into the final configuration. This estimate is called the Manhattan heuristic. Note that if h(x) is the Manhattan distance between configuration x and the final configuration, then h(x) is also a lower bound on the number of moves from configuration x to the final configuration. Hence the Manhattan heuristic is admissible.

Once a DOP has been formulated as a graph search problem, it can be solved by algorithms such as branch-and-bound search and heuristic search. These techniques use heuristics and the structure of the search space to solve DOPs without searching the set S exhaustively.

DOPs belong to the class of NP-hard problems. One may argue that it is pointless to apply parallel processing to these problems, since we can never reduce their worst-case run time to a polynomial without using exponentially many processors. However, the average-time complexity of heuristic search algorithms for many problems is polynomial. Furthermore, there are heuristic search algorithms that find suboptimal solutions for specific problems in polynomial time. In such cases, bigger problem instances can be solved using parallel computers. Many DOPs (such as robot motion planning, speech understanding, and task scheduling) require real-time solutions. For these applications, parallel processing may be the only way to obtain acceptable performance. Other problems, for which optimal solutions are highly desirable, can be solved for moderate-sized instances in a reasonable amount of time by using parallel search techniques (for example, VLSI floor-plan optimization, and computer-aided design).