## Dynamic programming

*Dynamic programming is used to solve many problems arising in operations research, which is why we do not list the tutorials related to this method.*

Dynamic programming is an exact method for solving optimization problems, due mainly to R. Bellman (1957).

More precisely, dynamic programming is an algorithm-design paradigm that can be applied to a problem whenever it satisfies Bellman's principle of optimality.

**Definition (Bellman's principle of optimality).** A problem has the optimal substructure property if an optimal solution contains optimal solutions to its subproblems.

Dynamic programming is similar in spirit to the divide-and-conquer method. The significant difference between the two is that dynamic programming allows subproblems to overlap: a subproblem may be used in the solutions of two different larger problems, whereas divide and conquer creates separate subproblems that can be solved independently of each other.

The fundamental difference between the two methods is therefore that the subproblems of dynamic programming may interact, while in divide and conquer they do not.

A second difference, as illustrated by the figure above, is that divide and conquer is recursive: the recursion takes the global problem and splits it down to elementary problems. Dynamic programming computes from the bottom up: we start by solving the smallest subproblems, and by combining their solutions we obtain the solutions of increasingly large subproblems.

## Principle

The paradigm breaks down into four steps:

- Characterize the structure of an optimal solution.
- Recursively define the value of an optimal solution.
- Compute the value bottom-up.
- Construct the solution from the bottom-up computation.

We build a table to memorize the computations already carried out: each cell corresponds to the solution of exactly one intermediate problem, with one cell for the final solution. It must therefore be possible to determine in advance which subproblems will be treated during the computation.

There are two approaches to populating the table:

- Iterative (bottom-up): we initialize the cells corresponding to the base cases, then fill the table in a precise order, to be determined: we start with the problems of smallest "size" and finish with the solution of the main problem, so that each computation uses only solutions already calculated.
- Recursive (memoization): same principle as the iterative approach, but this method computes only what is strictly necessary to reach the given objective.
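The two table-filling approaches can be illustrated on a tiny example. The following sketch uses Fibonacci numbers (my choice of illustration, not from the text): the first function fills the table bottom-up from the base cases, the second computes top-down and memoizes only the entries actually needed.

```python
from functools import lru_cache

# Iterative (bottom-up): fill the table from the base cases upward.
def fib_iterative(n):
    table = [0, 1] + [0] * max(0, n - 1)
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]

# Recursive (memoization): compute only the entries that are needed.
@lru_cache(maxsize=None)
def fib_memoized(n):
    if n < 2:
        return n
    return fib_memoized(n - 1) + fib_memoized(n - 2)

print(fib_iterative(10), fib_memoized(10))  # both 55
```

Both variants solve each subproblem exactly once; the overlap between subproblems (fib(n-1) and fib(n-2) share almost all their recursive calls) is what makes the table worthwhile.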

## Example: product of matrices

Let **n** matrices M_{1}, …, M_{n}, where matrix M_{i} has m_{i} rows and m_{i+1} columns and real entries. We seek to compute the product M_{1} * … * M_{n} so as to minimize the number of scalar operations.

We denote by c_{ij} the number of operations needed to compute M_{i} * … * M_{j}. We then have c_{ii} = 0 and c_{i(i+1)} = m_{i} * m_{i+1} * m_{i+2} operations. We split the subproblem by computing M_{i} * … * M_{k} and M_{k+1} * … * M_{j}, giving c_{ij} = min [c_{ik} + c_{(k+1)j} + m_{i} * m_{k+1} * m_{j+1}] with **k** from **i** to **j-1**; the last term is the cost of multiplying the product of matrices **i** to **k** with that of matrices **k+1** to **j**.

We thus obtain the following recurrence:

- c_{ij} = 0 if i = j;
- c_{i(i+1)} = m_{i} * m_{i+1} * m_{i+2};
- c_{ij} = min [c_{ik} + c_{(k+1)j} + m_{i} * m_{k+1} * m_{j+1}] over i <= k < j otherwise.
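The recurrence can be computed bottom-up, diagonal by diagonal. Here is a minimal Python sketch (function and variable names are mine) that fills the cost table for a chain described by its dimension sequence:

```python
def matrix_chain_cost(m):
    # Matrix i (0-indexed) has m[i] rows and m[i + 1] columns;
    # the chain contains len(m) - 1 matrices.
    n = len(m) - 1
    INF = float("inf")
    c = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for length in range(2, n + 1):        # size of the sub-chain
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):         # split point
                cost = c[i][k] + c[k + 1][j] + m[i] * m[k + 1] * m[j + 1]
                if cost < c[i][j]:
                    c[i][j] = cost
    return c

# The m_i values from the text: 30, 35, 15, 5, 10, 20, 25
c = matrix_chain_cost([30, 35, 15, 5, 10, 20, 25])
print(c[0][5])  # minimum cost for M_1 * ... * M_6 -> 15125
```

The three nested loops give the classical O(n^3) running time, with O(n^2) space for the table.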

The dynamic-programming table records the number of operations for each choice of sub-product. Consider the following example:

| i | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| m_{i} | 30 | 35 | 15 | 5 | 10 | 20 | 25 |

The initial table is as follows:

| c_{ij} | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | 0 | – | – | – | – | – |
| 2 | – | 0 | – | – | – | – |
| 3 | – | – | 0 | – | – | – |
| 4 | – | – | – | 0 | – | – |
| 5 | – | – | – | – | 0 | – |
| 6 | – | – | – | – | – | 0 |

Then we can compute the values c_{i(i+1)} for products of two matrices (no substructure needed yet):

| c_{ij} | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | 0 | 15750 | – | – | – | – |
| 2 | – | 0 | 2625 | – | – | – |
| 3 | – | – | 0 | 750 | – | – |
| 4 | – | – | – | 0 | 1000 | – |
| 5 | – | – | – | – | 0 | 5000 |
| 6 | – | – | – | – | – | 0 |

The rest of the table is filled diagonal by diagonal according to the rule described above:

| c_{ij} | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | 0 | 15750 | 7875 | 9375 | 11875 | 15125 |
| 2 | – | 0 | 2625 | 4375 | 7125 | 10500 |
| 3 | – | – | 0 | 750 | 2500 | 5375 |
| 4 | – | – | – | 0 | 1000 | 3500 |
| 5 | – | – | – | – | 0 | 5000 |
| 6 | – | – | – | – | – | 0 |

The result is obtained for i = 1 and j = 6, i.e. the top-right cell of the table. The minimum cost is therefore 15125 operations. A new question then arises: in which order should the products be carried out to achieve this minimum number of operations?

When computing the minimum cost of a cell, we choose among the possible split points. For example, to compute c_{13} we take the minimum between (M_{1} M_{2}) * M_{3}, of cost c_{12} + m_{1} * m_{3} * m_{4}, and M_{1} * (M_{2} M_{3}), of cost c_{23} + m_{1} * m_{2} * m_{4}. It suffices to record the choice made in order to recover the order of the matrix products.

| K_{ij} | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | – | – | 1 | 3 | 3 | 3 |
| 2 | – | – | – | 3 | 3 | 3 |
| 3 | – | – | – | – | 3 | 3 |
| 4 | – | – | – | – | – | 5 |
| 5 | – | – | – | – | – | – |
| 6 | – | – | – | – | – | – |

The table reads as follows: to compute M_{i} * … * M_{j}, we set k = K_{ij} as given by the table, then we compute M_{i} * … * M_{k} and M_{k+1} * … * M_{j}, which we finally multiply together.

In our example, to compute c_{16} we combine c_{13} and c_{46}; to compute c_{13} we compute M_{1} * (M_{2} M_{3}); to compute c_{46} we compute (M_{4} M_{5}) * M_{6}.

Most algorithms based on dynamic programming keep in memory the choice made at each computation: often the final value matters less than the path taken to reach it.

## Example: making change

We want to make change for £67 using the minimum number of coins, with coin values 1, 5, 10 and 25.

Here it is easy to guess the optimal solution: 67 = 2*25 + 10 + 5 + 2*1. By always choosing the coin with the highest possible value, we obtain this solution (a greedy algorithm).

The problem is stated as follows. Let D = {d_{1}, …, d_{k}} be a finite set of coin values. Each d_{i} is assumed to be an integer, the set is sorted by increasing value, and each coin value is available in unlimited quantity. The problem is to make change for an amount of £n with a minimum number of coins; if the value 1 belongs to D, then a solution always exists.

The greedy method does not always give an optimal solution. For example, take D = {25, 10, 1} and n = 30: the greedy method gives 25 + 5*1 (six coins), which is worse than 3*10 (three coins).

*Step 1: Characterize the optimal substructure.* Define C[j] as the minimum number of coins needed for the amount £j. Removing one coin d_{i} from an optimal solution for j leaves an optimal solution for j - d_{i}, so C[j] = 1 + min_{i} C[j - d_{i}].

*Step 2: Value of the optimal solution.* We can recursively define the optimal solution from the substructure.

*Step 3: Algorithm.*

    Coin-Changing(n, d, k)
        C[0] = 0
        For j from 1 to n
            C[j] = inf
            For i from 1 to k
                If j >= d[i] and 1 + C[j - d[i]] < C[j] then
                    C[j] = 1 + C[j - d[i]]
                    Denom[j] = d[i]
        Return C
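The pseudocode translates into a short runnable sketch. This Python version (names are mine) also reconstructs the chosen coins from the Denom array:

```python
def coin_changing(n, denominations):
    """Minimum number of coins to make each amount 0..n.
    Returns (C, denom): C[j] = min coins for j, denom[j] = last coin used."""
    INF = float("inf")
    C = [0] + [INF] * n
    denom = [0] * (n + 1)
    for j in range(1, n + 1):
        for d in denominations:
            if j >= d and 1 + C[j - d] < C[j]:
                C[j] = 1 + C[j - d]
                denom[j] = d
    return C, denom

def coins_used(denom, j):
    """Trace back through denom to recover the actual coins."""
    coins = []
    while j > 0:
        coins.append(denom[j])
        j -= denom[j]
    return coins

C, denom = coin_changing(67, [1, 5, 10, 25])
print(C[67], sorted(coins_used(denom, 67)))  # 6 coins: 1+1+5+10+25+25
```

The running time is O(n * k) for amount n and k coin values, since each cell of C is filled once per denomination.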

We use an additional array Denom, where Denom[j] is the coin used to obtain an optimal solution for the amount £j. By repeatedly jumping from j to j - Denom[j] until we reach j = 0, we recover the coins that were selected. Take the example of the coin values 1, 5, 10, 25:

| j | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| C[j] | 0 | 1 | 2 | 3 | 4 | 1 | 2 | 3 | 4 | 5 | 1 | 2 | 3 | 4 | 5 | 2 |
| Denom[j] | 0 | 1 | 1 | 1 | 1 | 5 | 1 | 1 | 1 | 1 | 10 | 1 | 1 | 1 | 1 | 5 |

A related question: from an amount £N, count all the combinations of coins that make the change. For example, for N = 4 and D = {1, 2, 3} there are four solutions: {1,1,1,1}, {1,1,2}, {2,2} and {1,3}.

*Step 1:* Optimal substructure. The solutions counted by C(N, m) can be partitioned into two sets:

- solutions containing no coin of value d_{m};
- solutions containing at least one coin of value d_{m}.

If a solution contains no coin d_{m}, then it solves the subproblem for N with D = {d_{1}, …, d_{m-1}}; these are the solutions counted by C(N, m-1).

If a solution contains a coin d_{m}, then removing one coin d_{m} leaves a solution of the subproblem for N - d_{m} with D = {d_{1}, …, d_{m}}, i.e. one counted by C(N - d_{m}, m).

*Step 2:* The recurrence is C(N, m) = C(N, m-1) + C(N - d_{m}, m), with the base conditions:

- C(N, m) = 1 if N = 0;
- C(N, m) = 0 if N < 0;
- C(N, m) = 0 if N >= 1 and m <= 0.
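The recurrence and its base conditions translate directly into code. A minimal Python sketch (names are mine; for large inputs one would memoize or tabulate, as elsewhere in this section):

```python
def count_ways(N, D, m=None):
    """C(N, m): number of ways to make N using the first m coin values of D."""
    if m is None:
        m = len(D)
    if N == 0:
        return 1                      # one way: the empty combination
    if N < 0 or m <= 0:
        return 0                      # no way to make a negative amount,
                                      # or a positive amount with no coins
    # Solutions without coin d_m, plus solutions with at least one d_m.
    return count_ways(N, D, m - 1) + count_ways(N - D[m - 1], D, m)

print(count_ways(4, [1, 2, 3]))       # 4, as in the text
print(count_ways(15, [1, 5, 10, 25]))
```

Note that combinations, not orderings, are counted: splitting on the largest-indexed coin ensures {1,3} and {3,1} are counted once.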

*Step 3:* The values computed by the algorithm are recorded in a table of the form:

| d_{m} \ N | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 5 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 | 4 |
| 10 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 4 | 4 | 4 | 4 | 4 | 6 |
| 25 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 4 | 4 | 4 | 4 | 4 | 6 |