Packages Used: No new packages are used in this notebook.
Heuristic procedures for problems in combinatorial optimization look at a small fraction of possible solutions to find a local optimum that may not be the global optimum; the solutions provided are optimal solutions, only
Heuristics are only needed when it is not feasible to otherwise find an optimal solution. In cases where realistic size instances a problem can be formulated as a MILP and easily solved, for example, the Set Covering problem, there is no need to consider the use of heuristics; but in the case of the Bin Packing, even relatively small instances of size 200 were difficult BIP to solve and thus, the use of a heuristic is helpful; also, in some cases like Bin Packing, the heuristic solution can be used as an initial incumbent solution to improve the effectiveness of the BIP.
Heuristics: The computational effort is split between
The degree to which a problem is constrained affects the relative impact of construction versus improvement procedures. The following are two extremes:
Easy but ineffective construction followed by effective improvement:
Hard but effective construction, little improvement possible:
General formulation of the 1-D bin packing problem (where the capacity restriction is assumed to be cubic volume):
$ \begin{eqnarray*} \quad M &=& \bigl\{ 1, \dots, m \bigr\}, \quad \mbox{objects to be packed into bins} \\ v_j &=& \mbox{volume of object } n \\ V &=& \mbox{volume of each bin } B_i \bigl( \max v_j \le V \bigr) \\ B^* &=& \mathrm{arg}\underset{B}{\operatorname{min}} \bigl\{ \lvert B \rvert : \sum_{j \in B_i} v_j \le V, \bigcup_{B_i \in B} = M \bigr\}, \quad \mbox{min cost bin packing of } M. \end{eqnarray*}$
Returns array where each element is the indices of objects from v
in a bin. Each element of v
is the weight of the object, and each bin is packed so that its total weight of objects does not exceed V
:
function firstfit(v,V)
# Put first object into first bin
B = [[1]] # Array of arrays
b = [v[1]] # Array of scalars
# Add remaining objects to bins
for i = 2:length(v)
idx = findfirst(b .+ v[i] .<= V) # Find first bin object fits into
if ~isnothing(idx) # Found existing bin
b[idx] += v[i]
push!(B[idx],i)
else # Create new bin for object
push!(b,v[i])
push!(B,[i])
end
end
return B
end
firstfit (generic function with 1 method)
using Random
m = 200
M = 1:m
V = 20
Random.seed!(57839)
v = rand(1:10,m)
LB = ceil(sum(v)/V)
B = firstfit(v,V)
56-element Vector{Vector{Int64}}: [1, 2, 3, 19] [4, 5, 6] [7, 8, 9, 10, 27] [11, 12, 15, 21] [13, 14, 22] [16, 17, 24, 28, 37] [18, 20, 26] [23, 25, 30, 48] [29, 31, 38] [32, 33, 40] [34, 35, 45] [36, 39, 43] [41, 42, 52, 59, 67] ⋮ [159, 161, 162] [163, 164, 165, 170] [166, 167, 171] [168, 169, 173] [172, 174, 184] [175, 176, 177, 187, 189] [178, 180, 181] [183, 185, 194] [188, 190, 191] [192, 193, 197] [195, 196, 198] [200]
LB, length(B)
(55.0, 56)
Since it is not possible to easily improve a constructed bin-packing solution, the only easy way to find alternate solutions is to include randomness by randomly re-ordering the items v
used by firstfit
multiple times and return the best-constructed solution found:
function multifirstfit(v,V)
m = length(v)
TC, B = Inf, 0
for i = 1:m
Bi = firstfit(v[randperm(m)],V)
if length(Bi) < TC
TC, B = length(Bi), Bi
end
end
return B
end
Random.seed!(46345)
LB, length(multifirstfit(v,V))
(55.0, 55)
Since the solution found by multifirstfit
equals the LB, it is the global optimal solution. In general, this will not be the case, and the solution found by multifirstfit
is just the best-known solution.
using Random
m = 2000
M = 1:m
V = 200
Random.seed!(1244765)
v = rand(1:100,m)
ceil(sum(v)/V), length(firstfit(v,V)), length(multifirstfit(v,V))
(514.0, 520, 518)
using Random
m = 50
M = 1:m
V = m
Random.seed!(23542344)
v = rand(1:m,m)
ceil(sum(v)/V), length(firstfit(v,V)), length(multifirstfit(v,V))
(29.0, 34, 32)
In the linear assignment problem (LAP) we looked at in Net Mod 1, the cost of making an assignment of i to k did not depend on any other j being assigned a different l; in the quadratic assignment problem (QAP), the cost of making an assignment of i to k depends on the assignment of all other j's to particular l's. The QAP can be formulated as the following binary integer program (BIP):
$ \begin{eqnarray*} \quad \textbf{QAP:} \quad \mbox{Minimize} \quad \sum_{i \in N} \sum_{j \in N} \sum_{k \in N} \sum_{l \in N} c_{ijkl} x_{ik} x_{jl} \\ \quad \mbox{subject to} \quad \sum_{i \in N} x_{ik} &=& 1 \mbox{,} &\quad k \in N \\ \sum_{k \in N} x_{ik} &=& 1 \mbox{,} &\quad i \in N \\ x_{ik} &\in& \bigl\{ 0, 1 \bigr\}, \end{eqnarray*} $
where
$ \begin{eqnarray*} \quad x_{ik} &=& \begin{cases} 1, & \mbox{if } i \mbox{ is assigned to } k \\ 0, & \mbox{otherwise} \end{cases} \\ c_{ijkl} &=& \mbox{cost of assigning } i \mbox{ to } k \mbox{ if } j \mbox{ is assigned to } l. \end{eqnarray*} $
The objective function is quadratic because of the $x_{ik} x_{jl}$ terms, and, as a result, is an example of a nonlinear programming problem.
In comparison, the following LAP formulation is a linear programming problem because of the single $x_{ij}$ term in the objecive:
$ \begin{eqnarray*} \quad \textbf{LAP:} \quad \mbox{Minimize} \quad \sum_{i \in N} \sum_{j \in N} c_{ij} x_{ij} \\ \quad \mbox{subject to} \quad \sum_{j \in N} x_{ij} &\le& 1\mbox{,} &\quad i \in N \\ \sum_{i \in N} x_{ij} &=& 1\mbox{,} &\quad j \in N \\ x_i &\ge& 0. \end{eqnarray*} $
The QAP can be used to model layout problems, where a set of resources are assigned to a set of sites. The major cost is the cost of moving an item from site to site as different items visit different resources. The items can be parts being fabricated or patients visiting stations in a clinic. The sequence of resources needs by each item is termed its routing, and each resource can be shared between several items, making the total cost (typically, total distance) dependant on the assignment (or layout) of the resources to the sites.
In a layout problem, the QAP cost term can be separated into two terms:
$ \begin{eqnarray*} \quad c_{ijkl} &=& \mbox{cost of assigning resource } i \mbox{ to site } k \mbox{ if resource } j \mbox{ is assigned to site } l \\ &=& w_{ij} d_{kl} \end{eqnarray*} $
where
$ \begin{eqnarray*} \quad w_{ij} &=& \mbox{total flow from resource } i \mbox{ to resource } k \\ &=& \sum_{p \in P} w_{ijp} \\ w_{ijp} &=& \mbox{total flow from } i \mbox{ to } j \mbox{ for item } p \\ d_{kl} &=& \mbox{distance from site } k \mbox{ to site } l. \end{eqnarray*} $
In this example, A, B, and C are three different types of items transferred between four machine resources, as shown below. The total number of each item to be produced per shift is 24, 10, and 12, respectively, and their routings from machine to machine are:
$ \quad A: 1–2–3–4; \quad B: 2–4–1–2–3; \quad C: 3–4–1–2–4 $
There are four possbile sites at which the machines can be located. Each site is represented as a circle and its cooridinate along 1-D is given:
The site-to-site distance matrix D
is:
dist1(x,p) = abs.(x .- p)
p = [0, 10, 35, 50]
D = vcat([dist1(i,p) for i in p]'...)
4×4 Matrix{Int64}: 0 10 35 50 10 0 25 40 35 25 0 15 50 40 15 0
A machine-to-machine flow matrix W
can be created using the number of items produced each shift f
and the routing rte[i]
for each item i
, where each routing is the sequence of machines visited by the item:
f = [24,10,12]
rte = [[1,2,3,4], [2,4,1,2,3], [3,4,1,2,4]]
3-element Vector{Vector{Int64}}: [1, 2, 3, 4] [2, 4, 1, 2, 3] [3, 4, 1, 2, 4]
The number of machines n
can be determined by finding the maximum index in the routings:
n = maximum([maximum(i) for i in rte])
4
The machine-to-machine flow matrix can calculate using zip
to iterate from machine j
to machine k
along each routing:
W = zeros(n,n)
for i in 1:length(f), (j,k) in zip(rte[i][1:end-1],rte[i][2:end])
@show j,k
W[j,k] += f[i]
end
W
(j, k) = (1, 2) (j, k) = (2, 3) (j, k) = (3, 4) (j, k) = (2, 4) (j, k) = (4, 1) (j, k) = (1, 2) (j, k) = (2, 3) (j, k) = (3, 4) (j, k) = (4, 1) (j, k) = (1, 2) (j, k) = (2, 4)
4×4 Matrix{Float64}: 0.0 46.0 0.0 0.0 0.0 0.0 34.0 22.0 0.0 0.0 0.0 36.0 22.0 0.0 0.0 0.0
Given n machines and sites, each element $ w_{ij} $ of W
represents some measure of the "weight" between machine i and machine j. The machine-to-site assignment vector $\alpha$ can be used to translate machine-to-machine weights to site-to-site weights so that they can be combined with the site-to-site distances to calculate to total cost TC of an assignement/layout. Given $\alpha(k) = i$ and $\alpha(l) = j$, the weight between site k and site l is equal to the following element of the machine-to-machine weight matrix W
: $ w_{\alpha(k) \alpha(l)} = w_{i,j} $. For a given α
, W
, and D
,
$ \begin{eqnarray} \quad TC = \sum_{k \in N} \sum_{l \in N} w_{\alpha(k) \alpha(l)} d_{kl} \end{eqnarray} $
represents the total cost of the assignment.
The assignment vector $\alpha = \big[ 3, 4, 2, 1 \big] $ represents the assignment (or layout) of machines (squares) to sites (circles):
The total cost of this assignment can be determined by rearranging the rows and columns of the machine-to-machine matrix W
so that they correspond to the site-to-site matrix D
:
[W D] # Error: W is machine-to-machine and D is site-to-site
4×8 Matrix{Float64}: 0.0 46.0 0.0 0.0 0.0 10.0 35.0 50.0 0.0 0.0 34.0 22.0 10.0 0.0 25.0 40.0 0.0 0.0 0.0 36.0 35.0 25.0 0.0 15.0 22.0 0.0 0.0 0.0 50.0 40.0 15.0 0.0
TC = sum(W.*D)
3830.0
α = [3,4,2,1]
[W[α,α] D] # W[α,α] is site-to-site and D is site-to-site
4×8 Matrix{Float64}: 0.0 36.0 0.0 0.0 0.0 10.0 35.0 50.0 0.0 0.0 0.0 22.0 10.0 0.0 25.0 40.0 34.0 22.0 0.0 0.0 35.0 25.0 0.0 15.0 0.0 0.0 46.0 0.0 50.0 40.0 15.0 0.0
TC = sum(W[α,α].*D)
3670.0
Given an initial machine-to-site assignment array α
, a simple heuristic is to determine which pairwise interchange of machines results in the greatest decrease in TC. If no interchange reduces TC, then keep the initial assignment; otherwise, make the pairwise interchange and then continue by looking at pairwise interchanges of the machines in the new assignment array.
The steepest descent pairwise interchange (SDPI) improvement procedure for QAP returns the total cost TC
and assignment α
of local optimum given initial assignment α
, machine-to-machine weight matrix W
, and site-to-site distance matrix D
:
function sdpi(αᵒ,W,D)
TCᵒ, n, cnt = sum(W[αᵒ,αᵒ].*D), length(αᵒ), 1
println(TCᵒ,": ",αᵒ)
done = false
while !done
done = true # Assume no improvement will be found
α₀ = copy(αᵒ) # Copy as current in case improvement found
for i = 1:n-1, j = i+1:n
cnt += 1
α = copy(α₀) # Work on copy of current assignment
α[[j,i]] .= α[[i,j]] # Pairwise interchange of i and j
TC = sum(W[α,α].*D) # Calc. TC of interchange
if TC < TCᵒ # Improvement found
done = false
TCᵒ, αᵒ = copy(TC), copy(α)
end
end
println(TCᵒ,": ",αᵒ) # Output best interchange found at each step
end
return (TC = TCᵒ, α = αᵒ, cnt = cnt) # Named tuple used to pass multiple outputs
end
sdpi (generic function with 1 method)
α = [3,4,2,1]
res = sdpi(α,W,D)
3670.0: [3, 4, 2, 1] 3670.0: [3, 4, 2, 1]
(TC = 3670.0, α = [3, 4, 2, 1], cnt = 7)
res = sdpi([4,3,2,1],W,D)
3770.0: [4, 3, 2, 1] 3670.0: [3, 4, 2, 1] 3670.0: [3, 4, 2, 1]
(TC = 3670.0, α = [3, 4, 2, 1], cnt = 13)
res.TC, res.α, res.cnt
(3670.0, [3, 4, 2, 1], 13)
TC = sdpi([4,3,2,1],W,D).TC
3770.0: [4, 3, 2, 1] 3670.0: [3, 4, 2, 1] 3670.0: [3, 4, 2, 1]
3670.0
In order to illustrate the operation of the SDPI heuristic, a "pairwise interchange graph" can be constructed for small (i.e., less than five machines/sites) problems. In such a graph, each node represents one of the n! possible assignment arrays, and each arc represents a pairwise interchange that transforms assignment into another via an interchange of the elements at two positions. Each node is connected via arcs representing all of the possible pairwise interchanges from this assignment. The value assigned to each node is the total cost TC. Although the global optimum assignment can be read off the graph for small problems, there would be too many nodes to make this feasible for larger problems.
In this example, the pairwise interchange graph has 4! = 24 nodes corresponding to all of the possible assignments of 4 machines to 4 sites. The arc from _a_1 to _a_2 represents the interchange of machines 3 and 4 located at sites 3 and 4. The global optimum assignment is _a_18, corresponding to TC = 3,670.
sdpi([2,1,3,4],W,D)
4170.0: [2, 1, 3, 4] 3830.0: [1, 2, 3, 4] 3680.0: [1, 2, 4, 3] 3680.0: [1, 2, 4, 3]
(TC = 3680.0, α = [1, 2, 4, 3], cnt = 19)
Determining a layout for the kitchen shown below corresponds to assigning each of the seven different appliances (the resources or "machines") to one of the sites in the kitchen. In the figure below left, the red dots in front of each site location correspond to the location in the kitchen at which a person would stand when using the appliance.
The sequence of appliances that are visited while preparing an estimated number of the meals per week (the routings) are as follows:
f = [25,10,7,2,6,8]
rte = [
[4,3],
[7,4,5,1],
[7,1,7,4,7,1],
[4,5,3,5,4,2,1],
[6,5,4,1,5,2,3,5,1],
[1,5,1,4,1,5]
]
6-element Vector{Vector{Int64}}: [4, 3] [7, 4, 5, 1] [7, 1, 7, 4, 7, 1] [4, 5, 3, 5, 4, 2, 1] [6, 5, 4, 1, 5, 2, 3, 5, 1] [1, 5, 1, 4, 1, 5]
Return weight matrix of flow volumes using routings rte
and their frequencies f
n = maximum([maximum(i) for i in rte])
W = zeros(n,n)
for i in 1:length(f), (j,k) in zip(rte[i][1:end-1],rte[i][2:end])
W[j,k] += f[i]
end
W
7×7 Matrix{Float64}: 0.0 0.0 0.0 8.0 22.0 0.0 7.0 2.0 0.0 6.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 8.0 0.0 0.0 14.0 2.0 25.0 0.0 12.0 0.0 7.0 24.0 6.0 2.0 8.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 6.0 0.0 0.0 14.0 0.0 0.0 17.0 0.0 0.0 0.0
P = [0 3.75; 0 6.25; 0 8.75; 7.25 11; 8.5 9.75; 8.5 2.25; 6 0]
dist2(x,P) = sqrt.(sum((x' .- P).^2, dims = 2))
D = vcat([dist2(P[i,:],P) for i in 1:size(P,1)]'...)
7×7 Matrix{Float64}: 0.0 2.5 5.0 10.253 10.4043 8.63134 7.07549 2.5 0.0 2.5 8.66747 9.19239 9.39415 8.66386 5.0 2.5 0.0 7.59111 8.55862 10.7005 10.6095 10.253 8.66747 7.59111 0.0 1.76777 8.83883 11.0708 10.4043 9.19239 8.55862 1.76777 0.0 7.5 10.0654 8.63134 9.39415 10.7005 8.83883 7.5 0.0 3.36341 7.07549 8.66386 10.6095 11.0708 10.0654 3.36341 0.0
using Random
Random.seed!(2342)
n = size(W,1)
α = randperm(n) # Construction procedure for QAP
7-element Vector{Int64}: 5 3 1 4 7 6 2
sdpi(α,W,D)
1255.4944111874643: [5, 3, 1, 4, 7, 6, 2] 1152.9991664443612: [3, 5, 1, 4, 7, 6, 2] 1122.9479368175016: [6, 5, 1, 4, 7, 3, 2] 1098.932213044291: [6, 5, 1, 7, 4, 3, 2] 1098.932213044291: [6, 5, 1, 7, 4, 3, 2]
(TC = 1098.932213044291, α = [6, 5, 1, 7, 4, 3, 2], cnt = 85)
sdpi(randperm(n),W,D)
1562.8851086708348: [7, 3, 6, 1, 4, 5, 2] 1367.6111038845675: [7, 3, 6, 1, 5, 4, 2] 1109.9684166763136: [7, 3, 4, 1, 5, 6, 2] 1081.760351370224: [3, 7, 4, 1, 5, 6, 2] 1029.1206697560654: [3, 4, 7, 1, 5, 6, 2] 1029.1206697560654: [3, 4, 7, 1, 5, 6, 2]
(TC = 1029.1206697560654, α = [3, 4, 7, 1, 5, 6, 2], cnt = 106)
Top 10 layouts out of 7! = 5,040 total layouts, where each corresponds to a local optimal solution found by using SDPI:
Finally, a multistart procedure can be used to find the best of n
runs of sdpi
:
function multisdpi(W,D)
TCᵒ, αᵒ, n, cnt = Inf, [], size(W,1), 0
for i = 1:n
res = sdpi(randperm(n),W,D)
cnt += res.cnt
if res.TC < TCᵒ
(TCᵒ,αᵒ) = res
end
println("\n(",i,") ",res.TC,": ",res.α,"\n")
end
return (TC = TCᵒ, α = αᵒ, cnt = cnt)
end
Random.seed!(3242)
multisdpi(W,D)
1581.9485703229298: [2, 7, 3, 6, 5, 4, 1] 1376.9330930894175: [2, 7, 3, 1, 5, 4, 6] 1172.1095196161912: [4, 7, 3, 1, 5, 2, 6] 1085.3107995687442: [7, 4, 3, 1, 5, 2, 6] 1035.4524346477785: [3, 4, 7, 1, 5, 2, 6] 1029.1206697560654: [3, 4, 7, 1, 5, 6, 2] 1029.1206697560654: [3, 4, 7, 1, 5, 6, 2] (1) 1029.1206697560654: [3, 4, 7, 1, 5, 6, 2] 1489.8753286214676: [3, 7, 4, 1, 2, 6, 5] 1081.760351370224: [3, 7, 4, 1, 5, 6, 2] 1029.1206697560654: [3, 4, 7, 1, 5, 6, 2] 1029.1206697560654: [3, 4, 7, 1, 5, 6, 2] (2) 1029.1206697560654: [3, 4, 7, 1, 5, 6, 2] 1554.2721530109475: [2, 4, 1, 7, 3, 6, 5] 1229.4658013734322: [5, 4, 1, 7, 3, 6, 2] 1145.3925680365944: [5, 1, 4, 7, 3, 6, 2] 1087.7721363602977: [5, 1, 7, 4, 3, 6, 2] 1069.376320423777: [5, 1, 7, 4, 3, 2, 6] 1069.376320423777: [5, 1, 7, 4, 3, 2, 6] (3) 1069.376320423777: [5, 1, 7, 4, 3, 2, 6] 1459.2775589172322: [5, 3, 6, 7, 2, 4, 1] 1229.4900696793095: [5, 1, 6, 7, 2, 4, 3] 1176.1707138496902: [5, 1, 7, 6, 2, 4, 3] 1132.1021943864253: [5, 1, 7, 6, 2, 3, 4] 1111.6113992764062: [7, 1, 5, 6, 2, 3, 4] 1111.6113992764062: [7, 1, 5, 6, 2, 3, 4] (4) 1111.6113992764062: [7, 1, 5, 6, 2, 3, 4] 1615.3523372826019: [5, 2, 7, 6, 3, 1, 4] 1293.7437797238326: [5, 1, 7, 6, 3, 2, 4] 1069.376320423777: [5, 1, 7, 4, 3, 2, 6] 1069.376320423777: [5, 1, 7, 4, 3, 2, 6] (5) 1069.376320423777: [5, 1, 7, 4, 3, 2, 6] 1529.9808383940517: [5, 2, 1, 7, 3, 6, 4] 1229.4658013734322: [5, 4, 1, 7, 3, 6, 2] 1145.3925680365944: [5, 1, 4, 7, 3, 6, 2] 1087.7721363602977: [5, 1, 7, 4, 3, 6, 2] 1069.376320423777: [5, 1, 7, 4, 3, 2, 6] 1069.376320423777: [5, 1, 7, 4, 3, 2, 6] (6) 1069.376320423777: [5, 1, 7, 4, 3, 2, 6] 1485.577494203347: [7, 6, 3, 4, 5, 1, 2] 1283.5742058592323: [7, 4, 3, 6, 5, 1, 2] 1087.7686309323433: [7, 4, 3, 1, 5, 6, 2] 1029.1206697560654: [3, 4, 7, 1, 5, 6, 2] 1029.1206697560654: [3, 4, 7, 1, 5, 6, 2] (7) 1029.1206697560654: [3, 4, 7, 1, 5, 6, 2]
(TC = 1029.1206697560654, α = [3, 4, 7, 1, 5, 6, 2], cnt = 658)