Net Mod 2: Shortest Paths and CPM

Packages Used: Functions from the following packages are used in this notebook for the first time:

1. Network Representations

The terms network and graph can be used interchangeably. The terms graph/vertex/edge are typically used when the focus is on properties of the mathematical object, and the terms network/node/arc are used when the focus is on the properties of a real-work system. The terms graph and digraph (short for directed graph) are used in mathematics to describe networks in which the arcs are either all undirected or directed, respectively. In many network applications, both directed and undirected arcs are used (e.g., one- and two-way roads), and the arcs have associated weights that represent, for example, the distance or travel time of the road being represented.

Network Graph Digraph
Node Vertex Vertex
Arc (undirected/directed) Edge (undirected) Edge (directed)

In Julia, both graphs and digraphs can be created. If a network with a mix of directed and undirected arcs is needed, then a digraph can be used, with two directed arcs in opposite directions used to represent an undirected arc.

There are different network representations that make it easier to operate on the network for specific applications, and it is usually possible to translate from one representation to another. Given a network with $m$ nodes and $n$ arcs:

Ex 1: Digraph (complete bipartite digraph)

image.png

Interlevel to Adjacency: Embed the $m \times n$ interlevel matrix C into a $(m + n) \times (m + n)$ adjacency matrix:

The function SimpleWeightedDiGraph is used to create a directed network using an adjacency matrix, where each element ( i, j ) in the matrix represents an arc directed from a starting node ( i ) to an ending node ( j ):

Since the adjacency matrix is stored as a sparse matrix, covert it to a regular (dense) matrix for display purposes:

An array comprehension can be used to iterate over all of the edges in the graph:

Problem: Arc 2 => 4 of 0 weight not included

Adjacency to Arc list: Iterate over the edge and weight data in the graph to create the arc lists:

Adjacency/Arc list to Incidence matrix: Output the incidence matrix from the graph, create either from the adjacency matrix or arc lists:

Since the incidence matrix representation does not include arc weight information, the weights need to be determined separately:

Ex 2: (Undirected) Graph:

image.png

The only change is that each arc in the network is undirected from the starting node ( i ) to the ending node ( j ). The function SimpleWeightedGraph is used to create an undirected network, as opposed to using SimpleWeightedDiGraph to create a directed network:

2. Dijkstra's Shortest Path Procedure

Dijkstra's procedure is used to find the shortest path in a network. It is a greedy procedure that can quickly find the (global) optimal solution.

Ex 3: Undirected Network

Determine the shortest path from node 1 to node 6 in this network:

image.png

image.png

image.png

image.png

Pred: $\quad 0 - 3 - 1 - 2 - 4 - 5 $

Path: $\quad 1 \leftarrow 3 \leftarrow 2 \leftarrow 4 \leftarrow 5 \leftarrow 6 $

Cost: $\quad 13$

The network can be created by defining the starting and ending nodes for each arc along with each arc's weight:

Since the location of each node in the network is not specified, the function gplot can be used to try to find a "good" set of node locations. The procedure used starts with a random set of locations for the nodes and then tries to improve it. As a result, it is good to try a variety of different random seed values to find a good layout of the network.

Using Dijkstra's procedure from LightGraphs (very efficient compared to the code example that follows):

To show an implementation of Dijkstra's procedure, we will operate on the adjacency matrix representation of the network. Only inputs are the adjacency matrix and the starting node.

Dijkstra's Shortest Path Procedure:

Ex 4: Directed Network

The only change is that each arc in the network is directed from the starting node ( i ) to the ending node ( j ). The function SimpleWeightedDiGraph is used to create a directed network, as opposed to using SimpleWeightedGraph to create an undirected network:

Ex 5: Road Distance from Raleigh to Portland

The files _NetMod-2-Data-node.csv and _NetMod-2-Data-arc.csv contain 862-node and 1223-arc data, respectively, for the U.S. Interstate road network. Each node is defined by its longitude ( x ) and latitude ( y ), and each arc is defined by its starting node index ( i ), ending node index ( j ), and its distance ( d ) in miles. The network is undirected because all of the arcs can be used for two-way travel.

image.png

The network can be used to find the shortest distance form Raleigh, NC (node 565) to Portland, OR (node 629):

Using Google Maps, the distance of the shortest travel time between Raleigh and Portland was found to be 2,870 miles, as compared to 2,937. (One reason for the difference could be new Interstate segments that are in Google and are not in the older road data in _NetMod-2-Data.xlsx.)

image.png

Since the location of each node in the network is specified, the function plot can be used to display the network instead of using gplot:

Ex 6: Scheduling with Alternate Routings

In this example, eight different machines are available for the production of a product. There is some flexibility in the set of machines that can be used to perform all of the operations needed to produce the product, and so the product can be produced with several alternate machine routings. Machine 1 is required to be used for the first set of operations and will require two hours to complete (due to processing time and estimated waiting time at the machine). After Machine 1, either Machine 2 or Machine 3 can be used for the next operations, each machine requiring different times because both the number and type of operations are different and the waiting times are different. For each machine, the following table lists its predecessor(s) and time. In the case of more than a single predecessor, either machine listed is a feasible predecessor in a routing.

Machine │ Predecessor   Time
────────┼────────────────────
      1 │ 0                2
      2 │ 1                3
      3 │ 1                5
      4 │ 2                4
      5 │ 2                1
      6 │ [3, 5]           4
      7 │ 4                3
      8 │ [6, 7]           2

A directed network can be used to indicate the predecessor relationships, where the weight of each arc is the time associated with the arc's starting (or outgoing) node:

Machine 8 is the last machine in the network, and as a result, its time has not been used as the weight on an outgoing arc from node 8. An additional node (9) can be added to the network to serve as the ending node for an arc from node 8 with a weight equal to Machine 8's time:

Displaying and then determining the shortest path from node 1 to 9 results in a routing (or path) that visits machines 1, 2, 5, 6, and 8 at a cost of 12 time units:

3. Critical Path Method (CPM)

The critical path method (CPM) is a procedure for scheduling the activities of a project. Unlike scheduling with alternate routings (Ex 4), all of the predecessors of a node must be complete before the activity represented by the node can occur. As a result, the minimum time to complete all of the activities represented in a network corresponds to the longest path through the network. The activities along this path are the critical activities that, if delayed, would delay the entire project. The longest path can be determined by finding the shortest path through the network using the negative of all arc times.

Ex 7: House Construction Time

The table created below shows the activities associated with the construction of a house. Listed for each activity are the activities that must immediately precede the start of the activity and the time required to complete the activity (in working days). Estimate how many days will be required to construct the house.

Activity 1 is the single initial activity because it is the only activity without any predecessors. Activities 13 and 14 both represent final activities because they are not the predecessors of any other activities. A directed network can be used to indicate the predecessor relationships, where the weight of each arc is the time associated with the arc's starting (or outgoing) node:

Nodes 13 and 14 represent final activities, and, as a result, their times have not been used as the weight on any outgoing arcs. An additional node (15) can be added to the network to serve as the ending node for arcs from nodes 13 and 14 with weights equal to their activity times:

Creating the directed graph, displaying it, and then negativing the arc times so that the longest path from node 1 to 15 can be determined results in a critical path of activities in the construction of the house that will require a minimum of 44 working days to complete:

Issue: Multiple Initial Activities

When there is a single initial activity, it can be used as the first node of the critical path. If there are multiple initial activities, a new single dummy initial activity can be used as the predecessor for each, and the arc connecting the dummy node to the real initial nodes can have a time of 0. For example, if an additional initial activity 15 (Curb cut) has no predecessor, takes one day, and is the predecessor of activity 2, dummy activity 16 can be added as its predecessor and the predecessor of activity 1:

Determine Critical Path
Determine Earliest Start-Finish and Latest Start-Finish for Activities

A forward breadth-first search starting with the initial node can be used to determine the earliest start time (es) for each activity, which, after adding the time of each activity, gives the earliest finish time (ef). Similarly, a backward breadth-first search starting with the final node can be used to determine the latest finish time (lf) for each activity, which, after subtracting the time of each activity, gives the latest start time (ls).

Activities that are on the critical path can be identified by having identical earliest and latest start times. Any positive difference between the latest and earliest start time of an activity represents the slack available in starting the activity.

Ex 8: Another CPM Example

Activity 1 requires three days to complete and can start after activities 2, 3, and 4 are complete. Activity 2 requires four days and can begin after activity 5, and activity 3 takes nine days and can begin after both activities 5 and 6 are complete. Activities 4, 5, 6, and 7 each take seven, 12, six, and four days and each can begin after activities 7, 8, 9, and 9 are complete, respectively, and activities 8 and 9 each take ten and eight days to complete, respectively, and can begin at any time. Determine the minimum number of days required to complete all of the activities.

Data Input
Add Inital and Final Activities
Determine Critical Path
Report Activity Slack

4. Discussion

Dijkstra

The Dijkstra procedure requires that arcs have all nonnegative lengths on any cycles in the network:

Floyd-Warshall

Networks with only some negative arcs on a cycle require slower “label correcting” procedures that repeatedly check for optimality at all nodes or detect a negative cycle.

A* algorithm

A* algorithm adds to Dijkstra a heuristic LB estimate of each node's remaining distance to the destination:

image.png

Time Complexity of Shortest Path Procedures

Procedure Problem Time Complexity
Simplex LP $O(2^m)$
Ellipsoid LP $O(m^4)$
Hungarian Transportation $O(m^3)$
Floyd-Warshall Shortest Path with Cycles $O(m^3)$
Dijkstra (linear min) Shortest Path without Cycles $O(m^2)$
Dijkstra (Fibonocci heap) Shortest Path without Cycles $O(n\,\mbox{log}\,m)$
Number of nodes $m$
Number of arcs $n$

The following compares the time complexity of determining the road distance from Raleigh, NC to Portland, OR: