Introduction
Every GIS-professional has probably heard of networks (e.g. streets or rivers) and the widely applied shortest path algorithm which finds the cheapest (time, length or whatever you specify as „cost“) route from point A to B. Converting line features into some network data structure and performing shortest path calculations is a common task in the geospatial world. But in the past few months I found out that there is a lot more than this to discover!
If you want to do a little more than finding shortest routes, e.g. minimizing the overall shipping costs from several factories to several stores and satisfying both supply and demand you would not find any tool in your everyday GIS toolbox that allows you to set up such a model. Need it though? Or just interested how to solve this problem? Well read on…
Theory: Network Flows
First you’ll need a little bit of background information about network flows. We’ll leave the regular world of GIS now for a while and dive into the world of graphs and – beware! – mathemathics. Afraid? Common, it’s not too tricky..
The example mentioned in the introduction is called „Transportation Problem“ (TP). This simple form of a network flow model was first described by HITCHCOCK (1941).
There are supply nodes (or sources, e.g. factories) and demand nodes (or sinks, e.g. stores) which are connected through arcs. All nodes carry data about how much flow (e.g. goods) shall be shipped resp. received. Supply is usually noted with positive numbers, demand with negatives.
Each arc is defined through its connecting nodes (e.g. Arc1: {S1,D1}, Arc2: {S1,D2} etc.). Additionally each arc has a cost attribute (e.g. km). In the simple instance on the right the cost for traversing each arc is noted in brackets (e.g Arc1: {S1,D1} : Cost 3 ). Note that the cost information is per flow, i.e. shipping one unit of goods from S1 over Arc1 to D1 costs 3 units. More important is the concept of flow at the nodes level. In Reality you ship packets of goods and not single goods. Theses packets are transfered as one flow and the shipping of one packet will cost the given cost on the certain arc.
An example from forestry will clarify the concept of flow. There is 150 cubic meters of wood at Node S1 that must be shipped. But you wouldn’t call a transportation company to ship each single cubic meter out of supply node S1. You would pack them together in packets of 30m³, because this is the maximum load for one truck. As a result the amount of flow is not 150m³ but 5 (150/30). Of course the same procedure must be considered on the demand side. Node D1 has a demand of 7 flow units, which corresponds to 7*30m³ = 210m³.
„Well“ you could say, „I know all that. Except for this flow concept – I’m familiar with arcs ’n‘ nodes and cost attributes. I’m a GIS professional! So what?“
OK, now it’s time for some maths. A network flow model can be expressed as a set of mathematical equations. Often you can formulate a network flow model as a Linear program (LP). The Transportation Problem can be stated as seen on the left, which can be read as:
„Minimize the total cost of flow (x) over the arcs (ij) holding the costs (c) and respect the following:
-> flow (x) is always lower or equal the supply (s) of each supply point
-> flow (x) is always greater or equal the demand (d) of each demand point
-> flow (x) must be greater or equal zero“
The network graph of the simple instance of the TP can also be written as a matrix. Using that matrix you‘ ll understand the formulations of the sum equations better. If you write down the first sum equation for our TP instance you’ll get the following:
min = 3x11 + 1x12 + 4x21 + 2x22 + 4x23 + 3x32 + 3x33
x11 + x12 = 5
x21 + x22 = 5
x23 + x32 + x33 = 3
x11 + x21 = 7
x12 + x22 + x32 = 3
x23 + x33 = 5
xij >= 0
Now go and find the best value for x and the minimum equation and respect all other equations. This is the task of a linear program solver. In this peticular case the optimum value for the Transportation Problem is 46 (min = 46). This is the minimized overall shipping cost satisfying all demands with the supply. The value of x (flow) will vary on each arc. Of course a solver package is able to return both the minimized optimum and the amount of flow through each arc.
So this is a Linear Program which cannot be solved in a regular GIS package. There are specialized Solvers in GIS packages (e.g. the TSP-Solver in GRASS GIS or various solvers of Network Analyst in ESRI ArcGIS) but they are not able to solve a plain Linear Program such as the Transportation Problem. So you need another software package. But you’re lucky – there are many solvers available for linear programs – both commercial or free ones (list).
Now you might have an idea of what linear programming in context of networks basically is. The next challenge is to get the mathematical equations of a model into some linear solver package.