Artikel getaggt mit ‘Network Flows’

September 24th, 2013

pyMCFsimplex – Efficient Solving of Minimum Cost Flow Problems in Python

pyMCFsimplex – a Python Wrapper for MCFSimplex

pyMCFimplex is a Python-Wrapper for the C++ MCFSimplex Solver Class from the Operations Research Group at the University of Pisa. MCFSimplex is a piece of software hat solves big sized Minimum Cost Flow Problems very fast through the (primal or dual) network simplex algorithm. Also important: MCFSimplex is open source software – it has been released under the LGPL license.

To get a feeling for the performance of MCFSimplex: it solves a MCFP of 50.000 nodes and 500.000 arcs in 5 seconds on a average linux box (2,2Ghz 6GB RAM, Ubuntu 64Bit). Another one: NetworkX (a pure python graph library) also has an implementation of the network simplex algorithm – but this one is magnitudes slower than e.g. MCFSimplex. I tried so solve the same MCFP (n = 50.000, m = 500.000) with NetworkX and terminated the process after 1 hour. I intend to publish more on performance tests in an upcoming article.
See also this paper for a MCFP benchmarks.

You may ask: So what about pyMCFsimplex? Well, you’ll get the performance of a well coded C++ MCFP solver with the simplicity of a Python API. Isn’t this nice?

weiterlesen »

2 Leute mögen diesen Artikel.

April 10th, 2013

Solving the Minimum Cost Flow Problem (6) – Google or-tools

Push-relabel algorithm: Google or-tools

At last we will solve the instance of a Minimum cost flow problem described in (1) with Google or-tools. Node 1 is the source node, nodes 2 and 3 are the transshipment nodes and node 4 is the sink node.


So let’s see how the python API of google or-tools work:

# import graph from google or-tools
from graph import pywrapgraph

# Create graph
# Attention: add +1 for number of nodes and arcs, if your node-IDs begin with 1
# because google or-tools mcf solver internal counters are strictly zero-based!
num_nodes = 4 + 1
num_arcs = 5 + 1
# args: NumNodes * NumArcs
G = pywrapgraph.StarGraph(num_nodes, num_arcs)

# create min cost flow solver 
min_cost_flow = pywrapgraph.MinCostFlow(G)

# add node to graph with positive supply for each supply node 
min_cost_flow.SetNodeSupply(1, 4)
# add node to graph with negative demand for each demand node 
min_cost_flow.SetNodeSupply(4, -4)
# you can ignore transshipment nodes with zero supply when you are working with
# the mcfp-solver of google or-tools

# add arcs to the graph
arc = G.AddArc(1, 2)
min_cost_flow.SetArcCapacity(arc, 4)
min_cost_flow.SetArcUnitCost(arc, 2)
arc = G.AddArc(1, 3)
min_cost_flow.SetArcCapacity(arc, 2)
min_cost_flow.SetArcUnitCost(arc, 2)
arc = G.AddArc(2, 3)
min_cost_flow.SetArcCapacity(arc, 2)
min_cost_flow.SetArcUnitCost(arc, 1)
arc = G.AddArc(2, 4)
min_cost_flow.SetArcCapacity(arc, 3)
min_cost_flow.SetArcUnitCost(arc, 3)
arc = G.AddArc(3, 4)
min_cost_flow.SetArcCapacity(arc, 5)
min_cost_flow.SetArcUnitCost(arc, 1)

# solve the min cost flow problem
# flowDict contains the optimized flow
# flowCost contains the total minimized optimum
flowCost = min_cost_flow.GetOptimalCost()

print "Optimum: %s" %flowCost

As one can see the python API of google or-tools is also pretty handy although everything is coded in C++. The python API was created using SWIG. The biggest caveat one can run into is the fact that the MCFP solver in google or-tools is strictly zero based. So adding nodes starting from 1 ..n result in strange errors if you don’t size the initial graph properly: just add +1 to the number of nodes and the number of arcs if your node ids start with 1.

Alternatively you can pass the proper number of nodes and arcs at the graph initialization and substract -1 from all of your node ids so they are zero based. But be careful – any arc is made of the node ids so you’ll have to substract -1 at the arcs level also.

Be the first to like.

April 10th, 2013

Solving the Minimum Cost Flow problem (5) – NetworkX

Network Simplex Solver: NetworkX

We will solve the instance of a Minimum cost flow problem described in (1) with NetworkX. Node 1 is the source node, nodes 2 and 3 are the transshipment nodes and node 4 is the sink node.


Solving a Minimum Cost Flow Problem with NetworkX is pretty straight forward.

# import networkx
import networkx as nx

# create directed graph
G = nx.DiGraph()

# add node to graph with negative (!) supply for each supply node 
G.add_node(1, demand = -4)

# you can ignore transshipment nodes with zero supply when you are working with the mcfp-solver 
# of networkx
# add node to graph with positive (!) demand for each demand node
G.add_node(4, demand = 4)

# add arcs to the graph: fromNode,toNode,capacity,cost (=weight)
G.add_edge(1, 2, capacity = 4, weight = 2)
G.add_edge(1, 3, capacity = 2, weight = 2)
G.add_edge(2, 3, capacity = 2, weight = 1)
G.add_edge(2, 4, capacity = 3, weight = 3)
G.add_edge(3, 4, capacity = 5, weight = 1)

# solve the min cost flow problem
# flowDict contains the optimized flow
# flowCost contains the total minimized optimum
flowCost, flowDict = nx.network_simplex(G)

print "Optimum: %s" %flowCost  #should be 14

As you can see the code is very readable and easy to understand. The only thing you could get trouble with in the formulation of a min cost flow problem with networkx is the fact that the supply nodes get a negative supply value (because the python attribute is called „demand“) while the demand nodes require positive demand values.
This differs from the usual mathematical notation of supply and demand values in the network flow problems where supply values are positive and demand values of negative sign.
Also note that you only need to add supply and demand nodes to the graph. You don’t have to care of the transshipment nodes.

4 Leute mögen diesen Artikel.

April 6th, 2013

Solving the Minimum Cost Flow problem (4) – PuLP

Linear program solvers: PuLP

We will solve the instance of a Minimum cost flow problem described in (1) now with another linear program solver: PuLP. Node 1 is the source node, nodes 2 and 3 are the transshipment nodes and node 4 is the sink node.


While lpsolve has this nice feature of reading DIMACS network flow problems, PuLP has nothing comparable to offer. So we have to transform the whole network flow problem into a plain linear program on our own.

At the PuLP wiki I found a sample for solving the transshipment problem with PuLP. Actually these code snippets work also for the minimum cost flow problem but in the original code there is a typo:


for n in Nodes:
    prob += (supply[n]+ lpSum([route_vars[(i,j)] for (i,j) in Arcs if j == n]) >=
             demand[n]+ lpSum([route_vars[(i,j)] for (i,j) in Arcs if i == n])), \
            "Steel Flow Conservation in Node:%s"%n


for n in Nodes:
    prob += (supply[n]+ lpSum([route_vars[(i,j)] for (i,j) in Arcs if j == n]) >=
             demand[n]+ lpSum([route_vars[(i,j)] for (i,j) in Arcs if i == n])), \
            "Steel Flow Conservation in Node %s"%n

The problem in the original code is the colon (‚:‘) at 'Steel Flow Conservation in Node:%s"%n'. This causes the generation of an incorrect *.lp file that is generated when you use PuLP. This *.lp file will be sent to a solver package of your (or PuLP’s) choice. So just remove the colon in this line.

Here we go with the full working code for our sample instance of the MCFP:

Minimum Cost Flow Problem Solver with PuLP
Adapted from:
The American Steel Problem for the PuLP Modeller
Authors: Antony Phillips, Dr Stuart Mitchell   2007


# Import PuLP modeller functions
from pulp import *

# list of nodes
nodes = [1,2,3,4]

# supply or demand of nodes
            #NodeID : [Supply,Demand]
nodeData = {1:[4,0],

# arcs list
arcs = [ (1,2),

# arcs cost, lower bound and capacity
            #Arc : [Cost,MinFlow,MaxFlow]
arcData = { (1,2):[2,0,4],
            (3,4):[1,0,5] }

# Splits the dictionaries to be more understandable
(supply, demand) = splitDict(nodeData)
(costs, mins, maxs) = splitDict(arcData)

# Creates the boundless Variables as Integers
vars = LpVariable.dicts("Route",arcs,None,None,LpInteger)

# Creates the upper and lower bounds on the variables
for a in arcs:
    vars[a].bounds(mins[a], maxs[a])

# Creates the 'prob' variable to contain the problem data    
prob = LpProblem("Minimum Cost Flow Problem Sample",LpMinimize)

# Creates the objective function
prob += lpSum([vars[a]* costs[a] for a in arcs]), "Total Cost of Transport"

# Creates all problem constraints - this ensures the amount going into each node is 
# at least equal to the amount leaving
for n in nodes:
    prob += (supply[n]+ lpSum([vars[(i,j)] for (i,j) in arcs if j == n]) >=
             demand[n]+ lpSum([vars[(i,j)] for (i,j) in arcs if i == n])), \
            "Flow Conservation in Node %s"%n

# The problem data is written to an .lp file

# The problem is solved using PuLP's choice of Solver

# The optimised objective function value is printed to the screen    
print "Total Cost of Transportation = ", value(prob.objective)

The code is well commented. For a very detailed description have a look at the wiki of PuLP. Because there is no special network component in the PuLP modeler – any problem to solve is a linear program. So you have to write any instance of a MCFP as a plain LP which is not easy to understand at first glance.

3 Leute mögen diesen Artikel.

April 6th, 2013

Solving the Minimum Cost Flow problem (3) – lpsolve

Linear program solvers: Lpsolve

In the following articles we will solve the instance of a Minimum cost flow problem described in (1). Node 1 is the source node, nodes 2 and 3 are the transshipment nodes and node 4 is the sink node.


As lpsolve can read the DIMACS file format we don’t have to transform a graph structure into a plain linear program. The XLI (eXtensible language interface) does this for us.
A good description of the DIMACS Minimum cost flow problem file format can be found here.
So for solving our MCFP we just have to write down the MCFP in DIMACS file format with a text editor of our choice (I called the file ’‘) and let lpsolve do the dirty work:

c Problem line (nodes, links)
p min 4 5
c Node descriptor lines (supply+ or demand-)
n 1 4
n 4 -4
c Arc descriptor lines (from, to, minflow, maxflow, cost)
a 1 2 0 4 2
a 1 3 0 2 2
a 2 3 0 2 1
a 2 4 0 3 3
a 3 4 0 5 1

Here is the python code for solving a minimum cost flow problem in DIMACS file format with the python API of lpsolve:

from lpsolve55 import *

# full file path to DIMACS file
path = ''

# read DIMACS file and create linear program
lp = lpsolve('read_XLI', 'xli_DIMACS', path)

# solve linear program
lpsolve('solve', lp)

# get total optimum
optimum = lpsolve('get_objective', lp)

print optimum #should be 14

When you are working with the python API of lpsolve you’ll always have to call the function lpsolve() and pass the C method you want to access as first argument as string, e.g. lpsolve(’solve‘, lp). So obviously this API is a very thin layer upon the C library underneath.

The usage of the python API of lpsolve is not really pythonic because lpsolve is written in C – but hey: it works!  The complete list of functions can be found here.

Be the first to like.

März 31st, 2013

Solving the Minimum Cost Flow Problem (2) – Software

I don’t want to give you a complete overview of MCFP solvers because I just dipped into the world of linear and network flow programming. Instead of this I just want to mention a few MCFP solvers that I have tested. Python will be the programming language for the test – so I checked solvers with the corresponding language bindings.

Linear program solvers

As mentioned in the previous article one can solve MCFP also as ordinary linear programs. So here are the two packages I tested.


lpsolve is a Mixed Integer Linear Programming (MILP) solver written in C released under the LGPL. You can use lpsolve on Windows and Linux systems. For windows there is a nice GUI which is called LPSolveIDE. It reads several LP file formats such as *.mpl, *.lp and also the DIMACS file format which is used for network flow problems. Furthermore it has a rich API: you can use lpsolve with .NET (there is even a lpsolve plugin for Microsofts Solver Foundation MSF!), Java, PHP and Python.


PuLP is kind of a python wrapper for various (LP) solver packages released under the MIT license:

 PuLP is an LP modeler written in python. PuLP can generate MPS or LP files and call GLPK (, COIN (, CPLEX (, and GUROBI ( to solve linear problems.


Specialized MCFP solvers

As the MCFP is a special linear program that can be solved more efficiently with algorithms that exploit the network structure you should also have a look at specialized solvers around.


NetworkX is a pure Python library that deals with graphs:

NetworkX is a Python language software package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks (

The API is not very hard to learn, it’s well documented and obviously there is a active user community. NetworkX is released under the BSD license. It offers algorithms for many graph related issues (complete list here):

  • Clustering
  • Shortest Paths (Dijkstra, A*, Bellman-Ford, Floyd-Warshall)
  • Network flows (Minimum cost flow, maximum cost flow, minimum cut, Maximum flow minimum cost)
  • Minimum spanning tree


Google or-tools are a set of tools that deal not only with graph structures and algorithms but also with various other issues related to Operations Research (OR). This package is released under the Apache 2.0 license:

This project hosts operations research tools developed at Google and made available as open source. The project contains several tools:

  • A Constraint Programming solver.
  • A wrapper around third-party linear solvers (GLPK, CLP, CBC, SCIP, Sulum).
  • Knapsack algorithms.
  • Graph algorithms (shortest paths, min-cost flow, max flow, linear sum assignment).

Everything is coded in C++ and is available through SWIG in Python, Java, and .NET (using Mono on non-Windows platforms) (

Ok that’s it for now – let’s dive into the API of the packages in the next articles..

1 andere Person mag diesen Artikel.

März 24th, 2013

Solving the Minimum Cost Flow Problem (1) – Intro

Minimum cost flow problem: a short description

The Minimum cost flow problem (MCFP) is a generalized network flow problem that describes the goal of minimizing the total cost of flow through a network made of arcs and nodes which contains source, sink and transshipment nodes.

The minimum-cost flow problem is to find the cheapest possible way of sending a certain amount of flow through a flow network. Solving this problem is useful for real-life situations involving networks with costs associated (e.g. telecommunications networks), as well as in other situations where the analogy is not so obvious, such as where to locate warehouses (

Source nodes emit flow while sink nodes absorb flow. The flow through transshipment nodes is only intermediate: inflow must be equal to outflow at these nodes: so the net flow at these nodes is zero. Additionally there are maximum flow capacities on each arc of the network, i.e. it is not allowed that the amount of flow through a certain arc exceeds the maximum capacity specified on this arc.

In the following picture node 1 has 4 units of supply (= source node with positive value), node 4 has 4 units of demand (sink node with negative value) while nodes 2 and 3 are only transshipment nodes with a supply of zero. On the arcs you find lower flow bounds, upper flow bounds (or capacities) and costs. The goal is to ship 4 units from node 1 to node 4 with the lowest costs and without violation of the flow constraints (lower and upper bounds on the arcs).

Instance of a Minimum Cost Flow Problem

Instance of a Minimum Cost Flow Problem (

The Minimum cost flow problem can be seen as a general problem from which various other network flow problems can be derived:

  • Transportation problem
  • Transshipment problem
  • Assignment problem
  • Shortest path (tree) problem

A special case of the Minimum Cost Flow Problem is the famous transportation problem (TP). In a regular TP the network lacks transshipment nodes and capacities on the arcs. So it’s a simpler version of the MCFP. The transshipment problem comes with transshipment nodes but lacks capacities on the arcs of the network. The assignment problem is a special version of the transportation problem and you can even solve a shortest path (tree) problem as a MCFP. Read more on the theoretical background here.

So in short: if you have a MCFP-Solver at hand you are able to solve various problems with this package.

The assignment problem is a special case of the transportation problem, which is a special case of the minimum cost flow problem, which in turn is a special case of a linear program. While it is possible to solve any of these problems using the simplex algorithm, each specialization has more efficient algorithms designed to take advantage of its special structure (

As stated in this quote and in previous articles, you can describe and solve a MCFP as a Linear program with an ordinary LP solver package. The simplex algorithm will likely be applied in this case. Or you can speed up things (magnitudes faster!) with a solver that is specialized on network flow problems and exploits the network structure of the MCFP: a network simplex solver or the push-relabel algorithm.

More on this in the next articles..

2 Leute mögen diesen Artikel.

Oktober 24th, 2012

Solving the Transportation Problem (3) – MSF API in .NET

Solver Foundation in C#. Net

In order to solve the above described instance of a Transporation problem we only need a C# Console Application with a few lines of code. So start your IDE (SharpDevelop or Visual Studio [Express Edition will work either]) and create a new console application project. Then add a reference to the Microsoft.Solver.Foundation.dll under „References“. This DLL should be found after the successful installation of MSF in „C:\Program Files\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.0\“. As we are using Data (DataTables, DataSets) you also need to reference „System.Data“.

The agenda for the current article is the following:

  • load the OML model into Solver Foundation Context
  • retrieve data from our sample Access Database
  • bind data to the model
  • solve the problem
  • print out minimized total result and the amount of flow of each arc

So here we go with the first section of our class.

using System;
using System.IO;
using System.Data;
using System.Data.OleDb;
using Microsoft.SolverFoundation.Services;

namespace GSharpBlog.Optimization.TP_Example 
	class Program
        /// Gets results of a SQL query through a OLE-DB connection.
        private static DataSet getDataFromDB(string connectionStr, string query)
            DataSet ds = new DataSet();
            OleDbDataAdapter adapter = new OleDbDataAdapter();
            OleDbConnection conn = new OleDbConnection(connectionStr);
            try {
                adapter.SelectCommand = new OleDbCommand(query, conn);
                return ds;
            catch (Exception ex){ throw ex; }
            finally{ conn.Close(); }

First we state a helper method for retrieving data from our Access Database, that simply takes the Connection String and the SQL query and returns a DataSet. After that we can go on with the automatically generated Main() method in which all code is written for the sake of simplicity.
In the following lines of code we define the above stated OML model as string and read it into the Solver Foundation Context. Then the structure of the supply, demand and arcs table (or OD-Matrix) are defined as DataTables.

            // Access (2007) DB ConnectionString
            string connectionStr = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=D:\tp.accdb";

			// Transportation Problem as OML model
			string strModel = @"Model[
					TotalCost == Sum[{i,Source},{j,Sink},Cost[i,j]*flow[i,j]],
                    Foreach[{i,Source}, Sum[{j,Sink},flow[i,j]]<=Supply[i]],
					Foreach[{i,Source}, Sum[{j,Sink},flow[i,j]]>=Demand[j]]],
				Goals[Minimize[TotalCost]] ]";

			// Load OML-Model
			SolverContext context = SolverContext.GetContext();
			context.LoadModel(FileFormat.OML, new StringReader(strModel));
			context.CurrentModel.Name = "Transportation Problem";

			// Create Tables
			// Supply table
			DataTable pSupply = new DataTable();

			// Demand table
			DataTable pDemand = new DataTable();

			// OD-Matrix
			DataTable pCost = new DataTable();

Then data is pulled out of our Access Database and loaded into the DataTables. Now we can use the helper method getDataFromDB().

            //// Fill tables
            // 1. Fill Supply
            string query = String.Empty;
            DataSet accessDS = new DataSet();
            query = "SELECT SupplyNode, Supply FROM Supply ORDER BY SupplyNode";
            accessDS = getDataFromDB(connectionStr, query);

            foreach (DataRow row in accessDS.Tables[0].Rows) {
                pSupply.Rows.Add(row[0].ToString(), row[1]);

            // Clear
            query = String.Empty;

            // 2.Fill Demand
            query = "SELECT DemandNode, Demand FROM Demand ORDER BY DemandNode";
            accessDS = getDataFromDB(connectionStr, query);

            foreach (DataRow row in accessDS.Tables[0].Rows) {
                pDemand.Rows.Add(row[0].ToString(), row[1]);

            // Clear
            query = String.Empty;

            // 3. Fill Arcs (or OD-Matrix)
            query = "SELECT SupplyNode, DemandNode, Cost FROM Arcs ORDER BY SupplyNode";
            accessDS = getDataFromDB(connectionStr, query);

            foreach (DataRow row in accessDS.Tables[0].Rows) {
                pCost.Rows.Add(row[0].ToString(), row[1].ToString(), row[2]);

Now we bind the data to the OML model that we have defined earlier. Microsoft Solver Foundation has an advanced binding mechanism that is pretty straight forward. With the following lines, we bind the data from our tables to the OML model: loop through the parameters in the model and simply call the method SetBinding([DataTable].AsEnumerable(), [ValueColumnName], [IDColumnName]) for each parameter.

			// Bind values from tables to parameter of the OML model
			foreach (Parameter p in context.CurrentModel.Parameters)
				switch (p.Name)
						case "Supply":
							p.SetBinding(pSupply.AsEnumerable(), "Supply", "SupplyNode");
						case "Demand":
							p.SetBinding(pDemand.AsEnumerable(), "Demand", "DemandNode");
						case "Cost":
                            p.SetBinding(pCost.AsEnumerable(), "Cost", "SupplyNode", "DemandNode");

Then we just have to call the Solve() method and fetch the result. Now we loop through the decisions of the model, look for the decision „TotalCost“ and fetch the value of the optimized result.

			// Call solver
			solution = context.Solve();

			// Fetch results: minimized total costs
			Report report = solution.GetReport();	
			double cost = 0;
			foreach (Decision desc in solution.Decisions) {
				if (desc.Name == "TotalCost")
						foreach (object[] value in desc.GetValues()) {
							cost = Convert.ToDouble(value[0]);

Because we want to know not only the optimized total costs but also the amount of flow between each supply node and demand node we have to query the solution more detailed. We get information about the source/sink-pairs about the amount of flow from MSF – but we lack the distance (or cost) between each pair. So we need to lookup the distances in the original arcs table.

// Print out optimized results
string result = String.Empty;
double totalFlow = 0.0;
foreach (Decision desc in solution.Decisions) {
	// flow as variable
	if (desc.Name == "flow")
		foreach (object[] value in desc.GetValues()) {
			string source = value[1].ToString();
			string sink =  value[2].ToString();
			double flow = Convert.ToDouble(value[0]);

			// lookup km from arcs table
			DataRow[] rows = new DataRow[1];
			rows = pCost.Select("SupplyNode ='"+source+"' AND DemandNode ='"+sink+"'");
			double km = Convert.ToDouble(rows[0]["Cost"]);
			string sourceSink = String.Format("{0}_{1}", source, sink);
			if(flow != 0)
				totalFlow += flow;
				result = result + "\n" + String.Format("\"{0}\";\"{1}\";\"{2}\";{3};{4}",
													   sourceSink, source, sink, flow, km);

The optimized result is 46. Our solver application prints out the following results.

Total result: 46


Download as IDE-Project (SharpDevelop/Visual Studio 2010)

4 Leute mögen diesen Artikel.

Oktober 24th, 2012

Solving the Transportation Problem (2) – Microsoft Solver Foundation

Microsoft Solver Foundation

Solver packages either offer APIs for various programming languages or a mathematical modeling language. And there are also some industry standard file formats of linear programs like *.MPS or *.LP which carry both the mathematical model and the data while modeling languages such as AMPL or OML allow the formulation of the model only without data. Additionally solver packages supporting a modeling language provide a binding mechanism for model and data.

Microsoft offers a Optimization programming library which is called Microsoft Solver Foundation (MSF). This product fits perfectly into the .NET programming environment and offers various solvers (e.g. linear, non-linear and goal programming), a .NET API and its own modeling language: OML.

You are allowed to download a limited copy of MSF (Express Edition). The important limitations of this copy are:

  • restricted to 50’000 variables when you’re using the MIP-Solver
  • you are allowed to install MSF on one single client only for commercial and productive use

So for prototyping you can go with the Express Edition of MSF. For professional use you would have to license the Standard or Professional Version.

Furthermore Microsoft Solver Foundation offers an interface to third-party solvers. This means you can stay in this optimization framework with the advantage of its tight integration in the .NET environment using your preferred solver – prodividing it conforms to the interface of MSF. With Solver Foundation 3.1 the following third-party-solvers are supported: CPLEX (IBM), Xpress (FICO), Gurobi, LINDO, lp_solve (Open Source), MOSEK, Ziena and Frontline Systems.

And there is a Excel-Plugin for Solver Foundation which allows you to write your models in OML, feed them with data and solve them. This is useful for fast prototyping and for testing your models – although it lacks debugging features of models  – because your can focus on data and the mathematical model and you don’t have to worry about programming in .NET. Take a look at the paper of Erwin at Amsterdam Optimization for more info about MSF, OML and the Excel-Plugin of Solver Foundation.

Hands on!

Now we’ve had the theory of the Transportation problem (1) and we have got a software package that should solve such problems.

First we need data. The simple instance of the Transportation problem in the previous article can be represented in a simple Microsoft Access Database like the this:

With the following content of its tables:

Note, that in the values in the demand table are signed positive in contrast to the common modeling convention stated in the previous article.

Now we have to build the model in our solver environment. I’m not able to give you a comprehensive introduction into the syntax of OML. Have a look at the paper of Erwin at Amsterdam Optimization for further study here.

So the ingredients for the transportation solver are: data, a model and the solver. As we’ve got data in our Access Database, we’ll have a look at the model. Erwin has transformed the mathematical formulation of the transportation problem into a OML-model which looks like this:

  TotalCost == Sum[{i,Source},{j,Sink},Cost[i,j]*flow[i,j]],
  Foreach[{i,Source}, Sum[{j,Sink},flow[i,j]]<=Supply[i]],
  Foreach[{j,Sink}, Sum[{i,Source},flow[i,j]]>=Demand[j]]],,
 Goals[Minimize[TotalCost]] ]

In this case OML is relatively close to the mathematical formulation of the problem. Just a few explanations:

  • first you define the fixed data of the model (called „parameters“ in OML) and their value domains (e.g. Reals)
  • then you define the variables (i.e. their values can be changed in the optimizatin process, called „decisions“)
    in this case the flow between source and sink nodes shall be changed until the total cost is minimized
  • our constraints of the model are
    • the supply that is being shipped from each supply node must be less or equal the given supply at each suppl node
    • the demand that shall be fulfilled at each demand node must be greater or equal the given demand at each demand node
  • our goal is the shipping of all goods from supply to demand nodes while minimizing the total shipping costs

Now there is a speciality about OML. OML in MSF 3.1 cannot handle sparse data. But in real networks sparse data is very common. In the OML model above MSF will expect all possible combinations of source and sink nodes and will throw an error if not all combinations exist in the data. There are two possibilities to get around this problem:

  • change the data with pseudo entries so it is not sparse anymore
  • change the model

The simple way to get things work is to change the data. We have to add the missing combinations (S1;D3 and S3;D1) in the arcs table and put an arbitary high amount of cost in comparison to the rest of the data (e.g. 99999) so that the solver would never take these pseudo arcs into account.

Of course under real time conditions this would not always be the way to solve the problem of sparseness. A better but more complex solution would be to change the model.

Now as we have data and the model we’re talking to the solver in the next article.

3 Leute mögen diesen Artikel.

Juli 5th, 2012

Solving the Transportation Problem (1) – Theory


Every GIS-professional has probably heard of networks (e.g. streets or rivers) and the widely applied shortest path algorithm which finds the cheapest (time, length or whatever you specify as „cost“) route from point A to B. Converting line features into some network data structure and performing shortest path calculations is a common task in the geospatial world. But in the past few months I found out that there is a lot more than this to discover!

If you want to do a little more than finding shortest routes, e.g. minimizing the overall shipping costs from several factories to several stores and satisfying both supply and demand you would not find any tool in your everyday GIS toolbox that allows you to set up such a model. Need it though? Or just interested how to solve this problem? Well read on…

Theory: Network Flows

First you’ll need a little bit of background information about network flows. We’ll leave the regular world of GIS now for a while and dive into the world of graphs and – beware! – mathemathics. Afraid? Common, it’s not too tricky..

TP Example

A simple instance of a TP

The example mentioned in the introduction is called „Transportation Problem“ (TP). This simple form of a network flow model was first described by HITCHCOCK (1941).

There are supply nodes (or sources, e.g. factories) and demand nodes (or sinks, e.g. stores) which are connected through arcs. All nodes carry data about how much flow (e.g. goods) shall be shipped resp. received. Supply is usually noted with positive numbers, demand with negatives.

Each arc is defined through its connecting nodes (e.g. Arc1: {S1,D1}, Arc2: {S1,D2} etc.). Additionally each arc has a cost attribute (e.g. km). In the simple instance on the right the cost for traversing each arc is noted in brackets (e.g Arc1: {S1,D1} : Cost 3 ). Note that the cost information is per flow, i.e. shipping one unit of goods from S1 over Arc1 to D1 costs 3 units. More important is the concept of flow at the nodes level. In Reality you ship packets of goods and not single goods. Theses packets are transfered as one flow and the shipping of one packet will cost the given cost on the certain arc.

An example from forestry will clarify the concept of flow. There is 150 cubic meters of wood at Node S1 that must be shipped. But you wouldn’t call a transportation company to ship each single cubic meter out of supply node S1. You would pack them together in packets of 30m³, because this is the maximum load for one truck. As a result the amount of flow is not 150m³ but 5 (150/30). Of course the same procedure must be considered on the demand side. Node D1 has a demand of 7 flow units, which corresponds to 7*30m³ = 210m³.

„Well“ you could say, „I know all that. Except for this flow concept – I’m familiar with arcs ’n‘ nodes and cost attributes. I’m a GIS professional! So what?“

Transportation Problem as LP

OK, now it’s time for some maths. A network flow model can be expressed as a set of mathematical equations. Often you can formulate a network flow model as a Linear program (LP). The Transportation Problem can be stated as seen on the left, which can be read as:


„Minimize the total cost of flow (x) over the arcs (ij) holding the costs (c) and respect the following:

-> flow (x) is always lower or equal the supply (s) of each supply point
-> flow (x) is always greater or equal the demand (d) of each demand point
-> flow (x) must be greater or equal zero“




The network graph of the simple instance of the TP can also be written as a matrix. Using that matrix you‘ ll understand the formulations of the sum equations better. If you write down the first sum equation for our TP instance you’ll get the following:

min = 3x11 + 1x12 + 4x21 + 2x22 + 4x23 + 3x32 + 3x33

TP as Matrix

x11 + x12 = 5
x21 + x22 = 5
x23 + x32 + x33 = 3

x11 + x21 = 7
x12 + x22 + x32 = 3
x23 + x33 = 5

xij >= 0

Now go and find the best value for x and the minimum equation and respect all other equations. This is the task of a linear program solver. In this peticular case the optimum value for the Transportation Problem is 46 (min = 46). This is the minimized overall shipping cost satisfying all demands with the supply. The value of x (flow) will vary on each arc. Of course a solver package is able to return both the minimized optimum and the amount of flow through each arc.

So this is a Linear Program which cannot be solved in a regular GIS package. There are specialized Solvers in GIS packages (e.g. the TSP-Solver in GRASS GIS or various solvers of Network Analyst in ESRI ArcGIS) but they are not able to solve a plain Linear Program such as the Transportation Problem. So you need another software package. But you’re lucky – there are many solvers available for linear programs – both commercial or free ones (list).

Now you might have an idea of what linear programming in context of networks basically is. The next challenge is to get the mathematical equations of a model into some linear solver package.



2 Leute mögen diesen Artikel.