Quick fixes to stop ignoring your builds

A printout of a team dashboard on a team board

In software development, quick feedback is fundamental, and continuously building, testing and deploying is part of our standard toolkit.

But it isn’t enough to set up and run these processes; we need to pay attention to the results. If we can’t successfully build, test and deploy our software, then we have no idea whether we’re building it right, and we risk releasing broken components. It saddens me how often I see teams that have become completely inured to failures, and the result is a predictable decline in quality and drop in team effectiveness.

Here then are a few quick and easy techniques to start paying attention to these failures.

A taste of graph theory

Postcard of buildings by a river. Caption reads ‘Königsberg Schlossteich’

I’ve recently been working with graph databases, which give us a powerful idiom for modelling and reasoning with highly interconnected data. Before I share some of my experiences, I would like to set the scene with a basic introduction to graph theory.

Simple Graphs

Graph theory is a relatively young branch of mathematics, traced back to 1736 and the work of Leonhard Euler.

A simple graph is defined as a non-empty set of vertices, e.g. V = {1,2,3}, and a set of edges, each of which is a 2-member subset of the vertex set, e.g. E = {{1,2},{1,3},{2,3}}.

We can visualise a graph by drawing a diagram:

simple three-vertex graph drawn as a triangle

It’s tempting, particularly for the etymologically minded, to think of a graph as a drawing. It’s certainly easier to think about graphs by visualising them, but drawings are only representations of graphs and can be misleading, so we need to exercise caution.

For example, the same graph can be drawn like this:

simple three-vertex graph drawn with curved edges and four crossings

Notice how this diagram shows four crossings, but is in fact equal to the above graph, which shows none.

Because simple graphs are defined in terms of sets, we can note some key characteristics:

  • Each vertex only appears once. V = {1,1,2,3} is not a valid set because 1 = 1.
  • An edge cannot join a vertex to itself. {1,1} is not a valid set, so it cannot be a valid element in the edge set.
  • There can only be one edge between two particular vertices. E = {{1,2},{1,2},{2,3},{1,3}} is not a valid set because {1,2} = {1,2}.
  • The edges have no direction. E = {{1,2},{2,1},{2,3},{1,3}} is not a valid set because {1,2} = {2,1}.
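These characteristics can be checked with a minimal Python sketch that uses the language's built-in sets. The names `vertices`, `edges` and `is_simple_graph` are illustrative choices, not part of any library; `frozenset` is used for edges only because Python requires set elements to be hashable.

```python
# A simple graph as plain Python sets (illustrative names).
vertices = {1, 1, 2, 3}          # the duplicate 1 collapses into one element
edges = {
    frozenset({1, 2}),
    frozenset({1, 2}),           # duplicate edge collapses
    frozenset({2, 1}),           # same edge again: sets have no order
    frozenset({2, 3}),
    frozenset({1, 3}),
}


def is_simple_graph(vertices, edges):
    """Check the definition: non-empty V, every edge a 2-member subset of V."""
    return bool(vertices) and all(len(e) == 2 and e <= vertices for e in edges)


print(sorted(vertices))                  # [1, 2, 3]
print(len(edges))                        # 3
print(len(frozenset({1, 1})))            # 1, so {1,1} cannot be an edge
print(is_simple_graph(vertices, edges))  # True
```

Because the set type enforces uniqueness and ignores order, the four restrictions fall out of the representation rather than needing separate checks.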

Useful variations

These restrictions are great for pure mathematics, but they limit our ability to model real-world situations. For this reason, a typical graph database relaxes the rules in various ways:


It introduces a notion of direction in the edges. This means we are now dealing with directed graphs or digraphs.

We can now model a graph V = {1,2,3}, E = {(1,2),(1,3),(2,3)}:

three-vertex digraph drawn as triangle


We can allow skeins: multiple edges between vertices. If we replace any edge of a graph with a skein, then we have a multigraph, and our edge set becomes a multiset, as it may contain duplicated elements.

Here we expand the basic graph V = {1,2}, E = {{1,2}} by replacing the edge {1,2} with a skein of three edges, giving V = {1,2}, E = {{1,2},{1,2},{1,2}}:

two-vertex simple graph drawn as line


two-vertex multigraph with three edges

We can also model loops by allowing single-member sets as elements of E, e.g. V = {1}, E = {{1}}:

one-vertex loop


We can also allow directed multigraphs, also known as multidigraphs or quivers.
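One way to picture these variations is to swap the data structure used for the edge collection. The sketch below is an assumption about representation, not how any particular graph database stores its data: tuples stand in for directed edges, and `collections.Counter` stands in for a multiset. All variable names are illustrative.

```python
from collections import Counter

# Directed graph: edges are ordered pairs, so (1, 2) and (2, 1) are distinct.
digraph_edges = {(1, 2), (1, 3), (2, 3)}

# Multigraph: a Counter acts as a multiset, mapping each edge to its multiplicity.
multigraph_edges = Counter({frozenset({1, 2}): 3})

# Loop: a single-member set joins vertex 1 to itself.
loop_edges = Counter({frozenset({1}): 1})

# Quiver (directed multigraph): ordered pairs with multiplicities.
quiver_edges = Counter([(1, 2), (1, 3), (1, 3), (4, 4)])

print((2, 1) in digraph_edges)               # False
print(multigraph_edges[frozenset({1, 2})])   # 3
print(loop_edges[frozenset({1})])            # 1
print(quiver_edges[(1, 3)])                  # 2
```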

Degrees of vertices

The degree of a vertex, deg v, is the number of edges attached to it. In a simple graph deg v = |{e : e ∈ E, v ∈ e}|. Visually you can find the degree of a vertex by counting the edges that connect to it.

The indegree of a vertex, deg⁻ v, is the number of edges reaching it, and the outdegree of a vertex, deg⁺ v, is the number of edges leaving it. If we model the directed edges as tuples, then deg⁻ v = |{e : e ∈ E, ∃x e = (x,v)}| and deg⁺ v = |{e : e ∈ E, ∃y e = (v,y)}|. Visually you can find the indegree of a vertex by counting the arrows that point to it, and the outdegree by counting the arrows that leave it.
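The degree definitions translate almost directly into code. This is a minimal sketch using the usual convention that the indegree counts arrows arriving at a vertex and the outdegree counts arrows leaving it; the function names and the two small example graphs are illustrative.

```python
def degree(v, edges):
    """Number of undirected edges (frozensets) that contain v."""
    return sum(1 for e in edges if v in e)


def indegree(v, arcs):
    """Number of directed edges (tail, head) whose head is v."""
    return sum(1 for (_, head) in arcs if head == v)


def outdegree(v, arcs):
    """Number of directed edges (tail, head) whose tail is v."""
    return sum(1 for (tail, _) in arcs if tail == v)


simple_edges = {frozenset({1, 2}), frozenset({1, 3}), frozenset({2, 3})}
arcs = {(1, 2), (1, 3), (2, 3)}

print(degree(1, simple_edges))   # 2
print(indegree(3, arcs))         # 2
print(outdegree(1, arcs))        # 2
print(indegree(1, arcs))         # 0
```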


The power of graphs to model connected data arises when we start walking our graphs.

Mathematically, a walk is a sequence of vertices (v₁, v₂, …, vₙ₋₁, vₙ) where each vertex vₓ is a member of the graph’s vertex set, and each pair of vertices (vₓ, vₓ₊₁) in the sequence is a member of the graph’s edge set.

Visually, a walk is found by placing a pencil on one of the dots on a graph diagram, and tracing along a line to another dot, then repeating the process.

When we come to look at graph databases, we will focus on ‘traversing’ them by walking along their edges.
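The definition of a walk can be sketched as a small check. The function `is_walk` and the path graph used to exercise it are hypothetical examples, chosen so that one pair of vertices has no edge between it.

```python
def is_walk(sequence, vertices, edges):
    """True if every vertex is in V and every consecutive pair is an edge."""
    if not sequence or any(v not in vertices for v in sequence):
        return False
    return all(
        frozenset({a, b}) in edges
        for a, b in zip(sequence, sequence[1:])
    )


# A hypothetical path graph 1 - 2 - 3 (no edge between 1 and 3).
V = {1, 2, 3}
E = {frozenset({1, 2}), frozenset({2, 3})}

print(is_walk((1, 2, 3), V, E))        # True
print(is_walk((1, 2, 1, 2, 3), V, E))  # True: a walk may revisit vertices
print(is_walk((1, 3), V, E))           # False: {1,3} is not an edge
```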


As well as a diagram, we can represent a graph with an adjacency matrix. Consider the graph V = {1,2,3,4}, E = {{1,2},{1,3},{1,4},{2,4},{3,4}}. We can draw this graph like this:

four-vertex simple graph drawn as a square with one diagonal edge

We can also create an adjacency matrix where Aij is 1 if {i,j} ∈ E and 0 otherwise.

    ⎛0 1 1 1⎞
A = ⎜1 0 0 1⎟
    ⎜1 0 0 1⎟
    ⎝1 1 1 0⎠

The top row shows how many edges there are between 1 and each vertex: none to itself, one to 2, one to 3 and one to 4.

We can easily find deg v by taking the sum of the corresponding row or column. We can see at a glance that deg 1 = 3.
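As a rough sketch, the matrix above can be built from an edge list and the degrees read off as row sums. The edge list used here is the one that corresponds to the matrix shown; `adjacency_matrix` is an illustrative helper that assumes the vertices are numbered 1 to n and that the graph is simple.

```python
def adjacency_matrix(n, edges):
    """Build an n x n matrix for vertices 1..n from undirected edges.

    Assumes a simple graph: no loops, and each edge listed once.
    """
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i - 1][j - 1] += 1   # shift to zero-based list indices
        a[j - 1][i - 1] += 1   # undirected, so the matrix is symmetric
    return a


edges = [(1, 2), (1, 3), (1, 4), (2, 4), (3, 4)]
A = adjacency_matrix(4, edges)
degrees = [sum(row) for row in A]

for row in A:
    print(row)
print(degrees)   # [3, 2, 2, 3]
```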

By multiplying an adjacency matrix by itself, we can find how many two-edge walks exist between any two vertices:

     ⎛0 1 1 1⎞   ⎛0 1 1 1⎞   ⎛3 1 1 2⎞
A² = ⎜1 0 0 1⎟ x ⎜1 0 0 1⎟ = ⎜1 2 2 1⎟
     ⎜1 0 0 1⎟   ⎜1 0 0 1⎟   ⎜1 2 2 1⎟
     ⎝1 1 1 0⎠   ⎝1 1 1 0⎠   ⎝2 1 1 3⎠

We can check that there are three two-edge walks between 1 and 1 {(1,2,1),(1,3,1),(1,4,1)}, one between 1 and 2 {(1,4,2)}, one between 1 and 3 {(1,4,3)} and two between 1 and 4 {(1,2,4),(1,3,4)}.

We can continue this trick for three-edge walks:

     ⎛0 1 1 1⎞   ⎛0 1 1 1⎞   ⎛0 1 1 1⎞   ⎛4 5 5 5⎞
A³ = ⎜1 0 0 1⎟ x ⎜1 0 0 1⎟ x ⎜1 0 0 1⎟ = ⎜5 2 2 5⎟
     ⎜1 0 0 1⎟   ⎜1 0 0 1⎟   ⎜1 0 0 1⎟   ⎜5 2 2 5⎟
     ⎝1 1 1 0⎠   ⎝1 1 1 0⎠   ⎝1 1 1 0⎠   ⎝5 5 5 4⎠

Again we can check that there are four three-edge walks between 1 and 1: {(1,2,4,1),(1,3,4,1),(1,4,2,1),(1,4,3,1)}, five between 1 and 2: {(1,2,1,2),(1,2,4,2),(1,3,1,2),(1,3,4,2),(1,4,1,2)}, five between 1 and 3: {(1,2,1,3),(1,2,4,3),(1,3,1,3),(1,3,4,3),(1,4,1,3)} and five between 1 and 4: {(1,2,1,4),(1,3,1,4),(1,4,1,4),(1,4,2,4),(1,4,3,4)}.

In general the matrix Aⁿ shows us how many n-edge walks there are between each pair of vertices.
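To check this claim without trusting the arithmetic, here is a minimal sketch that computes the powers with plain nested lists and compares one entry against a brute-force enumeration of walks. The helper names are illustrative, and note that the lists are zero-based, so vertex 1 is index 0.

```python
from itertools import product

A = [
    [0, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
]


def mat_mul(x, y):
    """Multiply two square matrices held as lists of rows."""
    n = len(x)
    return [
        [sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def mat_pow(a, n):
    """Raise a to the n-th power by repeated multiplication (n >= 1)."""
    result = a
    for _ in range(n - 1):
        result = mat_mul(result, a)
    return result


def count_walks(a, start, end, n):
    """Brute force for a 0/1 matrix: try every choice of n-1 middle vertices."""
    total = 0
    for middle in product(range(len(a)), repeat=n - 1):
        path = (start, *middle, end)
        if all(a[p][q] for p, q in zip(path, path[1:])):
            total += 1
    return total


print(mat_pow(A, 2)[0])         # [3, 1, 1, 2]
print(mat_pow(A, 3)[0])         # [4, 5, 5, 5]
print(count_walks(A, 0, 3, 3))  # 5, matching the (1,4) entry of A³
```

The brute-force count grows exponentially with n, while the matrix power needs only n − 1 multiplications, which is the practical appeal of the trick.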

We can perform the same trick for multigraphs and digraphs:

Here is the quiver V = {1,2,3,4}, E = {(1,2),(1,3),(1,3),(1,4),(2,4),(2,4),(3,4),(4,4)}:

four-vertex quiver drawn roughly as a square

Here is its adjacency matrix:

    ⎛0 1 2 1⎞
A = ⎜0 0 0 2⎟
    ⎜0 0 0 1⎟
    ⎝0 0 0 1⎠

Here are the next two n-edge walk matrices:

     ⎛0 0 0 5⎞
A² = ⎜0 0 0 2⎟
     ⎜0 0 0 1⎟
     ⎝0 0 0 1⎠

     ⎛0 0 0 5⎞
A³ = ⎜0 0 0 2⎟
     ⎜0 0 0 1⎟
     ⎝0 0 0 1⎠

We can also add these matrices:

              ⎛0 1 2 11⎞
A + A² + A³ = ⎜0 0 0  6⎟
              ⎜0 0 0  3⎟
              ⎝0 0 0  3⎠

This matrix tells us that from 1 to 4 there are eleven walks of no more than three edges.
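The quiver calculation can be reproduced with the same approach. In this sketch, `collections.Counter` collapses repeated directed edges into multiplicities; the helper names are illustrative and the vertices are assumed to be numbered 1 to n.

```python
from collections import Counter

arcs = [(1, 2), (1, 3), (1, 3), (1, 4), (2, 4), (2, 4), (3, 4), (4, 4)]


def quiver_matrix(n, arcs):
    """Entry [i][j] holds the number of directed edges from i+1 to j+1."""
    a = [[0] * n for _ in range(n)]
    for (tail, head), count in Counter(arcs).items():
        a[tail - 1][head - 1] = count
    return a


def mat_mul(x, y):
    """Multiply two square matrices held as lists of rows."""
    n = len(x)
    return [
        [sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def mat_add(x, y):
    """Add two matrices of the same shape element by element."""
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


A = quiver_matrix(4, arcs)
A2 = mat_mul(A, A)
A3 = mat_mul(A2, A)
total = mat_add(mat_add(A, A2), A3)

print(total[0])      # [0, 1, 2, 11]
print(total[0][3])   # 11 walks of at most three edges from 1 to 4
```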

Adjacency matrices can give us a useful way to reason about graphs without having to traverse every walk.


These concepts give us the basic tools for working with graph databases. In future posts I will look at how we can put them to work to model domains.

Creativity in Software Development

I shared yesterday’s post with some friends, who were keen to explore what we mean when we talk about creativity in software development.

Alastair made an interesting comment:

…it made me reconsider software dev as a creative endeavour, but I think I came to the conclusion that it is. For me, I think there is a gap between a creative art like writing, especially one which has an expressive mirror like acting, and a purely creative activity like, e.g., whittling a stick or constructing a building.

I think there is value in disentangling our concepts of creativity, and I find Alastair’s distinction between the creative arts and simpler forms of creation very useful.

There’s also an ambiguity in the word ‘create’, as it can refer simply to making things, as well as to the creative endeavours we would like to characterise.

So rather than ask ‘Is software development a creative activity?’, I tend to consider a narrower question: ‘Is there a place for creative thinking in software development?’

At the most basic level, I see creative thinking as making new links between concepts. Once you have made the link, you can engage other thought processes, for example deductive thinking, to explore the consequences and implications of that link.

But because the link isn’t already there, you can’t find it by rational thought; you need a leap of imagination to reach it.

There are some sorts of problem that I can tackle best once I’ve slept. On a few lucky occasions I’ve been able to take an afternoon nap, and woken up with a new idea to investigate, but this usually means taking the idea home with me and letting it brew overnight.

Here are a few examples of problems in software development that can be tackled with creative thinking:

  • How should we name this element?
  • What is the appropriate metaphor for this system?
  • Has a similar problem already been solved? Is there a pattern we can apply here?
  • What test should we write first? What test should we write next?
  • What is the best way to split this system into smaller parts?

And of course, because software development in an organisation is a social activity, the need for creative thinking extends far beyond the design of the software.

Test code needn’t be defensive

In a code review I encountered some test code that looked a bit like this:

 var result = await _controller.Resources() as ViewResult;
 // ReSharper disable once PossibleNullReferenceException

This is a typical defensive coding pattern whose reasoning goes like this:

  • The return type of _controller.Resources() is Task<ActionResult>.
  • I need to cast the inner Result of this Task to a ViewResult, as I want to inspect its Model property.
  • But the Result could be a different subclass of ActionResult, so I had better use a safe cast, just in case.
  • As I’m using a safe cast, I can’t guarantee that I’ll get any instance back, so I had better do a null check.
  • Oh look! ReSharper is complaining when I try to access properties of this object. As I’ve already performed a null check, I’ll turn off the warnings with a comment.

Now, defensive coding styles are valuable when we don’t know what data we’ll be handling, but this is most likely to happen at the boundaries of a system, where it interacts with other systems or, even more importantly, humans.

But in the context of a unit test, things are different:

  • We are in control of both sides of the contract: the test and class under test have an intimate and interdependent existence. A different type of response would be unexpected and invalid.
  • An attempt to directly cast to an invalid type will throw a runtime error, and a runtime error is a meaningful event within a test. If _controller.Resources() returns any other subclass of ActionResult, then the fact that it cannot be cast to ViewResult is the very information I want to receive, as it tells me how my code is defective.

This means I can rewrite the code like this:

var result = (ViewResult) await _controller.Resources(); 

By setting aside the defensive idiom, I’ve made the test clearer and more precise, without losing any of its value.

How applying Theory of Constraints helped us optimise our code

The neck of a bottle of prosecco in front of a fire.

My team have been working on improving the performance of our API, and identified a database call as the cause of some problems.

The team suggested three ways to tackle this problem:

  • Scale up the database till it can meet our requirements.
  • Introduce some light-weight caching in the application to reduce load on the database.
  • Examine the query plan for this database call to find out whether the query can be optimised.

Which of these should we attempt first? There was some intense discussion about this, with arguments made in favour of each approach. What we needed was a simple framework for making decisions about how to improve our system.

This is where the Theory of Constraints (ToC) can help. Originally expounded as a paradigm for improving manufacturing systems, ToC is really useful in software engineering, both when managing projects and when improving the performance of the systems we create.

Theory of Constraints

The preliminary step in applying ToC is to identify the Goal of your system. In the case of this API, the Goal is to supply accurate data to consumers.

Now that we understand the Goal of the system, we can define the Throughput of the system as the rate at which it can deliver units of that Goal, in our case API responses. We can also define the Operating Expenses of the system (the cost of servers) and its Inventory (requests waiting for responses).

The next step is to identify the Constraint of the system. This is the element in the system that dictates the system’s Throughput. In a physical system, a useful heuristic is a build-up of Inventory in front of this element. In our API, our monitoring helped us pinpoint the bottleneck.

The next three steps give us a sequence of approaches for tackling the Constraint:

  • First, Exploit the Constraint by finding local changes you can make to improve its performance.
  • Second, Subordinate the rest of the system to the Constraint by finding ways to reduce pressure on it so it can perform more smoothly.
  • Third, Elevate the Constraint by increasing the resources available to it, committing to additional Operating Expenses if necessary.

Exploitation comes first because it’s quick, cheap and local. To Subordinate you need to consider the effects on the rest of the system, but there shouldn’t be significant costs involved. Elevating the Constraint may well cost a fair amount, so it comes last on the list.

Once you have applied these steps you will either find that the Constraint has moved elsewhere (you’ve ‘broken’ the original Constraint), or it has remained in place. In either case, you should repeat the steps as part of a culture of continuous improvement. Eventually you want to see the constraint move outside your system and become a matter of consumer demand.

Applying ToC to our question

If we look at the team’s three suggestions, we can see that each corresponds to one of these techniques:

  • Scaling up the database is Elevation: there’s a clear financial cost in using larger servers.
  • Introducing caching is Subordination: we’re changing the rest of the system to reduce pressure on the Constraint, and need to consider questions such as cache invalidation before we make this change.
  • Optimising the query is Exploitation: we’re making local changes to the Constraint to improve its performance.

Applying ToC tells us which of these approaches to consider first, namely optimising the query. We can look at caching if an optimised query is still not sufficient, and scaling should be a last resort.

In our case, query optimisation was sufficient. We managed to meet our performance target without introducing additional complexity to the system or incurring further cost.
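The decision rule can be expressed as a tiny sketch: tag each suggestion with the ToC step it corresponds to, then sort by step. The `Step` enum and the `suggestions` list are illustrative constructs for this post, not part of any ToC tooling.

```python
from enum import IntEnum


class Step(IntEnum):
    """ToC steps for tackling a Constraint, in the order to attempt them."""
    EXPLOIT = 1
    SUBORDINATE = 2
    ELEVATE = 3


# The team's three suggestions, in the order they were raised.
suggestions = [
    ("Scale up the database", Step.ELEVATE),
    ("Introduce light-weight caching", Step.SUBORDINATE),
    ("Optimise the query", Step.EXPLOIT),
]

ordered = sorted(suggestions, key=lambda s: s[1])
for description, step in ordered:
    print(f"{step.value}. {step.name.title()}: {description}")
```

Sorting by step puts the query optimisation first and scaling last, which is the order we followed.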

Further Reading

Goldratt, Eliyahu M.; Jeff Cox. The Goal: A Process of Ongoing Improvement. Great Barrington, MA.: North River Press.