Algorithm Complexity – A refresher

As a Development Manager you have often dealt with challenging business or purely technical problems that you lead your team to solve. You always come up with a strategy, but sometimes, upon reviewing the result, you may realise that the chosen algorithmic approach was inappropriate in terms of scalability. And it may be too late if you find that out during UAT or, worse, after some months in production.

Ideally you want to catch such an issue early, in the design phase, but how do you assess an abstract application (aka a technical design)? The answer comes from one of the fundamental topics of Computer Science: Algorithm Complexity. In this article I will cover some of the principal concepts of this subject, which may serve as a refresher for some or a good intro for others.


THE BASICS

Let’s start by going back to basics. As you probably know, any software is ultimately translated into a series of instructions for the Central Processing Unit (CPU). In the olden days programming meant telling the processor directly what to do, using the so-called first-generation programming languages, which did not require any compiler. As things evolved (and that means hardware, computer science and human ingenuity) we arrived at third-, fourth- and fifth-generation languages, but all of these are still translated into machine language and instructions for the CPU.

Let’s see some interesting aspects of this through an example. Take a Java program that does the following:
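
The snippet itself is not reproduced here; a minimal reconstruction consistent with the description below (the variable name s is an assumption, chosen to match the loop example further down) would be:

    String s = "a";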

What this really does is execute 1 instruction: assigning a value. Now take a look at this:
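
(Again a reconstruction, based on the instruction count that follows; the array contents are an assumption.)

    String[] ab = {"a", "b", "c", "d", "e"};
    String s = "";
    for (int i = 0; i < ab.length; i++) {
        s = ab[i];
    }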

How many instructions do we have here?

  1. Initializing the array ab
  2. Assigning the variable s
  3. Assigning the variable i
  4. ×5 checking the condition i < ab.length
  5. ×5 assigning a new value to the variable i (increasing it by 1)
  6. ×5 accessing the array (assuming an array access is 1 instruction)
  7. ×5 assigning the value of the item in the array to the variable s

Total: (4×5) + 3 = 23

Now say we were to alter the array and add 5 more letters, so that the loop would stop after 10 iterations, i.e. we would have:

(4×10) + 3 = 43.

So one thing we observe here is that for the above snippet there is a constant overhead of 3 operations, plus 4 operations per iteration (assuming the array has at least 1 element), and the total will depend on the number n of iterations of the loop. If we wanted to express this as a function it would be: ƒ(n) = 4n + 3

OK, so now that we have a mathematical way to express the number of instructions that will be executed for the above snippet, let’s take it to the next level.

GETTING COMPLEX – the Big O

In practice you usually have much more convoluted code to deal with, and the above equation might become very complex indeed. So what you really want is a way to simplify such equations, helping you focus on how efficient your algorithm is without having to solve complicated equations.

Let’s take a look again at our equation ƒ(n) = 4n + 3

Given that we will always have the overhead of 3 instructions at the beginning of the algorithm, we can drop this term from the equation, as it does not have a significant impact on the final number of instructions as n grows. Thus we have:

ƒ(n) = 4n

(Note: in general, should there be another term that grows more slowly than 4n, we could drop it as well)

Also, the multiplier 4 is constant and can therefore, for the sake of simplicity, be dropped, i.e.

ƒ(n) = n

What this tells us is what we suspected from the beginning: the efficiency of our code depends purely on the number of iterations n, and grows linearly as the number of items in the array increases.

In Computer Science this practice of analysing the complexity of an algorithm by simplifying its equation is called asymptotic analysis, and it is often expressed as O(ƒ(n)), which is known as the Big-O.

So in the example we have here, the Big-O would be expressed as O(n).

Let’s see some more interesting things on this subject:

  • A single instruction would be O(1)
  • Code with 1 nested loop (a loop within a loop) with the same number of iterations would be O(n²). Why? Because for each iteration of the outer loop, n instructions need to be executed, so n × n = n² (see the first sketch below)
  • Code with m nested loops with the same number of iterations would be O(nᵐ)
  • The complexity of a recursive function (like the second sketch below) that calculates the nth element of the Fibonacci series would be O(2ⁿ). I trust you can easily work out how many instructions need to be executed to find the 100th element: it is 1,267,650,600,228,229,401,496,703,205,376. (Challenge: take your favourite processor, see how many million instructions per second (MIPS) it executes, and work out when it would arrive at a solution. Hint: longer than the age of the universe.) Note though that there are other, much more efficient ways to compute the Fibonacci series.
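
To illustrate the nested-loop case above, a tiny sketch (the variables n and count are just for illustration):

    int n = 5;                              // e.g. the size of our array
    int count = 0;
    for (int i = 0; i < n; i++) {           // outer loop: runs n times
        for (int j = 0; j < n; j++) {       // inner loop: runs n times per outer pass
            count++;                        // executed n × n = n² times in total
        }
    }

And a minimal sketch of the naive recursive Fibonacci (a reconstruction, as the original snippet is not reproduced here; the method name fib is an assumption):

    // Each call makes two further calls, so the call tree roughly doubles
    // in size as n grows by 1, which is where the O(2ⁿ) bound comes from.
    static long fib(int n) {
        if (n <= 1) {
            return n;   // base cases: fib(0) = 0, fib(1) = 1
        }
        return fib(n - 1) + fib(n - 2);
    }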


  • For Binary Search (one of my next articles will be dedicated to it) it is O(log₂n), which makes it a very efficient algorithm when it comes to searching in a very large array. If you plot this in a diagram you will notice that, as n increases, the curve plateaus, as opposed to a linear function.

Conclusion: the higher the O(ƒ(n)), the higher the complexity, i.e. the more instructions need to be executed and the slower our algorithm runs.

THE WORST CASE SCENARIO

Now, as you know, the chunks of code out there, packed with their loops and recursive calls, do not always execute all possible instructions. Why? Because somewhere in the code there may be a condition that breaks the execution of a loop and moves on to the next bit of code.

Consider the following change to our original piece of code, which turns it into a search algorithm (a linear search):
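
(A reconstruction of that change, since the original listing is not shown here; the array contents and the use of break are assumptions consistent with the discussion below.)

    String[] ab = {"a", "b", "c", "d", "f"};
    String s = "";
    for (int i = 0; i < ab.length; i++) {
        s = ab[i];
        if (s.equals("f")) {
            break;   // exit the loop as soon as the search term is found
        }
    }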

The newly introduced “if” statement will have no effect on how many times the loop runs, since the search term “f” is the last element in the array, so it will still run the maximum number of iterations, i.e. 5 times. However, if we change the order and make “f” the third element and “c” the fifth, then the loop will run only 3 times.

To generalize this, therefore: in the worst case (the letter is not found, or it is the last in the array) the loop will run n times. This is a useful metric when comparing algorithms, since it can tell you, with a good degree of confidence, which one will perform better in extreme cases, i.e. which will scale better.

Putting this into practice, for a sorted list, if you were to compare the linear search we just saw, which has a worst-case complexity of O(n), with binary search, which has a worst-case complexity of O(log₂n), you can immediately see that for an array of 1,000 elements we would need up to 1,000 instructions in the former case and roughly 10 (actually 9.97) in the latter. A substantial difference!
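
As a quick back-of-the-envelope check, here is a small sketch that counts the comparisons each approach makes on a sorted array of 1,000 integers (the class and variable names are hypothetical, not from the original example):

    public class SearchComparison {
        public static void main(String[] args) {
            int[] sorted = new int[1000];
            for (int i = 0; i < sorted.length; i++) {
                sorted[i] = i;                  // sorted values 0..999
            }
            int target = 999;                   // worst case for linear search

            // Linear search: visit elements one by one until the target is found.
            int linearSteps = 0;
            for (int i = 0; i < sorted.length; i++) {
                linearSteps++;
                if (sorted[i] == target) break;
            }

            // Binary search: halve the remaining range at every step.
            int binarySteps = 0;
            int lo = 0, hi = sorted.length - 1;
            while (lo <= hi) {
                binarySteps++;
                int mid = (lo + hi) / 2;
                if (sorted[mid] == target) break;
                if (sorted[mid] < target) lo = mid + 1;
                else hi = mid - 1;
            }

            // Prints: linear = 1000 comparisons, binary = 10 comparisons
            System.out.println("linear = " + linearSteps
                    + " comparisons, binary = " + binarySteps + " comparisons");
        }
    }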


CONCLUSION

So there you have it. What I described in this article is just a taster of the subject of complexity. However, with my last example I hope I highlighted the importance of measuring the complexity of an algorithm. Not only will your team get into the habit of producing scalable applications, but you can also save yourself from wasting resources redesigning an application, or from spending money on ever “faster” kit to run your platform on.
