Asymptotic analysis - the first few couple of things that I can think of is O(1)
, O(n)
, O(logn)
and so on.
Sure, we all know it is used to analyze the running time or memory space that an algortihm takes when executing the program or application. We know that when we say something takes constant time, it means O(1)
while something takes linear time, it's O(n)
. We even know that when we do a binary search, the runtime is actually O(logn)
. I mean, when does this all come from though? How do we calculate it?
Say we take the binary search for example, every time we perform a lookup search, we always skip the other half until we find the target. So to put it in a simple expression, it looks like this:
T(n) = T(n/2) + O(1)
Think of T(n)
as a function that takes in n
records of data. Every time we do some action, the remaining data will be cut into half T(n/2)
in constant time O(1)
. Then we cut the remaining part into another half again, we get:
T(n) = T(n/4) + O(1) + O(1)
And another half:
T(n) = T(n/8) + O(1) + O(1) + O(1)
Eventually we will reach to its smallest unit which is T(1)
. The whole expression will look something like this:
T(n) = T(n/2) + O(1)
= T(n/4) + O(1) + O(1)
= T(n/8) + O(1) + O(1) + O(1)
...
= T(1) + O(1) + O(1) + ... + O(1)
= T(1) + logn * O(1)
= O(logn)
Here we can omit T(1)
= 1 as we will only need to take constant time to solve it. The multiplication of O(logn)
and O(1)
= O(logn)
Let's wrap our head in here. For a dataset of size n, every time we break it down further, we only take a constant operation time to break the data into half. In the end calculation, we only spent O(logn)
time to finish the whole dataset.
Let's see another example here. Supposed we are given the same criteria, except that the time to break it half take O(n)
time instead of O(1)
?
T(n) = T(n/2) + O(n)
= T(n/4) + O(n/2) + O(n)
= T(n/8) + O(n/4) + O(n/2) + O(n)
...
= T(1) + O(1) + O(2) + ... O(n/4) + O(n/2) + O(n)
= O(n + n/2 + n/4 + ... + 1)
= O(2n)
= O(n)
Get it better now? Let's move on. Say we now have T(n)
and each time we execute it, we will end up with 2T(n/2)
in O(n)
time (like sorting). How can we do it?
T(n) = 2T(n/2) + O(n)
= 2[ 2T(n/4) + O(n/2) ] + O(n)
= 4T(n/4) + O(n) + O(n)
= 4[ 2T(n/8) + O(n/4) ] + O(n) + O(n)
= 8T(n/8) + O(n) + O(n) + O(n)
= 8 [ 2T(n/16) + O(n/8) ] + O(n) + O(n) + O(n)
= 16T(n/16) + O(n) + O(n) + O(n) + O(n)
...
= n * T(1) + logn * O(n)
= O(n) + O(nlogn)
= O(nlogn)
This one seems a bit daunting, yeah? But don't worry, the steps are essentially the same as previous one too. Every time we break it down, we get two smaller piece of data subsets. We keep breaking them down and eventually we will reach its smallest unit T(1)
and by that time, there exists O(logn)
of O(n)
. We take the higher precedence one and hence we get O(nlogn)
as answer.
One last question before we end this chapter. Supposed the operation only takes O(1)
for each breakdown, the line of work still works the same:
T(n) = 2T(n/2) + O(1)
= 2 [ 2T(n/4) + O(1) ] + O(1)
= 4T(n/4) + O(2) + O(1)
= 4 [ 2T(n/8) + O(1) ] + O(2) + O(1)
= 8T(n/8) + O(4) + O(2) + O(1)
...
= n * T(1) + O(n + ... + 4 + 2 + 1)
= O(2n)
= O(n)
I hope you guys will find it useful! If there's anything misleading or explaining it wrongly, please feel free to let me know so. Thanks for reading!
Post was published on , last updated on .
Like the content? Support the author by paypal.me!