# algorithms – What is O(…) and how do I calculate it?

### What are the asymptotic functions? What is an asymptote, anyway?

Given a function f(n) that describes the amount of resources (CPU time, RAM, disk space, etc) consumed by an algorithm when applied to an input of size of n, we define up to three asymptotic notations for describing its performance for large n.

An asymptote (or asymptotic function) is simply some other function (or relation) g(n) that f(n) gets increasingly close to as n grows larger and larger, but never quite reaches. The advantage of talking about asymptotic functions is that they are generally much simpler to talk about even if the expression for f(n) is extremely complicated. Asymptotic functions are used as part of the bounding notations that restrict f(n) above or below.

(Note: in the sense employed here, the asymptotic functions are only close to the original function after correcting for some constant nonzero factor, as all the three big-O/Θ/Ω notations disregard this constant factors from their consideration.)

### What are the three asymptotic bounding notations and how are they different?

All three notations is used like this:

f(n) = O(g(n))

where f(n) here is the function of interest, and g(n) is some other asymptotic function that you are trying to approximate f(n) with. This should not be taken as an equality in a rigorous sense, but a formal statement between how fast f(n) grows with respect to n in comparison to g(n), as n becomes large. Purists will often use the alternative notation f(n) ∈ O(g(n)) to emphasize that the symbol O(g(n)) is really a whole family of functions that share a common growth rate.

Big-ϴ (Theta) notation states an equality on the growth of f(n) up to a constant factor (more on this later). It behaves similar to an `=` operator for growth rates.

Big-O notation describes an upper-bound on the growth of f(n). It behaves similar to a `≤` operator for growth rates.

Big-Ω (Omega) notation describes a lower-bound on a growth of f(n). It behaves similar to a `≥` operator for growth rates.

There are many other asymptotic notations, but they do not occur nearly as often in computer science literature.

Big-O notations and its ilk are often as a way to compare the time complexity.

### What is time complexity?

Time complexity is a fancy term for the amount of time T(n) it takes for an algorithm to execute as a function of its input size n. This can be measured in the amount of real time (e.g. seconds), the number of CPU instructions, etc. Usually it is assumed that the algorithm will run on your everyday von Neumann architecture computer. But of course you can use time complexity to talk about more exotic computing systems, where things may be different!

It is also common to talk about space complexity using Big-O notation. Space complexity is the amount of memory (storage) required to complete the algorithm, which could be RAM, disk, etc.

It may be the case that one algorithm is slower but uses less memory, while another is faster but uses more memory. Each may be more appropriate in different circumstances, if resources are constrained differently. For example, an embedded processor may have limited memory and favor the slower algorithm, while a server in a data center may have a large amount of memory and favor the faster algorithm.

### Calculating Big-ϴ

Calculating the Big-ϴ of an algorithm is a topic that can fill a small textbook or roughly half a semester of undergraduate class: this section will cover the basics.

Given a function f(n) in pseudocode:

``````int f(n) {
int x = 0;
for (int i = 1 to n) {
for (int j = 1 to n) {
++x;
}
}
return x;
}
``````

What is the time complexity?

The outer loop runs n times. For each time the outer loop runs, the inner loop runs n times. This puts the running time at T(n) = n2.

Consider a second function:

``````int g(n) {
int x = 0;
for (int k = 1 to 2) {
for (int i = 1 to n) {
for (int j = 1 to n) {
++x;
}
}
}
return x;
}
``````

The outer loop runs twice. The middle loop runs n times. For each time the middle loop runs, the inner loop runs n times. This puts the running time at T(n) = 2n2.

Now the question is, what is the asymptotic running time of both functions?

To calculate this, we perform two steps:

• Remove constants. As algorithms increase in time due to inputs, the other terms dominate the running time, making them unimportant.
• Remove all but the largest term. As n goes to infinity, n2 rapidly outpaces n.

They key here is focus on the dominant terms, and simplify to those terms.

T(n) = n2 ∈ ϴ(n2)
T(n) = 2n2 ∈ ϴ(n2)

If we have another algorithm with multiple terms, we would simplify it using the same rules:

T(n) = 2n2 + 4n + 7 ∈ ϴ(n2)

The key with all of these algorithms is we focus on the largest terms and remove constants. We are not looking at the actual running time, but the relative complexity.

### Calculating Big-Ω and Big-O

First off, be warned that in informal literature, “Big-O” is often treated as a synonym for Big-Θ, perhaps because Greek letters are tricky to type. So if someone out of the blue asks you for the Big-O of an algorithm, they probably want its Big-Θ.

Now if you really do want to calculate Big-Ω and Big-O in the formal senses defined earlier, you have a major problem: there are infinitely many Big-Ω and Big-O descriptions for any given function! It’s like asking what the numbers that are less than or equal to 42 are. There are many possibilities.

For an algorithm with T(n) ∈ ϴ(n2), any of the following are formally valid statements to make:

• T(n) ∈ O(n2)
• T(n) ∈ O(n3)
• T(n) ∈ O(n5)
• T(n) ∈ O(n12345 × en)
• T(n) ∈ Ω(n2)
• T(n) ∈ Ω(n)
• T(n) ∈ Ω(log(n))
• T(n) ∈ Ω(log(log(n)))
• T(n) ∈ Ω(1)

But it is incorrect to state T(n) ∈ O(n) or T(n) ∈ Ω(n3).

### What is relative complexity? What classes of algorithms are there?

If we compare two different algorithms, their complexity as the input goes to infinity will normally increase. If we look at different types of algorithms, they may stay relatively the same (say, differing by a constant factor) or may diverge greatly. This is the reason for performing Big-O analysis: to determine if an algorithm will perform reasonably with large inputs.

The classes of algorithms break down as follows:

• Θ(1) – constant. For example, picking the first number in a list will always take the same amount of time.

• Θ(n) – linear. For example, iterating a list will always take time proportional to the list size, n.

• Θ(log(n)) – logarithmic (base normally does not matter). Algorithms that divide the input space at each step, such as binary search, are examples.

• Θ(n × log(n)) – linear times logarithmic (“linearithmic”). These algorithms typically divide and conquer (log(n)) while still iterating (n) all of the input. Many popular sorting algorithms (merge sort, Timsort) fall into this category.

• Θ(nm) – polynomial (n raised to any constant m). This is a very common complexity class, often found in nested loops.

• Θ(mn) – exponential (any constant m raised to n). Many recursive and graph algorithms fall into this category.

• Θ(n!) – factorial. Certain graph and combinatorial algorithms are factorial complexity.

### Does this have anything to do with best/average/worst case?

No. Big-O and its family of notations talk about a specific mathematical function. They are mathematical tools employed to help characterize the efficiency of algorithms, but the notion of best/average/worst-case is unrelated to the theory of growth rates described here.

To talk about the Big-O of an algorithm, one must commit to a specific mathematical model of an algorithm with exactly one parameter `n`, which is supposed to describe the “size” of the input, in whatever sense is useful. But in the real world, inputs have much more structure than just their lengths. If this was a sorting algorithm, I could feed in the strings `"abcdef"`, `"fedcba"`, or `"dbafce"`. All of them are of length 6, but one of them is already sorted, one is reversed, and the last is just a random jumble. Some sorting algorithms (like Timsort) work better if the input is already sorted. But how does one incorporate this inhomogeneity into a mathematical model?

The typical approach is to simply assume the input comes from some random, probabilistic distribution. Then, you average the algorithm’s complexity over all inputs with length `n`. This gives you an average-case complexity model of the algorithm. From here, you can use the Big-O/Θ/Ω notations as usual to describe the average case behavior.

But if you are concerned about denial-of-service attacks, then you might have to be more pessimistic. In this case, it is safer to assume that the only inputs are those that cause the most amount of grief to your algorithm. This gives you a worst-case complexity model of the algorithm. Afterwards, you can talk about Big-O/Θ/Ω etc of the worst-case model.

Similarly, you can also focus your interest exclusively to the inputs that your algorithm has the least amount of trouble with to arrive at a best-case model, then look at Big-O/Θ/Ω etc.