Welcome to the second part of the series in which I will provide an analysis of the average-case running time of linear search. This turns out to be so much more fun than what I have expected. Recall that we are considering three cases depending on how many times the element **x** appears in **A**.

Assuming that there is only one index **i** such that **A[i] = x**, what is the average-case running time of linear search?

There are **n** different positions for **x** to appear in **A**, each of them is equally likely to occur. For each **1<=i<=n**, if **A[i] = x** then the running time is **i**. Computing the average over all **i** we get: **Sum(i)/n = n(n+1)/2n = (n+1)2**. We can also use the formula of expectation to get the same result.

Assuming that there are no indices such that **A[i] = x**, what is the average-case running time of linear search?

Since we *always* have to compare all elements of **A**, we make **n!*n/n! = n** comparisons. Now to the most interesting case.

Assuming that there are **k >= 1** indices such that **A[i] = x**, what is the average-case running time of linear search?

If we think about this for a moment, we can logically guess **(n+1)/(k+1)** to the be the answer. The indices are randomly distributed and so, on average, we expect that they are equally distanced from each other. Therefore, on average, we only have to examine elements up to the first occurrence of **x** which appears approximately at **(n+1)/(k+1)**. Let’s see whether our analysis coincides with our intuition.

Let **T** be the random variable representing the running time of linear search on input **A** of size **n**. We wish to compute the expected value of **T**, namely **E[T]**. We will be using the formula of expectation. That is:

E[T] = 1*Pr{T=1} + 2*Pr{T=2} + … + n*Pr{T=n}.

Before applying this formula, we have to compute **Pr{T=i}** for each **i**. It turns out that’s not quite so simple.

When is the running time equals **i**? Or in other words, when do we make **i** comparisons to find **x**? That happens when the first copy of **x** appears at position **i** and all other copies of **x** appear after it. In how many permutations does this situation occur of all the n! permutations?

The first **x** should be at index **i**. No **x** appears in the subarray **1<=j<i**. And exactly **k-1** **x**‘s appear in the subarray **i<j<=n**. There are ways to arrange the **k-1** copies of **x** in the subarray. In each of these arrangements, there are **K**! ways to order these copies and **(n-k)**! ways to order the other elements. Putting all of this together by the using the rule of product, we find that the total number of such permutations is **C(n-i, k-1)*k!*(n-k)!**.

To get the probability **Pr{T=i}**, we divide by **n!**, the total number of permutations to get **C(n-i, k-1)/C(n, k)**. Substituting this in the formula of expectation:

E[T] = (1/C(n, k)) * Sum(i*C(n-i, k-1)) over all **1<=i<=n**.

Now I will state the following Lemma which I will prove later.

Lemma: Sum(i*C(n-i, k-1)) = C(n+1, k+1) over all **1<=i<=n**.

So,

E[T] = C(n+1, k+1)/C(n, k) = (n+1)/(k+1). Hence our intuition is correct.

In Part 3 of the series I will show an easier way to derive the average-case running time of linear search. And in Part 4, I will prove the lemma.