Sunday, December 20, 2009

On runs of heads and coin tossing

This post was originally inspired by a post on reddit.com discussing Benford's Law.

Problem: what is the probability of finding a run of at least x heads in n tosses of a fair coin?

I borrowed heavily from Schilling [1] to work this out. We're going to recursively count the number of sequences satisfying the problem to give us the probability of finding one during a sequence of tosses.

Let $A(n, x)$ be the number of sequences of length $n$ with a run of at least $x$ heads.

For $n = 3$ and $x = 2$, we have eight total sequences, of which three contain runs of at least 2 heads:

HHH, HHT, HTH, HTT
THH, THT, TTH, TTT

We'll define $A(n, x)$ by constructing successful sequences recursively. If $n < x$ then clearly $A(n, x) = 0$. If $n = x$ then $A(n, x) = 1$ since there is exactly one sequence with at least x heads: the sequence of all heads. If $n > x$ there are three sets of possible sequences:
  1. sequences that start with tails and are followed by subsequences of length $(n-1)$. There are $A(n-1, x)$ such sequences that meet our needs:
    T...    A(n-1, x)
  2. sequences that start with a head and tail and are then followed subsequences of length $(n-2)$. There are $A(n-2, x)$ such subsequences which matter to us:
    HT...   A(n-2, x)
    The same holds for up to $(x-1)$ heads followed by a tail.
    HHT...  A(n-3, x)
    HHHT... A(n-4, x)
    etc.
  3. sequences that start with at least x heads and are followed by $(n-x)$ tosses which we aren't interested in, of which there are $2^{n-x}$:
    HHHHHHH...  2^(n-x)
    |- x -|

So we have a recursive definition to count the number of such sequences:

$ A(n, x) = \left\{ \begin{array}{ll} 0 & n < x \\ 1 & n = x \\ 2^{n-x} + \sum_{i=1}^{x} A(n-i, x) & n > x \\ \end{array} \right. $

Which means the probability of a run of at least x heads in n tosses is given by

$P(n, x) = A(n, x) / 2^n$

For 200 coin tosses the probability of a run of at least five heads is 97% which I think is surprisingly high. The probability of a run of six heads is 80%.

The case of finding a run of exactly $x$ heads is very similar. The only part that changes is the third sequence type, where it must now end with a tails:

  • sequences that have exactly $x$ heads and are followed by a tail and then $(n-x-1)$ tosses which we aren't interested in, of which there are $2^{n-x-1}$:
    HHHHHHHT...  2^(n-x-1)
    |- x -|

This gives us a recursive definition for $B(n, x)$, the probability of finding a run of exactly x heads in a sequence of n tosses:

$ B(n, x) = \left\{ \begin{array}{ll} 0 & n < x \\ 1 & n = x \\ 2^{n-x-1} + \sum_{i=1}^{x} B(n-i, x) & n > x \\ \end{array} \right. $

Again for 200 tosses, the probability of exactly five heads is mucher lower than at least five heads: 48%. Exactly six heads is only 40%.

It should be possible to find a non-recursive definition of A and B since all they're doing is summing over powers of two.

I miss $\LaTeX$, it's pretty.


[1] The Longest Run of Heads, Mark F. Schilling, The College Mathematics Journal, Vol. 21, No. 3, (1990), pp. 196-207, http://mathdl.maa.org/mathDL/22/?pa=content&sa=viewDocument&nodeId=2681

3 comments:

Anonymous said...

rrjo fsuux Porn cryvfs j xt w ami

Anonymous said...

Fantastic publish. I just found your web site and would like to say that I have certainly cherished reading via your blog posts. At any charge I'm going to be subscribing to your feed and that i truly hope you jot down once more quickly.

Anonymous said...

Your second formula is not correct.
Simulations for at least 1 run of exactly 5 in 200 trials is ~80%.
I do not see how your second formula could work. It is more of a combinatorics and counting problem.