Binomial Distribution and Random Walks

We start by considering the following problem and then show how it relates to the binomial distribution.

Example 1: Suppose you start at point 0 and walk either one unit to the right or one unit to the left, where there is a 50-50 chance of either choice. Now repeat this process 100 times. What is your expected distance from the starting point?

Basic Concepts

This is a simple form of what is called a random walk problem. Define the random variables xi as follows:

x_i formula

Now let dn = your distance from the starting point after the nth trial. Thus

image7197

To solve Example 1 we need to find the expected value E[d100]. We accomplish this by using Properties 1 and 2.

Properties

Property 1: If xN(0, σ2), then 

Mean of |x|

Proof: See proof of Property 5 of Advanced Properties of Probability Distributions.

Property 2: For n sufficiently large

Mean of d_n

Proof: Let {x1, …, xn} be a random walk for x = \sum_{i=1}^n.  First we note that the mean of each xi is (.5(1) + .5(-1))/2 = 0 and the variance is ((1-0)2 +(-1-0)2)/2 = 1; i.e. xiN(0, 1). 

The random walk can therefore be considered to be a random sample from a standard normal distribution, and so by Corollary 1 of the Central Limit Theorem, for n sufficiently large, the sample mean of x is approximately normally distributed with mean 0 and standard deviation 1/\sqrt{n}. But the sample mean of x is \frac{1}{n} \sum_{i=1}^n x_i. Thus

Mean x standard normal

By Property 1 of Basic Characteristics of the Normal Distribution, it follows that

Sum of x_i distribution

and so by Property 1

image7202Examples

Example 1 (continued): We can now use Property 2 to complete Example 1, as follows:

Expectation d_100

and so we see that on average your distance from the starting point is just under 8 units. Sometimes you end up to the right of the starting point and sometimes you end up to the left of the starting point (and sometimes, although far less often, you end up right back at the starting point).

Example 2: Suppose you toss a fair coin 100 times. By Property 1 of Binomial Distribution, the average number of heads is 100(.5) = 50 and the average number of tails is 50. But we know that 100 tosses won’t always result in 50 heads and 50 tails. So on average how many more or fewer heads are there than the average of 50?

It is easy to see that the answer to this question is the average value of |# of heads – 50|. But what is this value?

If we view getting heads as going right and getting tails as going left in our random walk problem, we see that the solution to this example is also E[d100]. In general, the average number of heads after n tosses is n/2 and so

the average value of |# of heads after n tosses – n/2| = E[dn]

Since

Expectation d_100

we see that the answer to Example 2 is again just under 8. In fact, this means that not only is it not necessarily the case that heads and tails break 50-50 in 100 tosses but on average heads are 8 units away from 50, i.e. on average, heads and tails break 58-42 (although not necessarily is the 58 for heads).

Observation

Note that for any a and b, |a| < b is equivalent to –a < b < a. Since, as we have seen above, \sum_{i=1}^n x_i ~ N(0,\sqrt n), it follows that the probability that dn = |\sum_{i=1}^n x_i| < k for any value of k is NORM.DIST(k, 0, SQRT(n), TRUE) – NORM.DIST(-k, 0, SQRT(n), TRUE) = 2*NORM.DIST(k, 0, SQRT(n), TRUE)–1.

For example, P(d100 < 7) = .758 – .242 = .516.

References

Hoel, P. (1962) Introduction to mathematical statistics, 3rd Ed. John Wiley and Sons

Tijms, H. (2007) Understanding probability: chance rules in everyday life (2nd ed.). Cambridge: Cambridge University Press.

http://scholar.google.it/scholar_url?url=http://personal.vu.nl/h.c.tijms/SampleUP.pdf&hl=en&sa=X&ei=R1HXYPKlOM6Fy9YPgKCj4Ak&scisig=AAGBfm2n1zaissHo6sIEY1rVxlZHprY2Mg&nossl=1&oi=scholarr

9 thoughts on “Binomial Distribution and Random Walks”

  1. How do you solve this problem. A coin is tossed 7 times. Your beginning position is at zero. If you get heads you go right one unit from zero. If you get tails you go left one unit. What is the probability that you will return to zero at least once?

    Reply
    • Pete,
      I am sure that you can solve this problem in the same way, but here is a straighforward approach:
      1. It doesn’t matter which direction you go on the first toss and so I will assume it is left L. Note too that it doesn’t matter what direction you go on the 7th toss. Also after 4 lefts you can’t succeed.
      2. The only sequences that are successful are LR, LLRR, LLLRRR, LLRLRR
      3. The probabilities of these 1/2, 1/8, 1/32, 1/32, for a total of 11/16.
      Charles

      Reply
  2. Hi Charles

    I am still unable to build up the connections between average value of |# of heads – 50| and E[dn]. Could you explain more specifically how they match up?

    As you pointed out that if we let x be the random variable which takes the value 1 if heads is thrown and -1 if tails is thrown, then one step random walk is a Bernoulli trail and 100-time repetitions generate binomial distribution in the first place, so why do we need all those coin tossing stories?

    Reply
    • Jeff,
      d_100 = the distance from the origin that you arrive at after 100 random walks (going left has a 50% probability and going right has a 50% probability). But it also turns out that d_100 = the number of heads – 50 after 100 tosses (getting heads has 50% probability and getting tails has a 50% probability), assuming more heads than tails are tossed. You can think of heads as equivalent to going right and tails as equivalent to going left,
      Charles

      Reply
  3. Hi! I know for a random walk, at least in a 2 dimensional space, with a constant “step” length and ANY direction possible the expected “stray” from the starting point is directly proportional to the square root of the number of steps taken. My question is this: what is the Distribution of distances strayed from the starting point for this model?

    Reply
  4. Hi Charles,

    This is great. Thank you for putting this great resource together. However, the link “By Property 5 of Properties of Distributions Detailed (the proof of which uses calculus)” is not returning the desired page. Please can you fix it. I am really keen to see a neat proof of this.

    Thanks & Regards
    Prab

    Reply

Leave a Comment