An Introduction to the Hypergeometric Distribution
YouTube Viewers

Published on Sep 10, 2013

An introduction to the hypergeometric distribution. I briefly discuss the difference between sampling with replacement and sampling without replacement. I describe the conditions required for the hypergeometric distribution to hold, discuss the formula, and work through 2 simple examples.

I also discuss the relationship between the binomial distribution and the hypergeometric distribution, and a rough guideline for when the binomial distribution can be used as a reasonable approximation to the hypergeometric. I finish with a brief example involving the multivariate hypergeometric distribution.
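
For reference, the hypergeometric formula mentioned above can be written as follows. The notation is an assumption made here and may differ from the letters used in the video: a population of N items contains a "successes", n items are drawn without replacement, and X counts the successes in the sample.

```latex
P(X = x) = \frac{\binom{a}{x}\,\binom{N-a}{n-x}}{\binom{N}{n}},
\qquad \max(0,\; n-(N-a)) \le x \le \min(n,\; a)
```

With N = 20, a = 6, n = 5, and x = 4, this is choose(6,4)*choose(14,1)/choose(20,5), the first R calculation listed below.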

For those using R, here is the R code to find the probabilities for the examples in this video:

The probability of drawing exactly 4 red balls when drawing 5 balls from a collection of 6 red and 14 yellow balls.

Without replacement (hypergeometric):
choose(6,4)*choose(14,1)/choose(20,5)
[1] 0.01354489
or
dhyper(4,6,14,5)
[1] 0.01354489

With replacement (binomial):
dbinom(4,5,6/20)
[1] 0.02835
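
To see how the two sampling schemes differ beyond a single value, the following sketch tabulates the whole distribution of the number of red balls for this example. The variable names are illustrative choices, not something taken from the video.

```r
# Number of red balls in a sample of 5 from 6 red and 14 yellow
x <- 0:5

# Without replacement: dhyper(x, m = red, n = yellow, k = sample size)
p_hyper <- dhyper(x, 6, 14, 5)

# With replacement: dbinom(x, size = draws, prob = proportion red)
p_binom <- dbinom(x, 5, 6/20)

# Side-by-side comparison, rounded for readability
comparison <- data.frame(red = x,
                         hypergeometric = round(p_hyper, 5),
                         binomial = round(p_binom, 5))
print(comparison)

# Each set of probabilities should sum to 1
stopifnot(isTRUE(all.equal(sum(p_hyper), 1)),
          isTRUE(all.equal(sum(p_binom), 1)))
```

Both models give the same expected number of red balls (5 × 6/20 = 1.5), but the hypergeometric puts less probability on high counts such as 4 or 5, because each red ball drawn leaves fewer red balls behind.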

The probability of picking exactly 7 females when randomly sampling 10 students from a school with 1100 female and 900 male students.

Without replacement (hypergeometric):
choose(1100,7)*choose(900,3)/choose(2000,10)
[1] 0.1664901
or
dhyper(7,1100,900,10)
[1] 0.1664901

With replacement (binomial):
dbinom(7,10,1100/2000)
[1] 0.1664783
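
The two answers above agree to four decimal places because the sample of 10 is only 0.5% of the 2000 students. The sketch below keeps the sample size and the 55% female proportion fixed and varies the school size to show how the gap changes. The population sizes are made-up values for illustration. The threshold at which the approximation counts as "reasonable" is left open here, since this description does not state the video's exact guideline (a sampling fraction of about 5% is sometimes quoted as a rule of thumb).

```r
# Keep the sample size (10) and the proportion female (55%) fixed,
# and vary the population size (hypothetical values chosen for illustration).
pop_sizes <- c(20, 200, 2000, 20000)
females   <- round(0.55 * pop_sizes)   # 11, 110, 1100, 11000
males     <- pop_sizes - females

# Exact probability of exactly 7 females, sampling without replacement
p_hyper <- dhyper(7, females, males, 10)

# Binomial approximation: does not depend on the population size
p_binom <- dbinom(7, 10, 0.55)

result <- data.frame(population      = pop_sizes,
                     sample_fraction = 10 / pop_sizes,
                     hypergeometric  = p_hyper,
                     binomial        = p_binom,
                     abs_difference  = abs(p_hyper - p_binom))
print(result)
```

For this particular outcome the difference is roughly 0.016 when half of a 20-student population is sampled, and it drops to about 0.00001 at the school of 2000 used in the example.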

Multivariate hypergeometric: the probability of picking exactly 3 Democrats, 2 Republicans, and 1 independent in a sample of 6 from a group of 12 Democrats, 24 Republicans, and 8 independents.

choose(12,3)*choose(24,2)*choose(8,1)/choose(44,6)
[1] 0.06881377
or, with the extraDistr package installed:
extraDistr::dmvhyper(c(3,2,1),c(12,24,8),6)
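
The choose() calculation above generalizes to any number of categories. As a minimal sketch in base R (the function name dmvhyper_sketch is made up for this illustration and is not part of any package), the product of binomial coefficients can be wrapped in a small helper:

```r
# Hypothetical helper: multivariate hypergeometric probability.
#   x      counts drawn from each category (e.g. c(3, 2, 1))
#   totals number of items available in each category (e.g. c(12, 24, 8))
# The sample size is implied by sum(x).
dmvhyper_sketch <- function(x, totals) {
  stopifnot(length(x) == length(totals), all(x >= 0), all(totals >= 0))
  # choose() returns 0 when a count exceeds its category total,
  # so impossible outcomes get probability 0.
  prod(choose(totals, x)) / choose(sum(totals), sum(x))
}

# 3 Democrats, 2 Republicans, and 1 independent from 12, 24, and 8
p <- dmvhyper_sketch(c(3, 2, 1), c(12, 24, 8))
print(p)
```

With only two categories the helper reduces to the ordinary hypergeometric, so dmvhyper_sketch(c(4, 1), c(6, 14)) should match dhyper(4,6,14,5) from the first example.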
