CHALLENGE: Prove that Σ k * C[M,k] * C[(N-M),(n-k)] / C[N,n] = M*n/N?
Can anyone prove the following statement? (or falsify it if you don’t believe it?)
Σ k * C[M,k] * C[(N-M),(n-k)] / C[N,n] = M*n/N
a = max (0, n+M-N)
b = min (n, M)
N, M and n are any positive integral constants satisfying
N ≥ M, n (Є Z+)
k (not n) is the index variable.
C[i,j] = iCj = i!/[j!(i-j)!]
NOTE: The limits of the summation, a and b, are only intended to make the nCr function meaningful. k would normally go from 0 to n, but k must never exceed M, and n-k must never exceed N-M.
- DukeLv 71 decade agoFavorite Answer
I saw a similar Dr D's question yesterday.
The notation below is the same as in the question.
Imagine you have urn with N balls (M white, N - M black) and You draw (either SIMULTANEOUSLY or one by one WITHOUT replacements) n of them at random /M, N, n satisfying the conditions in the question/; then the probability P(k) in the extraction to be exactly k white and n-k black is given by the well-known hyper-geometric formula:
P(k) = C[M,k] * C[(N-M),(n-k)] / C[N,n], /numerator - favorable cases to select k out of M white and n-k out of N-M black; denominator - possible cases to draw n out of N/.
Consider X - the random number of the white balls drawn,
a ≤ X ≤ b. This random variable has hyper-geometric distribution, whose mathematical expectancy by definition is the following sum, where a ≤ k ≤ b:
EX = Σ k * C[M,k] * C[(N-M),(n-k)] / C[N,n] = n*p = n*M/N, where p = M/N can be interpreted as a probability to draw a white ball ON THE 1st TRY; if You follow the usual agreement that C[p,q] = 0 when p<q then you can take the sum from 0 to n. That can be proved several ways, the easiest is by using generating functions /F(x) = ΣP(k)*x^k in our case/, the proof can be found in many books, I'll not reproduce it here, I only wished to show that the desired result is in essence obtaining of hyper-geometric distribution's mean value.
P.S. Please compare with my answer to Your yesterday question also.
P.S.(2) Once having linked Your problem with the calculation of the hyper-geometric distribution mean, even more easier approach than using generating functions to do the latter is demonstrated in jw's nice answer to Your yesterday question.Source(s): http://en.wikipedia.org/wiki/Hypergeometric_distri... etc.