Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Probability to Get a Complete Deck of Cards?

Suppose a poker games requires 4 decks of poker. After the game, the cards are put back into the 4 decks randomly. Now there is another game which requires only 1 deck of poker, so we want to open some decks of the poker and find a complete set of cards. Since people are lazy, we do not want to open all the 4 decks of poker. If we open 2 decks of cards, what is the probability that we can get a complete set of cards?

It is hard to solve this specific problem directly. The viewpoint of treating a deck of cards as a whole is especially a barrier for solving this problem. Here is the fancy idea. We can generalize this problem as follows. Suppose there are NN distinct cards in a deck of poker. Now MM decks of pokers are mixed together and then put back. If we draw nn cards randomly, what is the probability that we get mm types (cards of the same color and suit are consider to be the same type) of cards? Now to answer the question that how much chance we can get a complete set of cards if we open kk decks of poker, we just need to calculate the probability of getting NN types of cards, if we draw kNkN cards randomly from these mixed cards.

Let’s use An,mA_{n,m} to stand for the event of getting mm types of cards when nn cards are randomly drawn, and Pn,mP(An,m)P_{n,m}\equiv P(A_{n,m}). Then conditioning on the number of types of cards we get in the first n1n-1 draws, we have the following recursion formula

Pn,m=Pn1,mmMn+1NMn+1+Pn1,m1(Nm+1)MNMn+1.P_{n,m}=P_{n-1,m}\frac{mM-n+1}{NM-n+1}+P_{n-1,m-1}\frac{(N-m+1)M}{NM-n+1}.

We can find a formula for general terms using method of generating functions. Since there are two subscript changing at the same time, it is much harder to solve this recursion formula. However, as I mentioned before, we have actually solved the problem from practical view because we have recursion formula and we know the initial values. We can write a program to help us solve this problem. The following is an implementation of the recursion formula in Mathematica.

GeneralizedHyperGeometryProbability::illegal = "The number of selected exeeds the total number.";
(*
@param nselected: the number of card selected;
@param ndistinct: the number of distinct card in the selected card;
@param Ndistinct: the total number of distinct card;
@param Ncopies: the copies of poker.
*)
GeneralizedHyperGeometryProbability[nselected_, ndistinct_, Ndistinct_, Ncopies_] := Module[
    {TotalCardsNumber, TotalChoices, i, j, result = {}, n, nleft},
    TotalCardsNumber = Ndistinct*Ncopies;
    If[TotalCardsNumber < nselected, 
        Message[GeneralizedHyperGeometryProbability::illegal]; 
        Return[False]
    ];
    If[Ndistinct < ndistinct, Return[0]];
    If[nselected < ndistinct, Return[0]];
    If[ndistinct * Ncopies < nselected, Return[0]];
    If[ndistinct == 1,
        Return[Ndistinct * Binomial[Ncopies, nselected] / Binomial[TotalCardsNumber, nselected]]
    ];
    TotalChoices = Binomial[TotalCardsNumber, nselected];
    If[nselected == ndistinct,
        Return[Binomial[Ndistinct, ndistinct]*Ncopies^ndistinct/TotalChoices]
    ];
    For[i = 1, i <= ndistinct, i++,
        AppendTo[result, 
        GeneralizedHyperGeometryProbability[i, i, Ndistinct, Ncopies]]
    ];
    For[i = 2, i <= nselected - ndistinct + 1, i++,
        result[[1]] = GeneralizedHyperGeometryProbability[i, 1, Ndistinct, Ncopies];
        For[j = 2, j <= ndistinct, j++,
            n = j - 1 + i;
            nleft = TotalCardsNumber - n + 1;
            result[[j]] = result[[j]]*(j*Ncopies - n + 1)/nleft + result[[j - 1]]*(Ndistinct - j + 1)*Ncopies/nleft
        ]
    ];
    Return[result[[ndistinct]]]
]

In[5]:= N[GeneralizedHyperGeometryProbability[54*1, 54, 54, 4]]
Out[5]= 9.19323*10^-20
In[6]:= N[GeneralizedHyperGeometryProbability[54*2, 54, 54, 4]]
Out[6]= 0.0190881
In[7]:= N[GeneralizedHyperGeometryProbability[54*3, 54, 54, 4]]
Out[7]= 0.820296

And the following is another implementation of the recursion formula in R.

#'
#' @param nselected the number of card selected;
#' @param ndistinct the number of distinct card in the selected card;
#' @param Ndistinct the total number of distinct card;
#' @param Ncopies the copies of poker.
#'
pghyper = function(nselected, ndistinct, Ndistinct, Ncopies){
    TotalCardsNumber = Ndistinct*Ncopies
    if(TotalCardsNumber < nselected)
        stop("the number of selected exceeds the total number.")
    if(Ndistinct < ndistinct)
        return(0)
    if(nselected < ndistinct)
        return(0)
    if(ndistinct * Ncopies < nselected)
        return(0)
    TotalChoices = choose(TotalCardsNumber, nselected)
    if(ndistinct == 1)
        return(Ndistinct * choose(Ncopies, nselected) / TotalChoices)
    if(nselected == ndistinct){
        dchoices = choose(Ndistinct, ndistinct)
        multiple = Ncopies ^ ndistinct
        if(dchoices < multiple)
            return(multiple / (TotalChoices / dchoices))
        else
            return(dchoices / (TotalChoices / multiple))
    }
    result = NULL
    for(i in 1:ndistinct)
        result = c(result, pghyper(i, i, Ndistinct, Ncopies))
    for(i in 2:(nselected-ndistinct+1)){
        result[1]=pghyper(i,1,Ndistinct,Ncopies)
        for(j in 2:ndistinct){
            n=j-1+i
            nleft=TotalCardsNumber-n+1
            result[j]=result[j]*(j*Ncopies-n+1)/nleft
            +result[j-1]*(Ndistinct-j+1)*Ncopies/nleft
        }
    }
    result[ndistinct]
}

> pghyper(54*1,54,54,4)
[1] 9.19323e-20
> pghyper(54*2,54,54,4)
[1] 0.01908814
> pghyper(54*3,54,54,4)
[1] 0.8202961

From the above result, we can see that in the original problem the probability is only about 2%2\% for us to get a complete set of cards if we only open 2 decks of pokers while the probability is about 82%82\% if we open one more deck of poker (i.e. open 3 decks of pokers). So for a really lazy person, it seems that to open 3 decks of poker is a good choice. Let f(k)=P54k,54,k=1,,4f(k)=P_{54k,54}, k=1,\ldots, 4. The above result also suggests us that function f(k)f(k) is a very odd function. It is small when kk is smaller and then suddenly increases to a (relative) very big value, and then it increases mildly to 1. Now suppose there are 20 decks of poker (each has 54 cards) involved, let’s see how function f(k)=P54k,54f(k)=P_{54k,54}, k=1,,20k=1,\ldots, 20, behaves. The plot of the function f(k)=P54k,54f(k)=P_{54k,54}, k=1,,20k=1,\ldots, 20, is shown is in the following figure. We can see that for very small kk, f(k)f(k) is very small; then f(k)f(k) increase dramatically to a (relative) very big value and then it increase mildly to 1. For other number of decks of poker, f(k)f(k) has similar properties. So in this kind of problems, definitely we will not have a big probability to success if open 1 or 2 decks of poker. However, we do not have to open too many decks of poker, e.g., 4 or 5 decks would yield a remarkable success probability even if many (e.g., 100) decks of pokers are involved. Probability