Skip to main content

Section 9.3 Multinomial coefficients

In the preceding chapters we discussed binary samples in the sense that an element of our universe \(\Omega\) is either in the sample or isn't. In this chapter, we are going to count the ways to arrange the elements of \(\Omega\) into one of \(k\) states, where \(k\) is a positive integer.

We will answer the following question. Let \([k]=\{1, 2, \ldots, k\}\) for a positive integer \(k\text{.}\) Given a multiset \(\alpha\) of cardinality \(n\) from \([k]\text{,}\) how many ways are there to sort the elements of \(\Omega\) according to that multiset? Recall that a multiset \(\alpha\) associates to each \(i \in [k]\) a number \(\alpha(i)\text{,}\) called its multiplicity. We will write \(\alpha(i)=r_i\) to be consistent with the notation from the preceding chapter. We would like to know how many ways there are to choose \(r_1\) elements of \(\Omega\) to be of the first type, \(r_2\) elements to be of the second type, etc., \(r_k\) elements of the \(k\)-th type, so that \(\sum r_i = n\text{.}\) Furthermore we do not care about the ordering within each type.

Since we are starting with a multiset, why don't we return to our first multiset example?

You are visiting your friend for three days, and they have ten things they want to do while you're in town. Recall you counted there were \(66\) ways (multisets) to distribute the activities across the days. You and your friend decided to do one activity the day you get into town, then two the next day, and two the last day. In other words your chosen multiset is \(\{T^2, F^5, S^3\}\text{.}\)

Now that you have chosen that distribution, how many ways can you choose which activity is done on each day? There are two equivalent ways to answer this question; one will lead us to the other. Let's first pick the first day's activity. There are ten activities and we'd like to choose two; so there are \({{10}\choose{2}}\) options. That leaves eight activities from which to choose five; so, \({{8}\choose{5}}\) ways to pick the second day's activites. Finally there are \({{3}\choose{3}}\) ways to pick the last day's activities. Applying the multiplication principle there are

\begin{equation*} {{10}\choose{2}}{{8}\choose{5}}{{3}\choose{3}} \end{equation*}

ways to choose which activity is done on each day. Applying the formulas for the binomial coefficients, that's

\begin{equation*} \frac{10!}{2!8!}\frac{8!}{5!3!}\frac{3!}{3!0!} \end{equation*}

ways. Observe that we can cancel two of the numerators, and since \(0!=1\text{,}\)

\begin{equation*} \frac{10!}{2!5!3!} = 2,520. \end{equation*}

Now that we've seen the expression

\begin{equation*} \frac{10!}{2!5!3!} \end{equation*}

it suggests an approach similar to our approach for the formulas for ordered and unordered samples without replacement. We shuffle our entire universe of objects, and then shuffle within each group. This is why we divide by \(2!\text{,}\) \(5!\text{,}\) and \(3!\)

This theorem is really dense, so we should try to translate it into simpler English. To do so, let's return to the simple example of choosing people to be on teams.

There are nine people to be allocated onto three teams, let's say \(A\text{,}\) \(B\text{,}\) and \(C\text{.}\) It's a little unrealistic but let's further suppose that a team can contain all nine people, or no people.

The first step is to choose how many people must go on each team. Each choice is a multiset, and the number of such choices is

\begin{equation*} {{n-1+r}\choose{r}} = \dfrac{(n-1+r)!}{r!(n-1)!} \end{equation*}

where \(n=3\) and \(r=9\text{.}\) The key thing to notice here is that we are sampling from the universe of teams, not people. Hence why \(n=3\) and not \(9\text{.}\) So, there are

\begin{equation*} {{11}\choose{9}} = 55 \end{equation*}

possible distributions of the people onto each team.

The next step is to actually choose which people go on which team. This time, we are sampling from the universe of people, so \(n=9\text{.}\) Let's suppose we've chosen the multiset \(\{A^4, B^3, C^2\}\text{,}\) i.e. the distribution where \(r_1 = 4\) people go to team \(A\text{,}\) \(r_2 = 3\) people go to team \(B\text{,}\) and \(r_3 = 2\) people go to team \(C\text{.}\) Then, the number of ways to choose which people go on which team according to this multiset is

\begin{equation*} {{9}\choose{4}} {{5}\choose{3}} {{2}\choose{2}} \end{equation*}

or, equivalently,

\begin{equation*} \frac{9!}{4!3!2!} \end{equation*}

which is \(1,260\text{.}\)

The idea is that once we have chosen one of the

\begin{equation*} {{n-1+r}\choose{r}} \end{equation*}

possible distributions, we are now choosing which of the objects to place in each category.

How many distinguishable arrangements are there of the word ARRANGEMENT?

“Distinguishable” means that we can tell the difference in the arrangements. For example, there are two R's in the word. If we call them R1 and R2, there is no difference between A(R1)(R2)ANGEMENT and A(R2)(R1)ANGEMENT.

There are \(7\) different letters in the word ARRANGEMENT. If we number the \(11\) characters in the word, then we are trying to count the number of functions

\begin{equation*} f: \{1, 2, 3, \ldots, 11\} \to \{A, R, N, G, E, M, T\} \end{equation*}

such that \(r_x\) characters are the letter \(x\text{:}\)

\begin{equation*} r_A = r_R = r_E = r_N = 2, \end{equation*}

and

\begin{equation*} r_G = r_M = r_T = 1. \end{equation*}

By the preceding theorem there are

\begin{equation*} {{11}\choose{2,2,2,1,2,1,1}}=\frac{11!}{2!2!2!1!2!1!1!}=2,494,800 \end{equation*}

total distinguishable arrangements of the word.

That explanation was perhaps a bit painstaking to make clear the viewpoint of counting functions based on a multiset, but there is a cleaner path to the answer. There are \(11!\) ways to arrange all the letters; however, we have double-counted by the \(2!\) arrangements of the A's, R's, N's, and E's, and so we divide by \(2!\) four times.

Observe that when \(k=2\text{,}\) a multinomial coefficient is exactly a binomial coefficient with \(r_1=r\) and \(r_2=n-r\text{.}\) When \(k=3\) we call the numbers trinomial coefficients and thereafter typically say multinomial coefficients.

Accordingly, theorems about binomial coefficients generalize to multinomial coefficients. For example, the trinomial coefficients exhibit a similar recurrence relation.

This theorem can be proven using algebraic manipulation or by counting ternary strings, strings from the alphabet \(\{0,1,2\}\text{.}\) You could also look at the expansion of the polynomial \((x+y+z)^n\text{.}\) Speaking of...

This theorem tells us that when we expand the polynomial \((x_1 + x_2 + \cdots + x_k)^n\text{,}\) there are \({{k-1+n}\choose{ n}}\) different terms (one for each multiset of cardinality \(n\) from \(\{x_1, x_2, \ldots, x_k\}\)—careful with the notation here because the \(n\) has switched places!), each of which is multiplied by the appropriate multinomial coefficient.

There are

\begin{equation*} {{6-1+10}\choose{10}} = 3,003 \end{equation*}

terms of the polynomial \((a+b+c+d+e+f)^{10}\text{.}\) Obviously we will not list them all. The \(a^5bc^2ef\) term has the coefficient

\begin{equation*} {{10}\choose{5,1,2,0,1,1}} = \frac{10!}{5!1!2!0!1!1!} = 15,120. \end{equation*}

Our final example is meant to illustrate the deep connection between counting problems, multisets, and multinomial coefficients, through the technique of generating functions.

One last time, we are visiting our friend for three days and want to do some activities on our trip. This time, instead of having ten activities, let's just have six. (This is because some attractions are closed and not at all to cut down on the length of our polynomial.)

Consider the polynomial \((T+F+S)^6\text{.}\) This polynomial has

\begin{equation*} {{3-1+6}\choose{6}} = 28 \end{equation*}

terms, each of which looks like

\begin{equation*} {{6}\choose{r_T, r_F, r_S}} T^{r_T}F^{r_F}S^{r_S}. \end{equation*}

The full polynomial is

\begin{align*} (T+F+S)^6 \amp = T^6 + 6T^5F + 6T^5S + 15T^4F^2\\ \amp = 30T^4FS + 15T^4S^2 + 20T^3F^3 + 60T^3F^2S\\ \amp = 60T^3FS^2 + 20T^3S^3 + 15T^2F^4 + 60T^2F^3S\\ \amp = 90T^2F^2S^2 + 60T^2FS^3 + 15T^2S^4 + 6TF^5\\ \amp = 30TF^4S + 60TF^3S^2 + 60TF^2S^3 + 30TFS^4\\ \amp = 6TS^5 + F^6 + 6F^5S + 15F^4S^2\\ \amp = 20F^3S^3 + 15F^2S^4 + 6FS^5 + S^6 \end{align*}

If you have the time, you should try to expand the polynomial yourself. Critically, I do not mean that you should multiply \((T+F+S)^6\) and distribute. I mean that you should write down all \(28\) possible terms of the polynomial, and calculate the appropriate multinomial coefficient for each one. If you do not have that kind of time, then you should pick a few different terms and make sure you can calculate the multinomial coefficient for each one.

Mathematics is a human endeavor. Nothing about it is supernatural. However, some pieces of the puzzle fit together so mind-bogglingly well that we are tempted to use the word “miraculous” to describe them, and generating functions are one such result.

Suppose that you would like to do two activities on Thursday, four on Friday, and relax on Saturday. Notice that this choice of distribution corresponds to the term \(T^2F^4\) of the above polynomial. How many ways are there to pick which activities to do on each day? We need only to read the term's coefficient \(15\) to find out!

In this way, the polynomial \((T+F+S)^6\) holds the key to all of the distributions of six activities into three days, and the coefficients tell us how many ways there are to choose the activities for each distribution!

Of course, there is nothing special about this example. A generating function is a function which, when expanded into a polynomial or a “Taylor series” (students who haven't had Calculus II should stop at the word “polynomial” there), has coefficients that are exactly the terms of some sequence of interest.