Suppose you have a set $S$ of $N$ numbers:
$$\displaystyle S = \{x_1, x_2, x_3, \cdots, x_N\}$$
How many different products can be formed by multiplying at most $F$ members of $S$, allowing repetitions? For example, take $S = \{a,b,c,d\}$. Then the distinct products are:
Number of factors | Products |
1 | $$\displaystyle a, b, c, d$$ |
2 | $$\displaystyle \begin{aligned} &a^2, ab, ac, ad, \\ &b^2, bc, bd, \\ &c^2, cd, \\ &d^2 \end{aligned}$$ |
for a total of 14 different products. How can we find this number without listing out every possible combination? It turns out there is a hard way to do this and an easy way. The hard way that I first worked through (I’m sure there are other hard ways) involves two layers of induction and lots of bookkeeping. The easy way is by a combinatorial argument, which will be the rest of this post.
If we did not allow repetitions (like $x_1^2$ or $x_2^3 x_4$) the question would be rather easy. The number of products $P$ would just be
$$\displaystyle P = \sum_{k=1}^F \binom{N}{k} $$
So our job will be done once we account for repeated factors. One way to do this is to augment the set $S$ of numbers to be multiplied with some wildcard elements $r_i$ that will indicate when a factor gets repeated. Call this new set $T$:
$$\displaystyle T = \{ x_1, x_2, x_3, \cdots, x_N, r_1, r_2, r_3, \cdots, r_F \} $$
$T$ has $N+F$ elements. With the right understanding of the wildcards $r_i$, we can form all the products we want to count by picking subsets of $T$ of size $F$.
Any one of these subsets will have all elements distinct and will consist of elements of the original $S$ (the $x_i$) and some wildcards. Each such subset $s$ corresponds to a product determined by the following process:
- Set $i=1$
- The $i$-th factor is $r_{i-1}$ if $r_{i-1} \in s$, otherwise it is the $x_j \in s$ with the smallest unused subscript $j$.
- Replace $r_{i-1}$ with the $(i-1)$-th factor, which has already been found. In effect, the $r_{i-1}$ repeats the $(i-1)$-th factor.
- Increment $i$ and go to step 2 if $i \le F$.
Example: Let $s= \{ x_2, x_4, x_5, r_1, r_3 \}$. The table below shows the buildup of the corresponding product:
$i$ | Cumulative Product | Reason |
1 | $x_2$ | This is the unused $x_j$ with the smallest $j$ |
2 | $x_2 r_1$ | $r_{i-1}$ is in $s$ |
2 | $x_2 x_2$ | $r_1$ repeats the 1st factor |
3 | $x_2 x_2 x_4$ | $x_4$ has the next smallest subscript |
4 | $x_2 x_2 x_4 r_3$ | $r_{i-1}$ is in $s$ |
4 | $x_2 x_2 x_4 x_4$ | $r_3$ repeats the 3rd factor |
5 | $x_2 x_2 x_4 x_4 x_5$ | $x_5$ has the next smallest subscript |
Using this process, every subset of $T$ is matched to exactly one product of $x_j$’s with repetitions accounted for. With some thought, we can show that each possible product corresponds to exactly one subset.
Each product is completely defined by its distinct factors and the number of repetitions of each. Obviously the distinct factors must be the $x_j$ that comprise the non-wildcard part of the subset. Each repetition in the product can then only be achieved by a specific $r_i$ – the one whose subscript matches the index of the most recent instance of the factor being repeated.
Example: Find the subset for the product $x_1^3 x_2 x_4^2$. The distinct factors are $\{ x_1, x_2, x_4 \}$, of which $x_1$ would be used first in the buildup process. The first repetition of $x_1$ must come from an $r_1$ in the subset. The second could come from another $r_1$, but that is not possible since at most one $r_1$ could be in the subset and it has already been used. So the second repetition of $x_1$ must come from repeating the second factor (which is the result of the first repetition of $x_1$), i.e. from an $r_2$ in the subset. That takes care of the $x_1^3$ part. $r_3$ can’t be in the subset because that would create an $x_1^4$ which is not in the product.
Then the lone $x_2$ is processed, becoming the 4th factor and the 4th member of the subset. $r_4$ can’t be involved because it would create an $x_2^2$ which is not in the product.
The next subset member is $x_4$, which is the 5th factor so its repetition must be caused by an $r_5$ in the subset. Now all the factors in the original product have been translated to subset elements, with the final subset being $\{ x_1, x_2, x_4, r_1, r_2, r_5 \}$
So there is a one-to-one correspondence between subsets of $T$ with size $F$ and the products we want to count. Actually, there is one exception. Since there are $F$ wildcards it is possible to pick a subset consisting only of wildcards. There is only one such subset, and following the buildup process it would give you a product with no factors. Obviously we are not interested in computing a product of nothing, so we will have to subtract 1 from the number of size-$F$ subsets. With that, the formula for the number $P(N,F)$ of products of at most $F$ factors from a set of $N$ elements is:
$$\displaystyle P(N,F) = \binom{N+F}{F} – 1$$
The example at the top of this post had $N = 4$ possible factors and used products of size at most $F=2$, which when plugged in give:
$$\displaystyle P(4,2) = \binom{6}{2}-1 = \frac{6\cdot 5}{2\cdot 1} – 1 = 15-1 = 14$$
Which matches the count from the list. Victory!