Riddler Card Collecting

Simulated Solution

Another way to solve this problem is to model the card collecting process and count the number of draws required to complete the set. Because the process is random, every run of the model will give a different answer. However, if we run the model a large number of times, say one million, we would expect the average result to converge to the true expected value.

Here is a short Python function that generates random trials. We run it with a large number of trials to observe the distribution of results.

import numpy as np
from random import randint

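def model(trials, n_cards=144, price=5):
    """
    Solve the Riddler Express via simulation: collect cards one at a time
    and count the number of draws required to complete the entire set.
    """
    for _ in range(trials):
        i, collected = 0, set()
        while len(collected) < n_cards:
            # draw one card uniformly at random; a duplicate leaves the set unchanged
            collected.add(randint(1, n_cards))
            i += 1
        # total cost of this trial: number of draws times the price per draw
        yield i * price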

trials = 1000000
results = np.array(list(model(trials)))

# print the average result
print(results.mean())
# 3995.78149
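
As a sanity check on the simulated average, here is a minimal sketch of the closed-form expectation. It assumes the analytical solution referred to in this post is the standard coupon-collector result, n * H_n draws for n equally likely cards, with one card per \$5 purchase. The function name `expected_cost` is made up for illustration.

```python
def expected_cost(n_cards=144, price=5):
    """Expected total cost under the coupon-collector assumption:
    n_cards equally likely cards, one card per purchase."""
    # H_n = 1 + 1/2 + ... + 1/n is the n-th harmonic number
    harmonic = sum(1 / k for k in range(1, n_cards + 1))
    # expected number of draws is n * H_n; each draw costs `price`
    return price * n_cards * harmonic

analytical = expected_cost()
print(round(analytical, 2))
```

If the two approaches agree, the printed value should land within a dollar or so of the simulated mean.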


Good news! Our simulated result is very close to our analytical result above, at roughly \$3996. The simulation also gives us information about the full distribution of results. For example, while it is theoretically possible to complete the set in just 144 draws by never drawing a duplicate, the odds are extremely slim. What was the lowest cost from our simulation? What was the highest? This can all be seen by analyzing the results directly, or visualized as a density plot, which shows how likely it is that we would pay a certain amount to collect the full set of cards.

>>> results.min()
1795

>>> results.max()
13725
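
To put a rough number on how slim the odds of a perfect run are: assuming each draw is independent and uniform over the 144 cards (the same assumption the simulation makes), the first 144 draws are all distinct with probability 144! / 144^144. The sketch below evaluates that; the function name `prob_no_duplicates` is hypothetical.

```python
from math import factorial

def prob_no_duplicates(n_cards=144):
    """Probability that the first n_cards draws are all different,
    assuming independent, uniformly random draws."""
    # n! duplicate-free orderings out of n**n equally likely sequences
    return factorial(n_cards) / n_cards ** n_cards

p_perfect = prob_no_duplicates()
print(p_perfect)
```

The probability is many orders of magnitude smaller than one in a million, so the theoretical minimum cost of 144 * \$5 = \$720 would not be expected to appear even in a million trials. That fits with the observed minimum sitting well above it.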


You'll notice in the chart that the most likely amount we'll pay to assemble the entire set is around \$3500. However, the average amount we expect to pay is around \$4000. Why are the numbers different? The distribution has a long right tail: the average includes those scenarios where it takes \$10,000 or more to collect the entire set, which pulls it above the single most likely amount. This is part of the reason why seeing the entire distribution, rather than just a single metric like the average, can be helpful in understanding the problem.
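
The gap between the most likely cost and the average can also be checked numerically. The sketch below is not the card-by-card loop used earlier. To stay fast it swaps in an equivalent formulation: with k cards already owned, the wait for the next new card is geometric with success probability (n - k) / n, and the total number of draws is the sum of those waits. The function name, the seed, the 20,000 trials, and the \$250 bin width are all arbitrary choices for illustration.

```python
import numpy as np

def simulate_costs(trials=20_000, n_cards=144, price=5, seed=0):
    """Simulate total cost per trial by summing geometric waiting times."""
    rng = np.random.default_rng(seed)
    # probability that a draw is a new card when k = 0, 1, ..., n-1 are owned
    p_new = np.arange(n_cards, 0, -1) / n_cards
    # one geometric wait per stage, one row per trial
    draws = rng.geometric(p_new, size=(trials, n_cards)).sum(axis=1)
    return draws * price

costs = simulate_costs()

# crude mode estimate: midpoint of the fullest $250-wide histogram bin
counts, edges = np.histogram(costs, bins=np.arange(0, costs.max() + 250, 250))
peak = counts.argmax()
mode_cost = (edges[peak] + edges[peak + 1]) / 2

mean_cost = costs.mean()
median_cost = np.median(costs)
print(mode_cost, median_cost, mean_cost)
```

For a right-skewed distribution like this one, the estimates should come out ordered mode < median < mean. The mode estimate is only as precise as the bin width allows.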