Feb 28, 2020
This week's Riddler asks us to estimate how long it takes to get a haircut with our favorite barber. Rather than scheduling an appointment, we roll the dice and hope we won't have to wait too long if we drop by unannounced. How long exactly? We'll use probability distributions and monte carlo simulation to estimate our idle time.
At your local barbershop, there are always four barbers working simultaneously. Each haircut takes exactly 15 minutes, and there’s almost always one or more customers waiting their turn on a first-come, first-served basis.
Being a regular, you prefer to get your hair cut by the owner, Tiffany. If one of the other three chairs opens up, and it’s your turn, you’ll say, “No thanks, I’m waiting for Tiffany.” The person behind you in line will then be offered the open chair, and you’ll remain at the front of the line until Tiffany is available.
Unfortunately, you’re not alone in requesting Tiffany — a quarter of the other customers will hold out for Tiffany, while no one will hold out for any of the other barbers.
One Friday morning, you arrive at the barber shop to see that all four barbers are cutting hair, and there is one customer waiting. You have no idea how far along any of the barbers is in their haircuts, and you don’t know whether or not the customer in line will hold out for Tiffany.
What is the expected wait time for getting a haircut from Tiffany?
Our average wait time is just a bit longer than 14 minutes.
I solved this problem in two ways: first by running a simulation, then by writing the probability density function analytically. Fortunately, my two results converged, and they both resulted in the final value of 14.0625. The simulation code is highly efficient, allowing you to run tens of millions of scenarios nearly instantly.
10_000_000 results = model(trials) print(results) 14.06276349572674trials =
The analytical work, on the other hand, was much more complicated, and, forgive me, but I'll leave my scrap work out of this week's writeup. Instead, I'll show the three components of the problem in separate charts and how they add up to a legitimate probability density function in the image below. Indeed we can use these density functions to calculate the exact value, and it was fun to see the progression of each of the three cases build up to the whole. Those three cases are:
In any case, when we sum each of the possibilities together, we get the chart on the far right, which lets us calculate the probability of being seated within some range of times, such as "between 10 and 15 minutes" or "between 10:00 and 10:01 minutes".
Simulation code is below. It's fast! You can run 10 million scenarios in a snap of the fingers.
import numpy as np def model(trials: int, barbers: int = 4, haircut_time: int = 15) -> np.ndarray: """ Simulate a number of haircut scenarios and return an array of results, which are the times it takes for your haircut with tiffany to begin. This is a vectorized simulation, so calculating ten million or more scenarios is nearly instantaneous. """ # simulate the time it would take each barber to finish their haircut # we assume tiffany is listed first in the array, but it's arbitrary haircuts = np.random.uniform(high=haircut_time, size=(trials, barbers)) # determine whether or not tiffany finishes cutting hair first tiffany_first = haircuts.argmin(axis=1) == 0 # simulate an array representing whether or not the customer ahead of # us prefers tiffany, which happens 25% of the time prefers_tiffany = np.random.uniform(size=trials) < 0.25 # create an array to store the results, then go through the cases results = np.zeros(shape=trials) # case 1: customer ahead of us doesn't prefer tiffany AND she doesn't # finish first, so we simply wait for her to finish before our turn idx = (~tiffany_first) & (~prefers_tiffany) results[idx] = haircuts[idx, 0] # case 2: all other cases require us to wait for tiffany to finish # her existing haircut, then another 15 minutes, either because the # customer ahead prefers her, or because she finished first results[~idx] = haircuts[~idx, 0] + haircut_time return results if __name__ == "__main__" results = model(10_000_000) print(results.mean())
And lastly, the code to generate the probability density functions and plot, which were coded somewhat manually based on my calculations.
from matplotlib import pyplot as plt import numpy as np def a(x): """ Customer ahead of us prefers Tiffany, so we wait for her to finish her current cut plus one more before our turn. """ out = np.full_like(x, 1 / 15) out[x < 15] = 0.0 return out def b(x): """ Tiffany finishes first, so we have to wait for her finish time plus the full time for another haircut ahead of us """ out = 4 / 15 ** 4 * (15 - (x - 15)) ** 3 out[x < 15] = 0.0 return out def c(x): """ Tiffany doesn't finish first, and the customer ahead of us didn't prefer her, so we can be served as soon as she's done. """ out = 4 / 45 * (1 - ((15 - x) / 15) ** 3) out[x >= 15] = 0.0 return out def d(x): """Sum the individual components by weight""" return a(x) * 0.25 + b(x) * 0.1875 + c(x) * 0.5625 fig, ax = plt.subplots(ncols=4, sharex=True, sharey=True, figsize=(10, 2)) plt.tight_layout(pad=-1.5) ax.set_title("Case A: 25%\nCustomer prefers Tiffany", loc="left") ax.set_title("Case B: 18.75%\nNo pref., Tiffany available", loc="left") ax.set_title("Case B: 56.25%%\nNo pref., Another available", loc="left") ax.set_title("Total: A + B + C\n", loc="left") x = np.linspace(0, 30, 10001) ax.plot(x, a(x) * 0.25, c='0.2') ax.plot(x, b(x) * 0.1875, c='0.2') ax.plot(x, c(x) * 0.5625, c='0.2') ax.plot(x, d(x), c='0.2') ax.fill_between(x, a(x) * 0.25, 0, facecolor="0.95") ax.fill_between(x, b(x) * 0.1875, 0, facecolor="0.95") ax.fill_between(x, c(x) * 0.5625, 0, facecolor="0.95") ax.fill_between(x, d(x), 0, facecolor="0.95") for i in range(4): ax[i].set_ylim(0, 0.07) for s in ["top", "right"]: ax[i].spines[s].set_visible(False) plt.show()