Sumo Logic ahead of the packRead article
Complete visibility for DevSecOps
Reduce downtime and move from reactive to proactive monitoring.
It’s that time of year again! Time for Secret Santa. After all, what shows off your holiday spirit better than exchanging gifts in August? As you attempt to organize your friends into a Secret Santa pool, though, I wonder if you appreciate the beautiful math going on in the background.
For those of you unfamiliar with Secret Santa, here’s the basic idea. A group of friends write their names on slips of paper and drop them into a hat. Once everyone’s name is in, each person blindly draws out a name from the hat. These slips of paper indicate whose Secret Santa each person is. For the sake of simplicity, let us assume that if a person draws their own name, they are their own Secret Santa.
As an example, consider a group of three friends: Alice, Bob, and Carol. Alice draws Bob’s name out of the hat. Bob draws Alice’s name out of the hat. Carol draws her own name out of the hat. In this example, Alice will give Bob a gift; Bob will give Alice a gift; and Carol will give herself a gift.
Here comes the math.In the example previously described, I would argue that there are two “loops” of people. A loop can be defined as an ordered list of names such that each person gives a gift to the next person in the list except for the last person, who gives to the first person in the list. Below we see a graphical interpretation of the example that clearly shows two loops. Alice and Bob are one loop while Carol is her own loop.
We could equally well display this information by using a list. Alice gives a gift to the first person in the list, Bob gives to the second person, and Carol gives to the third person. Thus we can describe the graph above by writing [B, A, C].
One can easily imagine a different arrangement of gift-giving resulting in different number of loops, however. For example, if Alice drew Bob’s name, Bob drew Carol’s name, and Carol drew Alice’s name, there would only be one loop. If Alice drew her own name, Bob his own name, and Carol her own name, there would be three loops.
[B, C, A] [A, B, C]
In these diagrams, each node is a person and each edge describes giving a gift. Note that each person has exactly one incoming and one outgoing edge since everybody receives and gives one gift. Below each diagram is the corresponding list representation.
The question that had been keeping me up at night recently is as follows: for a group of x people participating in Secret Santa, what is the average number of loops one can expect to see after everyone has drawn names from the hat? After I started touting my discovery of a revolutionary graph theory problem to my friends, they soon informed me that I was merely studying the fairly well known problem of what is the expected number of cycles in a random permutation. Somewhat deflated but determined to research the problem for myself, I pressed on.
To get a rough estimate of the answer, I first simulated the game on my computer. I ran 100 trials for x ranging from 1 to 100 and calculated the number of loops for each trial. I plotted the results and noticed that the resulting curve looked a lot like a log curve. Here’s the graph with a best-fit log line on top.
The jitters in the curve no doubt come from not sampling enough simulated trials. Even with that noise, though, what is truly remarkable is that the expected number of loops is nearly exactly equal to the natural log of how many people participate.These results gave me insights into the problem, but they still didn’t give a completely satisfactory answer. For very small x, for example, ln(x) is a terrible approximation for the number of loops. If x=1, the expected number of loops is necessarily 1 but my log-based model says I should expect 0 loops. Furthermore, intuitively it seems like calculating the number of loops should be a discrete process rather than plugging into a continuous function. Finally, I still didn’t even know for sure that my model was correct. I resolved to analytically prove the exact formula for loops. Let f(x) represent the average number of loops expected if x people participate in Secret Santa. I decided to work off the hypothesis that f(x)=1+12+13+...+1x (also known as the xth harmonic number). This equation works for small numbers and asymptotically approaches ln(x) for large x.
Since I already know f(x) is correct for small x, the natural way to try to prove my result generally is through a proof by induction.Base Case:
The average number of loops for a single person in Secret Santa is 1.
The base case works.
Prove that f(x+1)=1+12+13+...+1x+1x+1
The key insight into this proof is the first line of the inductive step. Here’s one way to think about it if by using our list representation described earlier:There are two cases one needs to consider. 1) The last element that we place into the x+1spot in the list has value x+1 . This means the first x spots contain all the numbers from 1 to x . The odds of this happening are 1x+1 . Crucially, we get to assume that the average number of loops from the first x elements is therefore f(x) . Adding the last element adds exactly one loop: player x+1 giving himself a gift. 2) The last element that we place into the x+1 spot in the list does not have value x+1 . This covers all the other cases (the odds of this happening are xx+1 ). In this scenario, one of the first x people points to x+1and x+1points to one of the first x people. In essence the x+1th person is merely extending a loop already determined by the first x people. Therefore the number of loops is just f(x).
If we assume a uniform distribution of permutations (as assumption that is easily violated if players have choice ) and we weight these two cases by the probability each of them happening, we get the total expected number of loops for f(x+1).
Just like that, we have proved something theoretically beautiful that also applies to something as mundane as a gift exchange game. It all started by simulating a real-world event, looking at the data, and then switching back into analytical mode.
As I mentioned above, my “research” was by no means novel. For further reading on this topic, feel free to consult this nice summary by John Canny about random permutations or, of course, the Wikipedia article about it.
Since starting to write this article, a colleague of mine has emailed me saying that someone else has even thought of this problem in the Secret Santa context and posted his insights here.
Reduce downtime and move from reactive to proactive monitoring.
Build, run, and secure modern applications and cloud infrastructures.Start free trial