rangiferโs diary: pt. xci
Whatโs our level range, again?
If youโve ever done โpoints farmingโ or BRPQ (โBoss Rush Party Questโ; โBPQโ) during a MapleLegends event, then you already know how party recruitment goes. For farming event points, youโre looking for fellow HHGII-goers (or PP1-goers, or what have you) that are within 20 levels of you. In this context, โ๐1 is within ๐ levels of ๐2โ means that the absolute difference between the levels of PCs ๐1 and ๐1 is less than or equal to ๐. Symbolically, weโd write this as |โ1ย โย โ2|ย โคย ๐, where โ๐ is ๐๐โs level. For event point farming, ๐ย =ย 20; for BRPQ, ๐ย =ย 30.
Tolerance
Youโll notice that โ๐1 is within ๐ levels of ๐2โ sounds exactly like some kind of binary relation. More specifically, for a given value of ๐, this is a homogeneous relation over ๐ถ, where ๐ถ is the set of all PCs (๐๐ย โย ๐ถ). This homogeneous relation โ which weโll just call ๐ ๐ โ has some nice properties:
- ๐ ๐ is symmetric, because the absolute difference commutes.
- ๐ ๐ is reflexive for any ๐ย โฅย 0, because the level difference between a PC and themselves is always 0. We donโt care about the ๐ย <ย 0 case, because itโs trivial โ itโs just the empty relation.
However, it also lacks some properties that weโre familiar with having in many binary relations:
- ๐ ๐ is not transitive. This is really important for various reasons (one reason being that this makes ๐ ๐ not an equivalence relation), and is obvious if you just think about it for a second: if ๐1 is level 30, ๐2 is level 45, and ๐3 is level 60, then ๐1โก๐ 20โก๐2 and ๐2โก๐ 20โก๐3, but not ๐1โก๐ 20โก๐3 (because 30ย >ย 20).
- ๐ ๐ is not connected. In particular, if the absolute difference between the levels of ๐1 and ๐2 is in excess of ๐, then ๐1 and ๐2 are โincomparableโ according to this relation (viz. neither ๐1โก๐ ๐โก๐2 nor ๐2โก๐ ๐โก๐1 hold).
Like I said, ๐ ๐ canโt be an equivalence relation because it lacks transitivity. However, I found out that thereโs actually a special name for this kind of relation: a tolerance relation. The name makes sense; we essentially have some level of โtoleranceโ (which in our particular case is called ๐) for how โdistantโ objects may be from one another before we consider them to be โdiscernibleโ from one another. Equivalence relations (or rather, congruence relations) are the special case of tolerance relations where the โtoleranceโ is โzeroโ, or is otherwise nonexistent in some sense; then, itโs not possible for โtolerancesโ to โaccumulateโ (as they do in the โ๐1โก๐ 20โก๐3โ example above) and cause intransitivity. In the language of tolerance relations, when ๐1โก๐ ๐โก๐2 holds, we say that โ๐1 is indiscernible from ๐2โ. This makes (๐ถ,ย ๐ ๐) a tolerance space.
Furthermore, a tolerance relation over a finite set is called a dependency relation. Dependency relations are important in concurrent computation, where they are so called because they relate two objects iff those objects depend on one another in some way, thus precluding them from being computed concurrently. For example, if you want to cook some food and then eat it, then there is a dependency between the โcookingโ part and the โeatingโ part; you cannot do them concurrently โ unless you actually wanted to eat the uncooked food, in which case, why even bother cooking in the first place?
Aggregate tolerance
In our particular case of recruiting for a party (whether it be for farming event points, or for BRPQing, or for something similar), we ideally want โ and, in the case of BRPQ, absolutely need โ this tolerance relation ๐ ๐ to hold pairwise for all members of the party. Symbolically, if our party is ๐ (s.t. ๐ย โย ๐ถ), then we want:
โ(๐1,ย ๐2)ย โย ๐2 : ๐1โก๐ ๐โก๐2.
With this in mind, thereโs a particular phenomenon that we quickly notice: as the partyโs size (i.e. its number of members) grows, we see a shrinking of the range of levels that a new member could possibly be. For example, if we represent our party as a multiset of levels, and we let ๐ย โย 20, then we might start with e.g. {100}, and the range of levels that a new member could possibly be is [80,ย 120]. If we select some arbitrary level from this interval, letโs say e.g. 110, then our party is {100,ย 110}, and the valid level interval for a new member is now [90,ย 120], which is a strict subset of the original interval. In general, adding a new party member does not necessarily change the valid level interval, so the new interval is always a subinterval (not necessarily strict) of the old one. This means that the length of the interval decreases weakly monotonically as the party size grows.
In particular, the valid level interval for a new party member, as a function of the multiset of levels of the current party members (call this multiset ๐ฟ), is:
โ๐โก(๐ฟ) = [๐๐บ๐โก(๐ฟ)ย โย ๐,ย ๐๐๐โก(๐ฟ)ย +ย ๐].
Note that an interval like [1,ย 0] โ where the lower bound is strictly greater than the upper bound โ is the empty set.
As a result, the interval only changes if the new member (weโre not going to consider the possibility of members leaving the party or being kicked out) changes the maximum level or minimum level present in the party. In other words, the interval only changes if the new memberโs level is either strictly greater than, or strictly less than, all old memberโs levels.
Interval/bound convergence
I noted that the length of the interval decreases weakly monotonically as the party size grows, and in fact, itโs clear that it can โconvergeโ to an absolutely minimal length of ๐. This occurs when ๐๐บ๐โก(๐ฟ)ย โย ๐๐๐โก(๐ฟ)ย =ย ๐, and thus โ๐โก(๐ฟ)ย =ย [๐๐๐โก(๐ฟ),ย ๐๐บ๐โก(๐ฟ)]. At exactly this point, it no longer matters how many new members join, nor what levels those new members are โ the interval will never change.
We donโt typically have any way to accurately predict the level of the next PC who will ask to join the party โ weโre usually just recruiting whichever random people happen to show up at that time. Thus, we have no good choice but to model the level of a new party member as effectively a random variable. In doing so, assuming that we assign a nonzero probability to each level[1], we can intuitively[2] see that |โ๐| almost surely converges to ๐, as the party size tends to +โ.
Given that we expect to converge in this way after a sufficient number of new members join our party, the question that I want to ask seems simple: how long does it take to converge?
In order to answer this question, we need to define things in a way that is a bit more clear-cut. When we say โhow longโ, we really mean โhow many new party membersโ; each time that a new party member is added, I call this a single โstepโ of the process. Furthermore, as it is, thereโs actually no guarantee that we converge at all, because weโre still talking about MapleStory parties, which are limited to a maximum cardinality of 6. So we want to lift that restriction, allowing arbitrarily large (but still always finite) parties. Similarly, we want to lift the level restrictions that MapleStory inherently possesses, because it makes things more asymmetrical and more difficult to reason about; if ๐ย =ย 20 and ๐ฟย =ย {190}, then the formula above suggests that โ๐โก(๐ฟ)ย =ย [170,ย 210], but in practice itโs actually [170,ย 200], because the maximum level in MapleStory is 200. So instead, we will assume that any element of โค is a valid level for a PC to have, and anything else (e.g. 13.5, ฯ, โฃ, etc.โฆ) is invalid. Also similarly, we want to constrain ๐ so that ๐ย โย โ.[3]
Another part that we have to specify is how PC levels are distributed. Weโve already accepted that the level of a new party member is random, but weโve yet to specify in what way. By invoking the principle of indifference, we can just assume that PC levels are distributed uniformly. Obviously, in practice, the โactualโ (empirical) distribution is not strictly uniform; for example, level 120 characters are probably more common than level 119 characters in MapleLegends. But trying to get a better approximation is almost certainly more trouble than itโs worth, and much more importantly, we donโt actually care; although this problem is modelled on empirical player-driven phenomena that occur in (a particular implementation of) MapleStory, it is really just a mathematical problem that we have in mind.
WLOG (weโll get a better idea of why this doesnโt lose generality in just a bit), we can assume for simplicity that our initial value of ๐ฟ is {0}. Then, given some value of ๐, we want to ask: what is the probability that we converge after ๐ steps (๐ย โย โ)? What weโre asking for here is a probability mass function (PMF) that is parameterised by ๐. Because this PMF more or less defines[4] the probability distribution that weโre after, Iโm going to refer to the PMF as simply ๐ฑ๐ก๐ขโก(๐), where โ๐ฑ๐ก๐ขโ stands for โrecruitment bound convergenceโ. Then, ๐ฑ๐ก๐ขโก(๐)โก(๐) is the probability that, given some value for ๐, we converge in exactly ๐ steps.
Why do we care about ๐ฑ๐ก๐ขโก(๐)? Iunno. I just thought of it one time when I was farming event points or something. Sue me. The point is that I spent absolutely way too long obsessing over it, and it is now that we will see what I have uncovered about this mysterious probability distribution. Iโm going to go over what I learned in roughly the order that I learned it, and with roughly the methods by which I learned it.
Footnotes for โInterval/bound convergenceโ
- [โ] At least, to every level that is valid for a PC to be, which in MapleStory is usually [1,ย 200]ย โฉย โค. Of course, merely assigning a nonzero probability to each level in โ๐ does the trick, as well.
- [โ] By waving our hooves and invoking the infinite monkey theorem, or somethingโฆ
- [โ] In general, we could loosen the constraints on ๐ so that it merely has to be a non-negative real number. However, if we do this, then โ๐โ and ๐ are effectively the same, so it will make things a lot easier to just assume by convention that ๐ is a natural number.
- [โ] See also: โProbability spaceโ.
Phase I: Worked examples
Because this whole thing was a little too abstract, I decided to first make it concrete by working out, by hoof, a particular example.
Consider ๐ย โย 4. Whatโs the probability that we converge in exactly 1 step? In other words, what is the value of ๐ฑ๐ก๐ขโก(4)โก(1)? Well, we converge in exactly 1 step iff the first PC to join the party is level 4 or is level โ4. We can see that โ4โก{0}ย =ย [โ4,ย 4], so the set of possible levels that this first step could add to the party is the intersection of this interval with โค, viz. {โ4, โ3, โ2, โ1, 0, 1, 2, 3, 4}. The cardinality of this set is 9, and there are 2 possible values in the set that cause us to converge (viz. {โ4,ย 4}), so ๐ฑ๐ก๐ขโก(4)โก(1)ย =ย 2โ9ย =ย 0.222โฆ. Easy, right?
Okay, but what about ๐ฑ๐ก๐ขโก(4)โก(2)? Well, we can take advantage of the symmetry of ๐ ๐ to observe that, in the first step, adding a level โ PC to the party is the same as adding a level โโ PC to the party. More generally, instead of laboriously tracking the exact state of ๐ฟ at each step, we can note that we really only care about |โ4โก(๐ฟ)|, which itself only depends on ๐๐๐โก(๐ฟ) and ๐๐บ๐โก(๐ฟ). So, because levels are distributed uniformly (meaning that ๐ฟ is effectively invariant under constant shifts[1], e.g. adding +17 to all levels), we can simply pretend that ๐ฟย =ย {0,ย ๐๐บ๐โก(๐ฟ)ย โย ๐๐๐โก(๐ฟ)}. For example, weโd simplify {0,ย 1,ย 3} to {0,ย 3}, and weโd simplify {โ2,ย โ2,ย โ1,ย 1} to {0,ย 3} as well. These observations & simplifications will make it significantly easier to compute values of ๐ฑ๐ก๐ข by hoof, but will also contribute to a more systematic (read: not by hoof) treatment of ๐ฑ๐ก๐ข. Working through the calculation of ๐ฑ๐ก๐ขโก(4)โก(2), we end up with something like:
we add | ๐ฟ | valid new levels | we add | probability |
---|---|---|---|---|
โ3,ย 3 | {0,ย 3} | โ1, 0, 1, 2, 3, 4 | โ1,ย 4 | 2โ9ย โ ย 2โ6 = 2โ27 |
โ2,ย 2 | {0,ย 2} | โ2, โ1, 0, 1, 2, 3, 4 | โ2,ย 4 | 2โ9ย โ ย 2โ7 = 4โ63 |
โ1,ย 1 | {0,ย 1} | โ3, โ2, โ1, 0, 1, 2, 3, 4 | โ3,ย 4 | 2โ9ย โ ย 2โ8 = 1โ18 |
0 | {0,ย 0} | โ4, โ3, โ2, โ1, 0, 1, 2, 3, 4 | โ4,ย 4 | 1โ9ย โ ย 2โ9 = 2โ81 |
These 4 cases make up all possible cases where we converge in exactly 2 steps: with our simplifications, there are only 4 possible levels that we can effectively add in the first step without converging (we canโt converge at that point because then it would be converging in exactly 1 step instead of 2), and then we always have to choose a level that causes us to converge in the second step, of course. Notice that no matter how many steps we converge in, there are always exactly 2 possible levels that can be added to ๐ฟ in the final step to make it converge: either the level thatโs exactly high enough to converge, or thatโs exactly low enough. Because these 4 causes are exhaustive, we can simply add their probabilities to get the value of ๐ฑ๐ก๐ขโก(4)โก(2) = 2โ27ย +ย 4โ63ย +ย 1โ18ย +ย 2โ81 = 247โ1134 โ 0.217โฏ813.
That was a slight bit of a pain in the ass, but donโt worry โ it gets way more painful from here. What about ๐ฑ๐ก๐ขโก(4)โก(3)? Prepare to be sorry that you asked:
we add | ๐ฟ | valid new levels | we add | ๐ฟ | valid new levels | we add | probability |
---|---|---|---|---|---|---|---|
0 | {0,ย 0} | โ4, โ3, โ2, โ1, 0, 1, 2, 3, 4 | 0โโโโโโ | {0,ย 0} | โ4, โ3, โ2, โ1, 0, 1, 2, 3, 4 | โ4,ย 4 | 1โ9ย โ ย 1โ9ย โ ย 2โ9 = 2โ729 |
โ1โโโ1โโโโ | {0,ย 1} | โ3, โ2, โ1, 0, 1, 2, 3, 4 | โ3,ย 4 | 1โ9ย โ ย 2โ9ย โ ย 2โ8 = 1โ162 | |||
โ2โโโโโโโโ2โโ | {0,ย 2} | โ2, โ1, 0, 1, 2, 3, 4 | โ2,ย 4 | 1โ9ย โ ย 2โ9ย โ ย 2โ7 = 4โ567 | |||
โ3โโโโโโโโโโโโโ3 | {0,ย 3} | โ1, 0, 1, 2, 3, 4 | โ1,ย 4 | 1โ9ย โ ย 2โ9ย โ ย 2โ6 = 2โ243 | |||
โ1,ย 1 | {0,ย 1} | โ3, โ2, โ1, 0, 1, 2, 3, 4 | 0โ1โโโโ | {0,ย 1} | โ3, โ2, โ1, 0, 1, 2, 3, 4 | โ3,ย 4 | 2โ9ย โ ย 2โ8ย โ ย 2โ8 = 1โ72 |
โ1โโโโโ2โโ | {0,ย 2} | โ2, โ1, 0, 1, 2, 3, 4 | โ2,ย 4 | 2โ9ย โ ย 2โ8ย โ ย 2โ7 = 1โ63 | |||
โ2โโโโโโโโโโ3 | {0,ย 3} | โ1, 0, 1, 2, 3, 4 | โ1,ย 4 | 2โ9ย โ ย 2โ8ย โ ย 2โ6 = 1โ54 | |||
โ2,ย 2 | {0,ย 2} | โ2, โ1, 0, 1, 2, 3, 4 | 0โ1โ2โโ | {0,ย 2} | โ2, โ1, 0, 1, 2, 3, 4 | โ2,ย 4 | 2โ9ย โ ย 3โ7ย โ ย 2โ7 = 4โ147 |
โ1โโโโโโโ3 | {0,ย 3} | โ1, 0, 1, 2, 3, 4 | โ1,ย 4 | 2โ9ย โ ย 2โ7ย โ ย 2โ6 = 4โ189 | |||
โ3,ย 3 | {0,ย 3} | โ1, 0, 1, 2, 3, 4 | 0โ1โ2โ3 | {0,ย 3} | โ1, 0, 1, 2, 3, 4 | โ1,ย 4 | 2โ9ย โ ย 4โ6ย โ ย 2โ6 = 4โ81 |
Fun!!! Adding up the probabilities of these (exhaustive) cases gets us ๐ฑ๐ก๐ขโก(4)โก(3) = 2โ729 + 1โ162 + 4โ567 + 2โ243 + 1โ72 + 1โ63 + 1โ54 + 4โ147 + 4โ189 + 4โ81 = 48649โ285768 โ 0.170โฏ239โฏ5.
I think thatโs about enough of thatโฆ
Footnotes for โPhase I: Worked examplesโ
- [โ] More formally, the elements of ๐ฟ are points in an affine space. This affinity is naturally induced by the fact that ๐ ๐ is defined in terms of subtraction between these โpointsโ.
Phase II: Whatโs the probability that we converge in exactly ๐ steps?
Now that weโve worked a good way through an example, weโre going to try to make the leap to a generalisation. I did a lot of scribbling nonsense, and a lot of thinking really hard about it, and Iโm going to try to give a brief explanation of my reasoning.
First of all, weโre going to define a new variable called ๐, which represents the difference (you can also think of it as standing for โdistanceโ or โdiameterโ) between ๐๐บ๐โก(๐ฟ) and ๐๐๐โก(๐ฟ). In other words, the simplification of ๐ฟ that I talked about above can be defined as simply {0,ย ๐}.
As I was trying to come up with a general solution, I started scribbling out some pseudocode, and I ended up with something like this to capture the general structure of the example used in Phase I โ in particular, ๐ฑ๐ก๐ขโก(4)โก(3):
sum a โ [-3..3]:
d_a := |a|
sum b โ [(d_a - 3)..3]:
d_b := max{d_a - b, b, d_a}
sum _ โ [0, 1]:
1/9 * 1/(9 - d_a) * 1/(9 - d_b)
Iโm using the sum a โ [-3..3]:
notation to mean that weโre summing over a
, with the values of a
being taken from the list [-3..3]
(which, in Haskell notation, is the same as the list [-3, -2, -1, 0, 1, 2, 3]
), and the summands simply being the value(s) that the block takes on (which may themselves be summationsโฆ). Iโm also using :=
for assignment.
As you can see, Iโm essentially just summing over all possible cases, i.e. every possible level that the first added party member could have, combined with every possible level that the second added party member could have, and so on. In the first two sum
loops, we are avoiding convergence and considering every remaining possibility, and in the third and final sum
loop, we just consider the 2 cases that cause convergence โ which is why there is no summation variable, and instead just a blank _
โ Iโve used the list [0, 1]
, but any list of length 2 would work equally well.
The pattern here is hopefully clear: if we wanted to consider larger values of ๐ like ๐ฑ๐ก๐ขโก(4)โก(4), ๐ฑ๐ก๐ขโก(4)โก(5), โฏ, we would just add another sum
layer (sum c โ [(d_b - 3)..3]:
, โฏ) that calculates yet another value of ๐ (d_c := max{d_b - c, c, d_b}
, โฏ), and the arithmetic expression in the innermost sum
loop would look like 1/9 * 1/(9 - d_a) * 1/(9 - d_b) * 1/(9 - d_c) * โฏ
.
Of course, thereโs also the question of how I came up with โmax{d_a - b, b, d_a}
โ in the first place. I basically came up with this expression by considering two cases, which are together exhaustive: that the summation variable ๐ is negative (i.e. ๐ย <ย 0), and that ๐ is non-negative (i.e. ๐ย โฅย 0):
- If ๐ is negative, then ๐ is increased by |๐| (or in other words, decreased by ๐). This is because ๐ฟ was previously {0,ย ๐๐}, so the new minimum element of ๐ฟ is |๐| lower than it previously was, and thus ๐๐บ๐โก(๐ฟ)ย โย ๐๐๐โก(๐ฟ) is increased by |๐|.
- If ๐ is non-negative, then ๐ is increased iff ๐ย >ย ๐๐. ๐ฟ was previously {0,ย ๐๐}, so:
- If ๐ is somewhere within the interval [0,ย ๐๐], then ๐ does not change (i.e. ๐๐ย =ย ๐๐).
- If ๐ exceeds the interval [0,ย ๐๐], then ๐ is increased by the amount that ๐ exceeds said interval, namely by ๐ย โย ๐๐.
- There are no other cases, as we already assumed ๐ to be non-negative.
In other words:
- If ๐ is negative, then ๐๐ย โย ๐๐ย โย ๐.
- If ๐ is non-negative, then ๐๐ย โย ๐๐ย +ย ๐๐บ๐{๐ย โย ๐๐,ย 0}.
๐๐บ๐โing the โ๐ย โย ๐๐โ with 0 covers the โ๐ is somewhere within the interval [0,ย ๐๐]โ case above, causing ๐ to be increased by 0 (i.e. to not change) in that case. Thus, we can refine how we split up the computation of โ๐๐ย +ย ๐๐บ๐{๐ย โย ๐๐,ย 0}โ above:
- If ๐ย โฅย ๐๐, then ๐๐บ๐{๐ย โย ๐๐,ย 0}ย =ย ๐ย โย ๐๐. Thus ๐๐ย +ย ๐๐บ๐{๐ย โย ๐๐,ย 0} becomes ๐๐ย +ย (๐ย โย ๐๐), which simplifies to just ๐.
- If ๐ย <ย ๐๐, then ๐๐บ๐{๐ย โย ๐๐,ย 0}ย =ย 0. Thus ๐๐ย +ย ๐๐บ๐{๐ย โย ๐๐,ย 0} becomes simply ๐๐.
So, we now have these three exhaustive cases:
- If ๐ is negative, then ๐๐ย โย ๐๐ย โย ๐.
- If ๐ is non-negative, then:
- If ๐ย โฅย ๐๐, then ๐๐ย โย ๐.
- If ๐ย <ย ๐๐, then ๐๐ย โย ๐๐.
โฆWhich is where โ๐๐บ๐{๐๐ย โย ๐,ย ๐,ย ๐๐}โ (max{d_a - b, b, d_a}
) comes from.
This brings us to the first implementation of ๐ฑ๐ก๐ข that I came up with that, you know, at the very least didnโt require doing anything โby hoofโ. Itโs an implementation in Python, with a function called pmf0
:
from fractions import Fraction as Q
def pmf0(k, n):
if k == 0:
return Q(1) if n == 0 else Q(0)
elif n < 1:
return Q(0)
return sum(pmf_inner0(k, n, 0))
def pmf_inner0(k, n, d):
denom = 2 * k + 1 - d
if n == 1:
yield Q(2, denom)
else:
for elem in range(d - (k - 1), k):
d_prime = max(d - elem, elem, d)
for f in pmf_inner0(k, n - 1, d_prime):
yield Q(1, denom) * f
pmf0
is entirely correct, and it calculates its result with unlimited precision (using Pythonโs built-in rational number arithmetic). It is, however, basically the worst implementation โ I continued refining it gradually to make something better.
As youโd expect, pmf0
(or rather, pmf_inner0
) is recursive, in order to implement that variably-sized (viz. the size of ๐) nested stack of sum
loops that we saw in the pseudocode above. The strategy here is to implement the inner recursive bit (pmf_inner0
) as a generator that yield
s the summands one by one, and then pmf0
does the job of sum
ming them all up with the built-in sum
function.
We also have special-casing for when ๐ย =ย 0, in which case the starting state (๐ฟย โย {0}) is already converged, so the probability of converging in exactly 0 steps is 1, and the probability of converging in exactly ๐ steps for any other value of ๐ is 0. On the other hand, if ๐ย โ ย 0 and yet ๐ย <ย 1, then we cannot possibly converge (as we need at least 1 step to get from ๐ฟย โย {0} to convergence), so we always return 0 in such cases.
Phase III: Can we go faster?
So, we accomplished something nice: we now have a computer program that will calculate ๐ฑ๐ก๐ขโก(๐)โก(๐) accurately (indeed, with unlimited precision) for any values of ๐ and ๐. But I still have three serious issues with what weโve got so far:
- This implementation is ludicrously slow. Using
pmf0
is not recommended unless you just want to laugh at how slow it is. - What I was dreaming of, in my mind, when starting on this problem, was coming up with a nice-looking formula โ the kind that you might see in a textbook on, say, probability theory or something. Weโre definitely not there yet.
- I also care about some other properties of the distribution. Knowing the PMF is certainly extremely important and effectively defines the distribution, but I also want to know things like the expectation, the variance, etc.
So, letโs tackle these three issues, roughly in the order stated above.
My next iteration on the Python implementation is called pmf1
, and it looks something like this:
def pmf1(k, n):
if k == 0:
return Q(1) if n == 0 else Q(0)
elif n < 1:
return Q(0)
return pmf_inner1(k, n, 0)
def pmf_inner1(k, n, d):
denom = 2 * k + 1 - d
if n == 1:
return Q(2, denom)
else:
return sum(
Q(1, denom) * pmf_inner1(k, n - 1, max(d - i, i, d))
for i in range(d - (k - 1), k)
)
As you can see, pmf1
basically looks like pmf0
except that it calls pmf_inner1
instead of pmf_inner0
, and it doesnโt sum
the results. The main thing that distinguishes pmf_inner1
from pmf_inner0
is that it does its own sum
mation, thus being an ordinary function (which just returns a single rational number) instead of a generator. This makes things more explicit. It also possibly removes some overhead that the generator approach might have, but really I wouldnโt expect it to be much faster โ the added explicitness is whatโs important.
That brings us to the next iteration, pmf2
, which is where we see the real performance gains:
def pmf2(k, n):
if k == 0:
return Q(1) if n == 0 else Q(0)
elif n < 1:
return Q(0)
return pmf_inner2(k, n, 0)
def pmf_inner2(k, n, d):
cache = {}
def f(n, d):
if (n, d) in cache:
return cache[(n, d)]
denom = 2 * k + 1 - d
if n == 1:
return Q(2, denom)
else:
r = sum(
Q(1, denom) * f(n - 1, max(d - i, i, d))
for i in range(d - (k - 1), k)
)
cache[(n, d)] = r
return r
return f(n, d)
Thatโs right, this is a job for dynamic programming! By memoising already-computed values of f(n, d)
(k
is a constant for a given invocation of pmf_inner2
, so we donโt need to store its values) in a dictionary called cache
, we avoid most of the recursion that we would otherwise perform, thus dramatically cutting down the computational complexity of pmf2
.
Oh, but we can do even better. As it turns out, weโre actually unnecessarily duplicating quite a bit of work. We did all of that reasoning above to arrive at โ๐๐ย โย ๐๐บ๐{๐๐ย โย ๐,ย ๐,ย ๐๐}โ. However, thereโs actually a similar, but easier, way to look at it. If ๐ฟย =ย {0,ย ๐} and we havenโt yet converged, then the next level added to ๐ฟ falls into exactly one of three cases: strictly less than 0, somewhere within [0,ย ๐], or strictly greater than ๐. Crucially, the number of possible levels in the first case is exactly the same as the number of possible levels in the third and final case โ viz. ๐ย โย ๐ โ and these levels are symmetrical; we get the same result if we consider adding โ1 to ๐ฟ as if we consider adding ๐ย +ย 1, the same result if we consider adding โ2 as if we consider adding ๐ย +ย 2, and so on. And adding any level to ๐ฟ that is within [0,ย ๐] yields the same result as adding any other level thatโs also within that interval, so we can just do the computation for one particular level, and then multiply the result by the number of possible levels within this interval, viz. ๐ย +ย 1.
Just to make things more concrete with an example, consider ๐ย =ย 8 and ๐ฟย โย {0,ย 3}. The set of possible levels that could be added to ๐ฟ in the next step are as follows (this is computed as โ8โก{0,ย 3}ย โฉย โค):
{โ5, โ4, โ3, โ2, โ1, 0, 1, 2, 3, 4, 5, 6, 7, 8}.
We can partition this set into the three cases that we have in mind:
{โ5,ย โ4,ย โ3,ย โ2,ย โ1} โช {0,ย 1,ย 2,ย 3} โช {4,ย 5,ย 6,ย 7,ย 8}.
As expected, the first and last sets in this partition both have cardinality ๐ย โย ๐, and the middle set has cardinality ๐ย +ย 1. In this example, the symmetry is that โ1ย &ย 4 are paired, โ2ย &ย 5 are paired, โ3ย &ย 6 are paired, โ4ย &ย 7 are paired, and โ5ย &ย 8 are paired. By โpairedโ, I of course mean that the result (i.e. the new value of ๐) is effectively the same, no matter which element of the pair is added to ๐ฟ in the next step. So, for example, instead of computing a probability under the assumption that โ1 is added to ๐ฟ in the next step, and then doing the same for 4, and then adding those two results; we can simply do it for just 4, and then double the result. Furthermore, instead of doing separate computations for every element of {0,ย 1,ย 2,ย 3} and then summing the results, we can just do it for 0, and then multiply that result by ๐ย +ย 1.
This gets us the next iteration of our little Python program, which I simply call pmf
:
def pmf(k, n):
if k == 0:
return Q(1) if n == 0 else Q(0)
elif n < 1:
return Q(0)
return pmf_inner(k, n, 0)
def pmf_inner(k, n, d):
cache = {}
def f(n, d):
if (n, d) in cache:
return cache[(n, d)]
denom = 2 * k + 1 - d
if n == 1:
return Q(2, denom)
else:
r = sum(Q(2, denom) * f(n - 1, i) for i in range(d + 1, k)) + Q(
d + 1, denom
) * f(n - 1, d)
cache[(n, d)] = r
return r
return f(n, d)
This was about as far as I could manage to refine this function, at least whilst still getting results with unlimited precision. We can go quite a bit faster by switching from Pythonโs fractions.Fraction
to its built-in float
datatype, which simultaneously limits our precision to a fixed number of bits (viz. 64 bits), and takes advantage of the fact that modern CPUs contain specialised hardware for working with IEEE 754 double-precision (a.k.a. binary64) floating point numbers. The result is pmf_float
:
def pmf_float(k, n):
if k == 0:
return 1.0 if n == 0 else 0.0
elif n < 1:
return 0.0
return pmf_inner_float(k, n, 0)
def pmf_inner_float(k, n, d):
cache = {}
def f(n, d):
if (n, d) in cache:
return cache[(n, d)]
denom = float(2 * k + 1 - d)
if n == 1:
return 2.0 / denom
else:
r = (
2.0 * sum(f(n - 1, i) for i in range(d + 1, k))
+ float(d + 1) * f(n - 1, d)
) / denom
cache[(n, d)] = r
return r
return f(n, d)
Note that this implementation sticks to integer arithmetic wherever possible, only resorting to floating-point arithmetic when absolutely necessary. pmf_float
is very usable and gives results of quite good accuracy, as thereโs nothing being done here that is particularly numerically unstable. So, it should be preferred over pmf
unless you actually need to know the exact rational number for some reason.
Okay, but, how fast is it, really? Oh yes, itโs time for some benchmarks! But only really crappy ad hoc ones because Iโm lazy and thereโs no reason to be obsessive about accuracy hereโฆ
But first, we need just one more implementation to benchmark: a Rust implementation of pmf_float
! I didnโt bother making a Rust implementation of pmf
because, again, I was too lazy, and I really just wanted to see how quickly I could compute ๐ฑ๐ก๐ข to an accuracy thatโs sufficient for any realistic application. The Rust implementation looks something like this:
use rustc_hash::FxHashMap as Map;
pub fn pmf(k: u32, n: u32) -> f64 {
if k == 0 {
if n == 0 {
1.0
} else {
0.0
}
} else if n < 1 {
0.0
} else {
pmf_inner(k, n, 0)
}
}
fn pmf_inner(k: u32, n: u32, d: u32) -> f64 {
fn f(cache: &mut Map<(u32, u32), f64>, k: u32, n: u32, d: u32) -> f64 {
if let Some(x) = cache.get(&(n, d)) {
return *x;
}
let denom = f64::from(2 * k - d + 1);
if n == 1 {
2.0 / denom
} else {
let sum: f64 = (d + 1..k).map(|i| f(cache, k, n - 1, i)).sum();
let r =
(2.0 * sum + f64::from(d + 1) * f(cache, k, n - 1, d)) / denom;
cache.insert((n, d), r);
r
}
}
let mut cache = Map::default();
f(&mut cache, k, n, d)
}
As you can see, this is pretty much exactly equivalent to how the Python pmf_float
function is written, with the slight exception that I donโt use the hash function from Rustโs standard library (std
), and instead depend on the rustc-hash
crate.[1]
So, uhm, letโs see the (crappy) benchmarks?:
Raw outputs
pmf2: 19.257901867997134
pmf: 7.60964249499375
pmf_float: 0.23313972300093155
pmf(20, 24) time: [50.477 ยตs 50.482 ยตs 50.487 ยตs]
Found 5 outliers among 100 measurements (5.00%)
3 (3.00%) low mild
2 (2.00%) high mild
language | function | time (ยตsโงธiter) | time (relative) |
---|---|---|---|
Python | pmf2 | 192โฏ579.02 | 3โฏ814.96ร |
Python | pmf | 76โฏ096.42 | 1โฏ507.46ร |
Python | pmf_float | 2โฏ331.40 | 46.18ร |
Rust | pmf | 50.48 | 1.00ร |
The implementations were benchmarked solely on their ability to compute ๐ฑ๐ก๐ขโก(20)โก(24). Youโll notice that I did not benchmark pmf0
nor pmf1
, because those implementations lack the memoisation required to run in a reasonable amount of time.
For Python, I used timeit
to do the benchmarking (see bench.py), and for Rust, I used Criterion.rs (see benches/pmf_benchmark.rs).
You can see the full listings for the Python implementations at rbc.py, and for the Rust implementation at src/lib.rs & src/tests.rs.
Footnotes for โPhase III: Can we go faster?โ
-
[โ] As of this writing, Rustโs standard library uses SipHash (specifically SipHash-1-3) by default. This is because SipHash makes use of an ARX structure that makes it highly resistant to hash flooding attacks, thus making it a good default choice if youโre not sure whether or not the user will need that security. However, we donโt care about that kind of security in this case, so we can just use rustcโ -โ hash to hash things stupidly quickly. If youโre wondering what rustcโ -โ hash actually looks like, it hashes a
usize
(which is a pointer-sized unsigned integer) at a time, like so (simplified under the assumption that youโre compiling for a 64-bit architecture):const K: usize = 0x517cc1b727220a95; fn add_to_hash(&mut self, i: usize) { self.hash = self.hash.rotate_left(5).bitxor(i).wrapping_mul(K); }
Note that
0x517cc1b727220a95
= 5โฏ871โฏ781โฏ006โฏ564โฏ002โฏ453 = 32ย โ ย 7ย โ ย 112ย โ ย 47ย โ ย 173ย โ ย 94โฏ732โฏ711โฏ681.
Phase IV: Where da formula @, doe?
I regret to inform you that I could not figure out a closed-form expression for ๐ฑ๐ก๐ข โ at least, not for a restrictive definition of โclosed-formโ. Boooooโฆ ๐ค๐ Embarrassingโฆโฆ
However, it is certainly possible to come up with a recursive โformulaโ modelled on the programmatic implementations that you see above. This is not a single expression, so itโs not a โclosed-form expressionโ, but it is a fixed number of expressions (really, just two expressions) that are, individually, technically closed-form โ at least, by Wikipediaโs definition of โclosed-formโ, which allows summation, so long as itโs not infinite. Basically, this is the best I got:
Raw LaTeX
\begin{align*}
\mathrm{RBC}(k, n) &\equiv f(k, n, 0) \\[1ex]
f(k, 1, d) &= \frac{2}{2k + 1 - d} \\[1ex]
f(k, n, d) &= \frac{1}{2k + 1 - d}\sum_{i = d - (k - 1)}^{k - 1}
f(k,\,n - 1,\,\max\{d - i, d, i\}) \\[1ex]
&= \frac{1}{2k + 1 - d}\left((d + 1)\,f(k,\,n - 1,\,d)
+ 2\sum_{i = d + 1}^{k - 1}
f(k,\,n - 1,\,i)\right)
\end{align*}
As you can see, there are two forms that you can choose from for the recursive bit of ๐: the one on top is modelled on pmf2
, and the one on bottom is modelled on pmf
. The pmf2
-based one actually looks a bit cleaner and less verbose, but of course, we know that itโs computationally just not as good. Plus, we avoid using ๐๐บ๐ at all in the pmf
-based one, which is nice.
So, that is (or rather, โthose areโ, as there are two versions) the closest thing that I have to a โnice formulaโ for ๐ฑ๐ก๐ข. If you are a better mathematician than I am (frankly, not a very high bar to exceedโฆ) and can figure out an even nicer formula, please please do let me know about it!
To get an idea of what ๐ฑ๐ก๐ข looks like, we can plot it! Check out this plot that focuses in on just the first 15 values of ๐ฑ๐ก๐ขโก(20):
(This, and other plots shown here, were created using matplotlib; see plot.py.)
Cool!! Iโve used cubic Hermite spline interpolation to show a smooth curve that fits all 15 points, but do remember that this is a PMF, not a PDF! There are really only 15 points (probability masses) here, and the smoothed curve is just there to guide your eyeballs.
To get a bigger picture than this, we can consider the first 100 values of ๐ instead of the first 15, and we can consider multiple values of ๐, as well:
๐คฉ Wowwwโฆ Pretty~
You can see that this distribution gets real flat-like for larger values of ๐. For ๐ย =ย 80, it nearly looks like a damn linear function with a mildly negative slope!
Phase V: How do I know what to expect?
So, we have a PMF. But we also want other nice things, like the expectation of that PMF. How in the damn hell am I supposed to figure that out?
Well, I decided that estimating the expectation was a good start. So I wrote a little Python function to do just that:
def estimate_expectation(k, epsilon):
e = 0.0
cumu_prob = 0.0
n = 1
while cumu_prob < 1.0 - epsilon:
prob = pmf_float(k, n)
old_cumu_prob = cumu_prob
cumu_prob += prob
e += prob * float(n)
n += 1
if old_cumu_prob == cumu_prob:
break
return e
The idea here is very simple: for a given value of ๐, we calculate ๐ฑ๐ก๐ขโก(๐)โก(1), ๐ฑ๐ก๐ขโก(๐)โก(2), ๐ฑ๐ก๐ขโก(๐)โก(3), and so on, keeping track of the sum of the probability masses that weโve calculated so far. For each such ๐ฑ๐ก๐ข calculation, we add ๐ย โ ย ๐ฑ๐ก๐ขโก(๐)โก(๐) to our running total, and then return that running total when the sum of the probability masses that weโve calculated so far is sufficiently close to 1.
What does โsufficiently closeโ mean? Well, that is determined by the value of epsilon
(ฮต); we just need to get to at least 1ย โย ฮต. I used sys.float_info.epsilon
for the value of ฮต, but the basic idea is that larger values of ฮต will give less accurate results more quickly, and vice versa.
By running estimate_expectation
for the first few values of ๐, I got something like this:
[
0.0,
1.4999999999999991,
2.2499999999999845,
2.9166666666666576,
3.541666666666626,
4.141666666666597,
4.724999999999947,
5.296428571428546,
5.858928571428415,
6.414484126984054,
6.964484126984016,
7.509938672438514,
8.051605339105192,
8.590066877566773,
9.125781163280985,
9.65911449661434,
10.190364496614361,
10.719776261320147,
11.247554039097999,
11.77386982857149,
12.298869828571545,
12.822679352381105,
13.345406625108346,
13.867145755543156,
14.387979088876463,
14.907979088876447,
15.427209858107194,
15.945728376625619,
16.46358551948275,
16.98082689879308,
17.497493565459727,
18.013622597717774,
18.52924759771753,
]
Naturally, these are just estimations, so we expect them to be a little bit off the mark โ specifically, a little bit lower than the true value. Nevertheless, this level of accuracy was enough to see some clear patterns. I decided to take the difference sequence of these estimations, essentially calculating estimations of ๐คโก[๐ฑ๐ก๐ขโก(2)]ย โย ๐คโก[๐ฑ๐ก๐ขโก(1)], ๐คโก[๐ฑ๐ก๐ขโก(3)]ย โย ๐คโก[๐ฑ๐ก๐ขโก(2)], ๐คโก[๐ฑ๐ก๐ขโก(4)]ย โย ๐คโก[๐ฑ๐ก๐ขโก(3)], and so on. I then went through this sequence, by hoof, and for each difference in the sequence, I wrote down which rational number it looked like it was estimating. For example, 1.4999999999999991
looks like itโs estimating 3โ2. The result was this:
3โ2, 3โ4, 2โ3, 5โ8, 3โ5, 7โ12, 4โ7, 9โ16, 5โ9, 11โ20, 6โ11, 13โ24, 7โ13, 15โ28, 8โ15, โฏ
As it turns out, the first element in this sequence (3โ2) is a red herring โ it results from the fact that weโve defined ๐ฑ๐ก๐ขโก(0)โก(๐) to be 0 for any value of ๐. By ignoring this first term, I was able to see that the terms in this sequence are just (๐ย +ย 1)โ(2โข๐), with the exception of the value for ๐ย =ย 1 being 3โ2 by fiat. It was this knowledge that I was able to come up with the expected value for ๐ฑ๐ก๐ข:
Raw LaTeX
\begin{align*}
N(k) &\sim \mathrm{RBC}(k) \\[1ex]
\mathrm{E}[N(1)] &= \frac{3}{2} \\[1ex]
\mathrm{E}[N(k)] &= \mathrm{E}[N(k - 1)] + \frac{k + 1}{2k}
\end{align*}
\begin{center}
\noindent\rule{14.2cm}{1pt}
\end{center}
\vspace*{-\baselineskip}
\begin{align*}
\mathrm{E}[N(k)] ={}& \frac{1}{2}\left(k + H_k + 1\right) \\[1ex]
\approx{}& \frac{1}{2}
\left(k + \ln(k) + \gamma + 1 +
\frac{1}{2k}\right) \\[1ex]
\simeq{}& \Theta(k)
\end{align*}
In this context, โ๐โก(๐)ย โผย ๐ฑ๐ก๐ขโก(๐)โ means that ๐โก(๐) is a random variable that has ๐ฑ๐ก๐ขโก(๐) as its distribution. Iโve called it โ๐โ because the concrete values that it takes on are values of ๐, i.e. exactly how many steps it took us to converge.
Below the horizontal line, I show an equivalent formula that apparently isnโt recursive, and thus looks like a closed-form expression for ๐คโก[๐โก(๐)]. However, ๐ป๐ denotes the ๐th harmonic number, which itself doesnโt have a closed-form definition in the strictest sense of โclosed-formโ โ thereโs just the usual โpartial sum of the first ๐ terms of the harmonic seriesโ, which may or may not be a closed form, depending on whom you ask. ๐
In any case, there is a really really (like, really) good approximation for ๐ป๐ that is most definitely closed-form, which I used to come up with the approximation that you see after the โโโ. The โ๐ ๐โ stands for โnatural logarithmโ, and ฮณ is the EulerโMascheroni constant (ฮณย โย 0.577โฏ215โฏ664โฏ901โฏ532โฏ9). Iโve also used big ฮ notation to express that ๐คโก[๐โก(๐)] asymptotically grows as ฮโก(๐) โ that is, linearly.
If you want to know just how good this approximation is, feast your eyes:
Sooooโฆ yeah. Itโs basically indistinguishable from the true value at ๐ย โฅย 10 or soโฆ If youโre wondering whatโs up with the random singularity that appears in the plot of the approximation, thatโs due to the 1โ(2โข๐) term, which comes from a truncation (yet another partial sum!) of the Taylor series representation.
Also in rbc.py, there are implementations for calculating the exact expectationโฆ:
def expectation(k):
if k == 0:
return Q(0)
e = Q(3, 2)
for i in range(2, k + 1):
e += Q(i + 1, 2 * i)
return e
โฆAnd the same thing, but using float
sโฆ:
def expectation_float(k):
if k == 0:
return 0.0
e = 1.5
for i in range(2, k + 1):
e += (i + 1.0) / (2.0 * i)
return e
โฆAs well as for the approximation:
from math import log
GAMMA_PLUS_1 = 1.577_215_664_901_532_8
def expectation_approx(k):
if k == 0:
return 0.0
k = float(k)
return 0.5 * (k + log(k) + GAMMA_PLUS_1 + 1.0 / (2.0 * k))
Phase VI: How do I know what not to expect?
Okay, so we know the expected value of ๐โก(๐), which is great. But what about other nice statistical properties? Likeโฆ
Variance
The variance? Once again Iโve no clue how I would go about finding this analytically, so weโre going to once again resort to estimations.
The variance can be expressed as simply the expectation of the square of the value, minus the square of the expectation of the value. Symbolically:
๐ต๐บ๐โก(๐โก(๐)) = ๐คโก[๐โก(๐)2]ย โย ๐คโก[๐โก(๐)]2.
Using this equation directly in computation โ when using floating-point arithmetic โ can be quite numerically unstable, but weโre going to do it anyways, because I donโt think that weโre going to have that issue (particularly, because it seems like ๐ต๐บ๐โก(๐โก(๐)) is quite large).
Thus, we can use the same strategy that we used for estimating the expectation to approximate ๐คโก[๐โก(๐)2]:
def estimate_expectation_sq(k, epsilon):
e = 0.0
cumu_prob = 0.0
n = 1
while cumu_prob < 1.0 - epsilon:
prob = pmf_float(k, n)
old_cumu_prob = cumu_prob
cumu_prob += prob
e += prob * float(n * n)
n += 1
if old_cumu_prob == cumu_prob:
break
return e
Then, itโs easy enough to straightforwardly compute an approximation of ๐คโก[๐โก(๐)2]ย โย ๐คโก[๐โก(๐)]2. I did this, and I unfortunately could not find an exact formula for the results. Instead, I fiddled around with some approximations โ using quadratic regression provided by NumPy as a starting point โ and ended up with this:
Raw LaTeX
\begin{align*}
\mathrm{Var}(N(k)) &\approx \frac{3k^2 + 10k - 4}{12} \\[1ex]
&\simeq \Theta{\left(k^2\right)}
\end{align*}
๐คท๐ฝโโ๏ธ Sure, why not. It looks nice. And, it seems to be a pretty solid approximation! The major takeaway here is that ๐ต๐บ๐โก(๐โก(๐)) is very much quadratic in ๐, which makes sense, as we expected the variance to be quite large on the basis of those plots of ๐ฑ๐ก๐ข that we saw above. Note that this is very comparable to the variance of the uniform distribution, which is just (๐ย โย ๐)2โ12, where [๐,ย ๐] is the support of the distribution. Whewโฆ thatโs really saying something!
Median, mode
What about the median? The mode? Both of these can be calculated. For the median, you can do something like the estimate_expectation
function above, except you donโt have to keep a running โexpectationโ (e
) total, and you simply stop as soon as the cumulative probability is 0.5 or greater. For the mode (which is just the peak of the distribution โ for a given value of ๐, this is the value of ๐ that maximises ๐ฑ๐ก๐ขโก(๐)โก(๐)), you can do something like:
def mode(k):
p = 0.0
n = 1
while True:
prob = pmf_float(k, n)
if prob > p:
p = prob
else:
break
n += 1
return n - 1
We know the shape of the distribution, so we just wait for it to start decreasing, and we know that weโve already seen the mode.
I didnโt go any further with these two measure of central tendency, as I was very much not confident in my ability to find some patternโฆ In any case, hereโs a table of some modes, if that makes you feel better:
mode | ๐ |
---|---|
0 | 0 |
1 | 1 |
2 | 5 |
3 | 13 |
4 | 26 |
5 | 45 |
6 | 70 |
7 | 103 |
8 | 142 |
9 | 188 |
10 | 242 |
The ๐ values here are the absolute lowest ๐ values for which the mode is as listed.
Entropy
What about the information entropy[1] of this distribution? If youโre not already familiar with the idea of entropy, the basic idea is that less likely outcomes carry more โsurpriseโ or โinformationโ with them. A simple example might be if you got a single bit sent to you (say, via telegraphy) every 24 hours that indicated whether or not it was raining in a particular locale. If that locale is considered to be arid, then most of the time, youโre going to get a โnot rainingโ message. Thus, getting a โnot rainingโ message gives you very little information; you are not very surprised. On the other hand, on the occasions where it does rain, and you get an โitโs rainingโ message, thatโs a lot of information & surprisal โ you wouldnโt have guessed that it was raining, had you not gotten that message. If we have an idea of the probability for each outcome (for โnot rainingโ and for โitโs rainingโ), then we can calculate the amount of information/surprisal that we get from a single telegraph message in general, without knowing the actual contents of the message โ itโs just the expectation of the surprisal (the surprisal of each outcome, weighted by the probability of that outcome, all summed together).
Whatโs the โinformation contentโ[1] of a particular outcome? Well, as it turns out, itโs basically just โ๐ ๐๐โก(๐), where ๐ is the probability of that outcome. If youโre wondering what in the gosh darned heck logarithms have to do with this, one thing that can help it make intuitive sense (if youโre like me and have computer brain) is thinking of bit strings: a bit string of length ๐ has 2๐ possible values, so to go the other way (from 2๐ to ๐), youโre going to have to undo that exponentiationโฆ with a logarithm. In particular, if we have โ for example โ a uniform distribution over all possible values of the bit string, then the probability of any one outcome is ๐ย =ย 2โ๐. Then, for any ๐, โ๐ ๐๐โก(๐)ย =ย โ๐ ๐๐โก(2โ๐)ย =ย ๐.
In any case, we can again use the same technique that we used for estimating ๐คโก[๐โก(๐)] and for estimating ๐ต๐บ๐โก(๐โก(๐)) to estimate ฮโก(๐โก(๐)). Note that the โฮโ in โฮโก(๐โก(๐))โ is an uppercase Greek letter โetaโ (U+397), not an uppercase Latin letter โ(h)aitchโ (U+48); the โฮโ stands for the initial โจeโฉ in the word entropy.
Hereโs a Python function that does just that:
import math
def estimate_entropy(k, epsilon):
eta = 0.0
cumu_prob = 0.0
n = 1
while cumu_prob < 1.0 - epsilon:
prob = pmf_float(k, n)
old_cumu_prob = cumu_prob
cumu_prob += prob
try:
eta += prob * math.log(prob)
except:
pass
n += 1
if old_cumu_prob == cumu_prob:
break
return -eta
I did something similar to what I did with ๐ต๐บ๐โก(๐โก(๐)), except this time, I had no automated regression to help me out. I probably could have written something by hand to assist in this procedure, but I honestly wasnโt sure quite what I was looking for. So instead, I literally just tried some approximations and visually looked (on a plot) at how they compared to the estimated values of ฮโก(๐โก(๐)), and then did adjustments by hoof as I saw fit. The result is this halfway-decent approximation:
Raw LaTeX
\begin{align*}
\mathrm{H}(N(k)) &\approx \frac{9(\ln(k + 1) + \ln\ln(k + 2)) + 4}{12} \\[1ex]
&\simeq \Theta(\log k)
\end{align*}
Soooโฆ yep. I was debating over the use of the โ+ย ๐ ๐โกย ๐ ๐โก(๐ย +ย 2)โ term, and ended up deciding to use it. Regardless of this term, the result is asymptotically logarithmic in ๐.
Letโs have a look at how good this approximation is (the blue line is the estimated values, and the chunky dashed reddish line is the approximation):
Eh. Itโs alright, I guess. ๐คท๐ฝโโ๏ธ Iโve got two Y-axes for ease of reading; the one on the left is in nat (nats), and the one on the right is in Sh (shannons; the information-theoretic analogue of the bit).
Footnotes for โEntropyโ
- [โ] See also: โInformation contentโ.
I donโt think that any of that is going to help us clear Pianusโฆ
So, there you have it. No one asked for any of it, but there it is. Youโre very welcome. ๐
If, for some godforsaken reason, this actually interested you (seems highly unlikely to me, but then again, what do I know about probability?), Iโll just note that thereโs more that can be done. ๐ฑ๐ก๐ข could use a better formula, maybe. An exact formula for ๐ต๐บ๐โก(๐โก(๐)) and/or ฮโก(๐โก(๐)) โ or better approximations โ would be nice. Maybe a formula for the median and/or modeโฆ Plus, you can consider other related distributions. For example, how is the value of ๐ distributed, given a value of both ๐ and ๐?
Have fun! :3
Habyte maketh no monke, ne vvearynge of gylte ลฟpurres maketh no knyght
With this 2022 AnniversaryโSummer event, I โ as usual โ took a look at the unique hairs, faces, & NX equipment offered by the new event. The way that it usually goes is:
Hmm, none of these faces look like MapleStory faces. I guess these are all from, like, MapleStory version three thousand or somethingโฆ? Thatโs too badโฆ To be honest, I donโt really like any of them.
None of these hairs really look good to me either. Well, I admit that that one seems like it could be pretty cute, but itโs really not my style. Plus, whatโs with all of the hairs looking super washed out? They all look white no matter what colour they areโฆ
At least some of these NX equips look neato. Maybe Iโll get a fewโฆ maybe do a few wardrobe changes on a character here or thereโฆ
Well, this event actually came with two faces that I did kinda like: the Charming Face and the Calm Lucid Face (specifically, I had in mind the female versions, although the male versions are only subtly different). And furthermore, this event came with a hair that I really like: the Balayage Hair (again, specifically the female version; in this case, the male version is completely different)! So, I thought hard about it, and I decided that I wanted to give my woodsmaster capreolina a makeover!! After a lot of fiddling around in maples.im and getting some useful opinions from Lv1Crook (Level1Crook, Lvl1Crook, xXCrookXx) and Harlez (VigiI), I came up with a look, and grinded out Summer Points to get itโฆ:
👀
Cute, right??? For this image, Iโve specifically chosen the Kandiva to match the red eyes, shirt, facepaint, and the kinda reddish highlights.
I had a kinda โleft overโ look from my experimentations that led to capreolinaโs new look, and I was really satisfied with how I felt about giving capreolina a new look. So, I was emboldened to do the same for my pure STR bishop cervidโฆ!:
YAY I LOOK ADORABLE LOL
I originally was going to use a different cape, and then realised that the one that I had used in maples.im (which I thought that I remembered checking for in-game) was only purchasable with donor pointsโฆ and I didnโt have any. Oh well. This cape is cool too.
I think that both of these looks do a pretty good job of preserving the original โvibesโ that I was going for with these two charactersโ outfits: cervid still has the dark โgothโ vibe, and capreolina still looks like she either lives in a tree, or could chop a tree down with her bare hands. However, in another sense, the tables have turned: capre now looks gayer than cervid, instead of the other way around!! In any case, these are my original two characters, so I think that it makes sense for them to both have the same hair style :P
Some other people have agreed that the new looks are cute:
However, the reception has not been entirely positiveโฆ:
Rest assured that I am the real deer, and I may dress myself however I please!! So there!!!
What hath Dunas wrought!
Over in Neo Tokyo, on my darksterity knight rusa, I did some Bergamoting with marksman xBowtjuhNL, paladin aWildLynux, and bishop Pebbytan:
O Dunasโฆ
Afterwards, I trioed Dโuna with them (minus aWildLynux):
This was rough. Dโuna is already a massive pain in the ass for me, as a melee attacker, thanks to his insistence on constantly dispelling (DPing) anyone within melee range, and frequently stunning, super-knockbacking, Darknessโing, etc.โฆ And zerking doesnโt exactly make it easier to survive DR consistentlyโฆ! But itโs also not so easy for bishops either, despite the fact that they are mages and are thus unaffected (barring STR mages & the like) by physical damage reflection (physical DR). Bishops have to get close enough to Dispel Dโunaโs pesky buffs (particularly the WDEF onesโฆ), and yet this puts them within Dโunaโs DP range, thus endangering them by possibly DPing their Magic Guard. As a result of all of this, our first run was a failure: both Pebby & I died. ๐ญ We did better on the second time around, though :]
Did you know that itโs possible to get up into the branches of the tree at the Forest of Oblivion (the map that connects the Mushroom Shrine to Kamuna)?:
Yep! And it requires โฅ123% JUMP โ and thus exactly 123% JUMP, as that is the maximum possible JUMP.
I also did Bergamot with bowmaster xRook (misandrist), who had never done Berga before! We were accompanied by fellow bowmaster OliverQueen and shadower Harlez (VigiI). During the beginning of the first body of the first run, we had xRook crouching and watching so that they would see what Bergamotโs DP animation actually looks like:
Getting DPโd is potentially a big hazard for xRook, as they rely on my HB to survive. Of course, any time that Bergamot does the laser DP animation, Iโm always watching very intently to see if anyone does get hit by the smol laser (the one that actually DPs), so that I can quickly rebuff them โ which often involves quickly mounting and running to the left, so that Iโm within HB distance of them after they get knocked back by the laser.
With that, xRook got the hang of it, and was able to survive the whole run, without the assistance of any bishops!:
I also duoed Berga, with xBowtjuhNLโฆ:
It actually wasnโt so bad; we didnโt have any serious hitches, and the run took around โ55 minutes in total. The drops didnโt quite make it worth it, but I guess they were okay โ Shield for WATK 70% ainโt half bad (we were both 5โงธ5 on cards already):
I also did a trio Nibelung run with xBowtjuhNL and Harlez. xBowtjuhNL said that this was going to be his last run before going to bed, and indeed, he fell asleep during the runโฆ:
Eventually, as you can see, I was able to wake him up by calling him on Discordโข! You can also see that I was having some pretty nasty networking issues at the timeโฆ
In the end, though, it was a great run! I levelled up to 173~!:
And Nib was kind enough to drop an EP for us!!:
Plus, I got my first Nib card :]
With the splits from selling the EP โ which I got from Harlez in pCoins, rather than in mesos โ I was able to make another Chaos Scroll, so that I could give another shot to that 4 WATK, 4 slot PGC that I made in the previous entryโฆ:
Oh. Well, thatโs a 0 WATK PGC. Maybe next timeโฆโฆ ๐ข
Swiftness is the essence of warfare
I have procrastinated long enough on getting a Silver Mane for my woodsmaster capreolina. The Silver Mane quest ainโt easy โ and it ainโt cheap, either โ but the only pre-requisites are being level โฅ120[1], and already having the Hog, both of which capreolina has satisfied forโฆ well, like 22 levels now. So, I headed out into the wilderness of the Minar Forest with fellow fourth-job archer Level1Crook (Lvl1Crook, xXCrookXx) to collect the 600 Rexton Leathers, 400 Wooden Shoulder Pads, and 400 Skull Shoulder Pads that we each (yes, thatโs 2โฏ800 big fat Leafre ETC items in total!) needed for our respective Silver Mane quests:
Oh, I died. How embarrassingโฆ
Well, I was already pretty close to levelling before we started, so it was no surprise when I levelled mid-grind!!:
After getting our 1.2k Rexton Leathers, we headed to The Hidden Dragon Tomb II for the rest of the ETCs:
I dyedโฆ againโฆ how embarrassingโฆ Now, Iโm not going to say exactly how many times that I died, but I will say that whoever runs the โCash Shopโ was very happy about their Safety Charm sales on that dayโฆ
Not gonna lie, 2.8k big Leafre ETCs is way too fucking many. We almost called it a day after we had just a few hundred ETCs left to go, but we agreed to just trudge on and get the entire damn thing over with. The result was that we both suffered mild brain rot that we may, or may not, ever recover from. But we did it. We got the things.
Of course, there was also the issue of getting 50M(!) mesos. For mine, I decided to just steal transfer 50M mesos off of my vicloc dagger spearwoman d34r. And with that, we were now officially Fast As Fuckโข:
NYOOOOOOOOOOOOMโฆ
Footnotes for โSwiftness is the essence of warfareโ
- [โ] As of this writing, the Silver Mane quest in MapleLegends is bugged, so this is not strictly true for the MapleLegends implementation. As far as I know, the MapleLegends implementation (again, as of this writing) has no level requirement at all, and is instead gated on the PCโs class: namely, they must be a fourth job class, or a beginner.
A clock; a cursรจd tree; a cryptid; a character of evil; a captain.
I did a few fun all-odd Capt. Lat runs on my darksterity knight rusa with STRginner extraordinaire Taima (Girlmoder, Boymoder, Nyanners, Tacgnol, Hanyou), helping to HB permabeginners Cortical (GishGallop, Medulla, SussyBaka, RedNurJazlan, CokeZeroPill) and uayua (2sus4u, hueso, shadowban, tb303):
A grandfather clock; a Guatemalan statue; a great foot; a god; a ghost pilot.
Welp, we had some deaths, but thatโs to be expected when youโre constantly on the brink of being two-shot by a stray ranged attack from the olโ Captain.
I also did a single duo Peppered Lattice run on my I/L archmagelet cervine, with marksman Level1Crook (Lanius, Lvl1Crook, xXCrookXx), in which I did someโฆ I donโt know, pretty okay CL damage? To Papal Lettuce? Next time, Iโll have to use some actual MATK potions ๐
That being said, the amount of AVOID that I get from being a magelet really helps when fighting bosses like this; I pretty much stayed on the top platforms for the entire run, so I never got dispelled (DPโd) even once!
We also teamed up with PoultryWoman (Yoshis, Furbs, SwordFurb) and SaviorSword to do some Ravs (with me as my pure STR bishop cervid)โฆ:
โฆAnd some Peppy Laters (with me as my woodsmaster capreolina):
And, with marksman xBowtjuhNL and shadower Harlez (VigiI), we did some fun Poppy Lanternsโฆ:
โฆDuring which, Taima took advantage of her @roll 100
for loot order being the highest byโฆ looting the one โโโdilkโโโ that dropped:
๐คท๐ฝโโ๏ธ
And, finally, when swashbuckler Yoshis hit the big level 115, it was time for her to try Ravving, too (featuring F/P archmage 2sus4u)!:
SMH my headโฆ I cannot believe thisโฆ How could Level1Crook do this to us?? He was our only MCP-openerโฆโฆ
On cervid, I also helped out xBowtjuhNL, OliverQueen, BeMyRavebae, BimboBucc, and atsumoo sell some AFK zhelms!:
It was a little weird not being the only active bishop in the party, but it worked out well enough:
I also did some Zakking on capreolina, where โ as usual โ I got some weird looks when I bust out the olโ PSB for the arms stage:
I also did anโฆ interesting Zak run on rusa, wherein we lost our bishop and bowmaster quite early on during the arms stage, leaving just me, Harlez, nightlord Peeko, and paladin SirAmikโฆ And then SirAmik died during the first body. So, yeah, it was kinda a trio Zak without even so much as SE:
So, yeah, that was pretty brutal. And by the end of it, only one third (โ ) of our original party remained ๐:
And still, no pouchโฆ ๐
Acer
Over on Maple Island, I was killing some Zombie Lupins on my islander ozotoceros:
โฆWait, what? Oh, right, thatโs BRPQ (Boss Rush Party Quest; BPQ), not Maple Island. Naturally, the Faust inside of BRPQ spawns zlupins just like the ordinary Faust does, so itโs a favourite pastime of islanders during every summer event to try to squeeze a Red Whip out of one of โemโฆ
Unfortunately, the Red Whip drop rate is not great, so I didnโt get any. :P But I did get a fancy Dark Pilfer! And some Cursed Dollsโฆ
In any case, I did amass enough anniversary points to purchase myself a fancy-dancy claw: the Sweet Fork Cake!!:
And, after a sufficient amount of prayer directed at the Maple gods, I was blessed with a set of Maple Throwing-Stars that I got from a Maple Equipment Box that I bought with more anniversary points!!!:
For those not aware, Maple Throwing-Stars are extremely valuable on Maple Island, which may come as a surprise given that they are virtually worthless elsewhere. The extra +6 WATK over what youโd get from Subis makes a huge difference (as your total WATK is naturally low anyways), and itโs not like thereโs any way to get Ilbis on Maple Islandโฆ
Oh, and I hit level 46~!!:
The raffle loves giving me useless equipment like Maple Guns, Maple Crows, and most of all: Red Bandanas.
๐คฆ๐ฝโโ๏ธ
And, of course, you know I finished all five weeks of DJ JMโs quests! So I walked away with my chair of choiceโฆ
Iโve seen some people in alliance chat hating on my homegirl Pia, and all I have to say is that Pia is the O.G. Pia has been in Henesys Park since before yโall were even old enough to operate a keyboard, so put some damn respect on her name!!
A squamate appears
As teased in the previous diary entry (which I started writing before I actually turned in the corresponding quest), I got the next Monster Book Ring tier for my darksterity knight rusa: T4!
And just in time for another Horntail run!:
Our nightlord Brokeen accidentally entered the main body map just a fraction of a second earlier than shadower Harlez (VigiI) โ our intended seduce target โ which is why you see him dead in the above image.
Unfortunately for me, I got what I asked for by screenshotting someone elseโs death, and a later screenshot caused my client to crash when all three heads were still alive. โน๏ธ
You can see corsair runYoShit in the screenshot above, which is notable because runYoShit was relying on my HB to survive, so my crashing meant that they didnโt get to continue the run either!! ๐ญ
In spite of this, I nevertheless signed up for the next run that xBowtjuhNL hosted, mistakenly believing (not without reason) that the screenshotting was what caused the crash โ after all, the crash happened instantaneously after hitting the Prt Sc key, which activates the clientโs built-in screenshotting ability (which regrettably cannot be unbound from said key):
So, true to my word, I did not screenshot. And, in spite of this, I crashed at the beginning of the main body fight anywaysโฆ:
This really took the wind out of my sails, as I was not only saddened that I couldnโt HT anymore (as I quite like HTing, and itโs good EXP to boot!), but was also ashamed that I had let down runYoShit on more than one occasion as a result of my own technical issues effectively killing them. I posted about these technical issues, and some related ones, that are associated with the 2022 AnniโSummer event patch โ and that I could confirm were not unique to me, i.e. I was getting multiple identical experiential reports from other players โ in a MapleLegends forum post entitled โClient crashing/stability/UB issues since AnniโSummer event patchโ. Unfortunately, w.r.t. the specifically crashing issues, the client appears to be unwilling to emit crash logs, so the reports that I give are purely phenomenological.
Iโve even desperately attempted to run the game client natively on Windowsโข via multi-booting, just for the purpose of not crashing in crash-y situations (read: Horntail). I know, I know, Bill G8s owns my computer now and probably knows all of my deepest darkest secrets, BUT I REALLY WANT TO HT, OKAYโฝโฝโฝ Plus, I always make sure to wash my hooves after using Windowsโข. :] Unfortunately, it seems that running the MapleLegends client on Windowsโข is even crashier than running it in Wine on Linux like I normally do โ I donโt even have to go into HT (nor do anything special, reallyโฆ) to crash! But Iโm not terribly surprised, considering that running ML in Wine has always had a much better track record โ for me, at least โ than itโs had for those running the game natively.
Dark doubts between the promise & event
As the 2022 AnniversaryโSummer Event has continued, I have continued BRPQing โ including a pair of BRPQs that I did on my vicloc dagger spearwoman d34r, alongside STRginner Cortical (GishGallop, Medulla, SussyBaka, RedNurJazlan, CokeZeroPill). We fought Franky, everyoneโs favourite PQ boss:
And I then realised my limitations when our party arrived at Manon, and I could scarcely hit the damn thing โ itโs 19 levels higher than me, and has 39 AVOID!:
Good thing Manon isnโt real and Leafre is a hoax perpetuated by so-called โfourth-jobbersโ, so I donโt have to worry about it outside of BRPQ.
On my darksterity knight rusa, I did enough BRPQs and Maple Syrupโd enough Maple Leaves to make a pretty awesome one!:
8 STR and 3 DEX on a face accessory? Yell heah!!
I also ended up doing a number of solo BRPQs, as a result of underwhelming traffic at BRPQ around the times that I was logging on. Like on my I/L archmagelet cervine:
And on my daggermit alces:
Unfortunately, both of them ended up wasting most of their points on failed Maple Leavesโฆ But cervine still has a usable clean one, so thatโs good, at least.
Oh, and I did manage to get just one (1) good raffle between my two viclockers: a Rabbit Lamp Chair!!:
Normally I really couldnโt care less about chairs (other than their value on the market when I rid myself of them), but I do enjoy getting chairs on my area-restricted characters, i.e. my islander and my viclockers. :]
็ปๅฏ
็ปๅฏ
An ongoing investigation into โโโโโโ โโโโโโโโโ by the name of โโโโโโโ found โโโโโโ โโโโโ โโโโ โโโโโโโโโโโโโโ in addition to โโโโโโโโโ activity near the 101st floor of the Eos Tower, where the subject was reported โโโโโโโโโ โโโโโโโโโโโโโโ โโโโ, although โโโโโโโโโ โโโโโโโโ:
โโโโโโโโโ โโโโโโโโโโ โโโ โโโโโโโโโ โโ and โ it is suspected โ โโโโโโโโโ โโโโโโ. โโโโโโ agents โโโโโโโโโ and โโโโโโโ were reported dead after โโโโ โโโโโโโโโ โโโโโโโโโโโโโ โโโ and attempting reconnaissance on the subject:
It should be reiterated that the subject is armed & dangerous, and only โโโโโโโโ agents with high levels of clearance, training, & expertise are allowed on this case. It should also be noted that โโโโโ โโโ โโโโโโโโโโโโโ โโโโ โโโโโโโโ, with โโโโโโโโ โโโโโโ and access to highly advanced technology, including โโโโโโโโ โโโโโ and โโโโ โโโโโโโโ โโ appears to be some kind of temporary invincibility, pending further investigation:
Concerningly, the subject โโโโโ โโโโโโโโ โโโโโโ โโโ, gaining access to the highly restricted โโโโโโโโโ โโโโโโ โโโโโโโโโโ Eos Tower spire, where โโโโโโ โโโโโโ killing โโโโโโโโ and โโโโโโโ (nippled clockwhale specimen):
CURRENT STATUS: TOP PRIORITY. ALL DISCRETIONARY RESOURCES ARE TO BE DEDICATED TO THIS CASE.
R. t. tarandus
As usual, itโs time for a quick episode of everyoneโs favourite series involving a pugilist whose name starts with a โจtโฉ and ends with an โจsโฉ: Questing With taraโข!
This time around, I headed to Singapore to do some more Capt. Latanica prequests, now that I was high enough level to continue them. I got the necessary Selkie Jr. and Slimy kills at MP3โฆ:
Questing With tara™
โฆAnd the necessary Mr. Anchor kills at GS6:
With that, I was now at the final part of the questline: to slay Capt. Latanica and bring back his Black Essence. Knowing that I wouldnโt have accuracy issues due to Capt. Latanicaโs abysmally low AVOID, I decided to just give it a shot and try soloing the Captain:
By the end of it, I had used three Unripe Onyx Apples and one Maple Music Fever, andโฆ well, I decided against wasting more resources & time getting irritated at how little damage I was doing to Lat. Could I have finished? Sure. Thereโs no timer, so I could just eat all of my resources and chip away endlessly at Lat until he finally submitted to my unending patience. But I wasnโt really willing to do this unless I brought some full-blown Onyx Apples with me, so I bailed. I figured that it would have been easier, given that Lat has โonlyโ 700 WDEF, but I neglected to take into account Capt. Latโs 37-level advantage. ๐คฆ๐ฝโโ๏ธ
Of course, not only was this a waste of the WATK potions that I used, but it was also a waste of the White Essence that I got from โThe great secret revealsโ. But that was okay, because I needed to do The Lost White Essence anyways โ for the question completion, if nothing else. So I grinded some more GS6 to finish that quest, and got my White Essence back.
The problem was that, now that Iโd done The Lost White Essence, actually successfully soloing Capt. Lat would give me enough EXP to level up! And how tf am I supposed to wear my INTelligent pyjamas if Iโm soloing the Captain?? ๐ So I decided on the โlevel up, and then come backโ route, even though it would mean soloing Capt. Lat at level 64 at the lowest, rather than 63 as I had originally planned (as 63 is the absolute minimum level to have the Capt. Lat kill quest active). So, itโs not as cool, but whatever. Iโm still slightly torn on whether or not I should come back with some Onyx Apples at level 64 to solo this big guy. Itโs kind of stupid & a waste, but โdoing conventionally inadvisable things because I canโ is kind of a running theme of my diary, so, you knowโฆ Let me know if you have an opinion on the matterโฆ
In any case, for the whole levelling-up thing, I decided to start on the MLG[1] quests. So I headed to Mu Lung to punch some STDs (not to be confused with STDs) for No Gongโs Teaching:
And, welp, that was enough for level 64~!:
I also took a detour to Victoria Island to do some Evil Eye slaying for the latter part (the 999-kill quest) of POLLUTED! โจ1โEvil Eyeโฉ, alongside Vicloc I/L gish VigiI (Harlez):
Tune in next time, to Questing With taraโข, to learn the fate of the captain known as โLatanicaโโฆโฆ
Footnotes for โR. t. tarandusโ
- [โ] โMu Lung Gardenโ, the region that encompasses all of the Mu Lung and Herb Town maps.
The Room of the Guilty
And, finally, I did one (1) OPQ. On my OPQ mule, sets. With DEXginner inhale (DexBlade, vork) and disgraced[1] former Viclocker Lanius (Level1Crook, Lvl1Crook, xXCrookXx). After a great deal of struggle within the Tower of the Goddess, we killed Father Pixel, with the assistance of an aubergine dressed as a human by the name of AgentPewPew:
The random folx (including AgentPewPew) that weโd partied with were obsessing over my damage and equipment (which is almost entirely borrowed from my other charactersโฆ), but I was more interested in inhaleโs damage (see โComparing DEXginnersโ in the previous diary entry).
In any case, it was in this PQ that inhale hit the big level 70~!! Oh โ and I also levelled up, to 67 :]
โฆAnd then I passed tf out. ๐๏ธ๐ด๐ค
Footnotes for โThe Room of the Guiltyโ
(โฆcnvpstdfโฆ)
cnvpstdf
This diary entry is dedicated to Pebbytan, who was permanently banned from MapleLegends without possibility of appeal, without having broken any of the rules of the server, merely through guilt by association. โค๏ธ