Sorry about not saying anything for a while. Been distracted with work and AoC.
Source: Gwayne
But I would like to see the maths behind the "expected number of hits before n hits in a row" formula. I've been trying to work on something similar and I get different results: at least for n=2 and a=0.5 I get 4 hits as Expectancy value for the time until 2 hits in a row. Your formula gives 6 hits.
Either I'm doing something wrong, or you are including the 2 hits in a row in your number, or I'm taking the "expected" a bit too literally as the mathematical Expectancy Value, while you mean slightly different.
Could you give your derivation of that formula please? |
I am including the two hits—it's the "expected number of swings until you have just been hit for the nth time in a row", more or less.
The derivation:
T_H = 1 + (1-a) 0 + (a) T_H
T_H is our basic reaction time metric: The expected time until we're hit. Here it's expressed in terms of a recurrence. The number of swings until we're hit is one (the next swing), which can either be a hit with (1-a) chance (which means zero more swings are needed), or a miss with (a) chance (which means T_H more swings are needed.)
This recurrence technique is commonly used for algorithm analysis in computer science, because many algorithms may be described in terms of recursion, which this models pretty well. In this case, we're saying "In each recursive step we do one unit of 'work'" (this is the "1"), and then we describe the probability of the recursive sub-steps and their values.
So we take the above, and simplify out the zero term:
T_H = 1 + a T_H
And do some other algebra to get T_H alone on one side:
T_H - a T_H = 1
T_H = 1 / (1 - a)
And now we've derived the basic T_H metric using a recurrence, just to show how the technique works. Once I got this, I of course realized "Duh! Of course, the time between events of probability p is 1/p." But it's nice to show with a very simple example that this analysis technique works.
So, on to the real show:
T_B(1) = T_H
by definition: The time until we get a burst of length one is the same as the time until we're hit.
T_B(2) = T_H + 1 + (1 - a) 0 + a T_B(2)
The time until being hit by a burst of size two is: The time to get hit the first time (T_H), plus one more swing (1) which may either be a hit (zero more swings needed with probability (1-a)) or a miss (we have to start all over again and add in T_B(2) with probability (a)).
By the same steps as before (after swapping the position of 1 and T_H for simplicity):
T_B(2) = 1 + T_H + a T_B(2)
T_B(2) - a T_B(2) = 1 + T_H
T_B(2) = (1 + T_H) / (1 - a)
Substituting in the value of T_H:
T_B(2) = (1 + 1 / (1 - a)) / (1 - a)
T_B(2) = 1 / (1 - a) + 1 / (1 - a)^2
These last two can be written instead:
T_B(2) = (1 + (1-a)^-1) * (1-a)^-1
T_B(2) = (1-a)^-1 + (1-a)^-2
At this point you may already have an inkling of where this is going. Let us make it explicit and look at T_B(3):
T_B(3) = 1 + T_B(2) + a T_B(3)
First, we have to get a burst of length two. Then we have one more swing. If it's a hit, we're done. If it's a miss (with probability a), we need to reset and start over again. Analysis is identical to T_B(2), mostly:
T_B(3) - a T_B(3) = 1 + T_B(2)
T_B(3) = (1 + T_B(2)) * (1 - a)^-1
T_B(3) = (1 + (1-a)^-1 + (1-a)^-2) * (1-a)^-1
T_B(3) = (1-a)^-1 + (1-a)^-2 + (1-a)^-3
And now we definitely see the pattern. Somebody with more math chops than I have could probably just write:
T_B(0) = 1
T_B(n) = (1 + T_B(n-1)) * (1-a)^-1
and go straight to the conclusion using something clever. Anyway, our pattern is this:
T_B(n) = \Sum_{k=1}^n r^k
where r is (1/1-a). This is a geometric series. There is a general formula for finite sums over such series (ASSUMING R IS NOT EQUAL TO ONE. Which in our case means that this equality does not hold if avoidance is zero):
\Sum_{k=0}^m r^k = (1 - r^(m+1)) / (1 - m)
In order to get our series into this form, we need to do this:
T_B(n) = r \Sum_{k=0}^{n-1} r^k
r^0 is always 1, so this is r*1 + r*r + r*r^2, or r + r^2 + r^3, which is what we want. Substituting in {n-1} for m:
T_B(n) = r \Sum_{k=0}^{n-1} r^k = r (1 - r^({n-1}+1)) / (1 - {n-1})
simplifying:
T_B(n) = r (1 - r^n) / (1 - r)
... lots of messy stuff left out as I tried to simplify it ...
T_B(n) = (1 - (1 / 1-a)n) (a / (1-a)^2)
This formula was quite unsatisfying to me, but I couldn't come up with any way to simplify it further. Or at least, I couldn't until I went further and tried to derive T_H(n) from it.
T_H(n) = 1 + a (T_B(n-1) + T_H(n))
That is to say, the time until we've just been hit for the nth time in a row, given that we've just been hit for n-1 times in a row, is equal to 1 swing, which may either be a hit (we're done), or a miss (probability a), in which case we must first be hit by a burst of size (n-1) and then hit another T_H(n) times. (i.e. Either it's one swing and we're hit and we're done, or it's one swing and we're missed and then we must be hit n-1 times from a state of being missed before we can add in T_H(n) again, since T_H(n) assumes we've just now been hit n times in a row.)
If we run through this, we get to:
T_H(n) = (1 + a T_B(n-1)) / (1 - a)
through exactly the same manipulations above. This doesn't simplify very well. However, when you plug numbers into it, you notice that it behaves exactly like 1 / (1-a)^n. And then (if you're me), you realize that you should've known that in the first place. OF COURSE, T_H(n) = 1 / (1-a)^n. That should have been obvious!
But that means we have two different formulas for T_H(n), and one of those is very very simple, and the other one is very very simple when given in terms of T_B(n-1). That means it gives a second way to approach T_B(n), one which allows us to simplify much more easily:
T_H(n) = 1 / (1 - a)^n = (1 + a T_B(n-1)) / (1 - a)
1/(1-a)^n = 1/(1 - a) * (1 + a T_B(n-1))
divide both sides by 1/(1-a):
1/(1-a)^(n-1) = 1 + a T_B(n-1)
flip it around:
1 + a T_B(n-1) = 1/(1-a)^(n-1)
Subtract one from both sides:
a T_B(n-1) = 1/(1-a)^(n-1) - 1
Divide both sides by a:
T_B(n-1) = (1/a) (1/(1-a)^(n-1) - 1)
And replace n-1 with n on both sides:
T_B(n) = (1/a) (1/(1-a)^n - 1)
And this is exactly the formula I gave above (and it does give the same answers as the more messy formula for T_B(n) I derived above, as well, unless I mis-stated something.)
All together, a terribly messy and slipshod piece of work. But it results in a fairly elegant little formula, so I am happy.