# Probability that a < b^2?

Let's choose numbers "a" and "b" randomly and uniformly from the entire set of real numbers. What is the probability that a < b^2?

Update:

Duke:

If P(a < b^2) --> 1

then that would mean

P(a > b^2) --> 0

right?

Update 2:

Interesting discussion so far; I just find it so counterintuitive that P(a > b^2) could actually be zero.

Update 3:

Well, would you believe it? I ran a simple program in MATLAB which generated random Gaussian variables N(0, σ) for a and b.

My approximate results were

p = P(a < b^2) ≈ 0.7, when σ=1

p ≈ 0.9, when σ=10

p ≈ 0.97, when σ=100

p ≈ 0.999, when σ=1000

Interesting indeed.
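Those figures are easy to reproduce. A minimal Monte Carlo sketch in Python (rather than MATLAB; the trial count and seed are arbitrary choices):

```python
import random

def p_a_less_b_squared(sigma, trials=200_000, seed=1):
    """Estimate P(a < b^2) for independent a, b ~ N(0, sigma) by simulation."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials)
               if rng.gauss(0, sigma) < rng.gauss(0, sigma) ** 2)
    return hits / trials

for sigma in (1, 10, 100, 1000):
    print(sigma, round(p_a_less_b_squared(sigma), 3))
```

The estimates climb towards 1 as σ grows, matching the pattern reported above.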

Update 4:

scythian: "almost all the time" and "all the time" are two different things. P(a > b^2) = 0 means it cannot happen. But we know that whatever "b" is, we can always find a value of "a" greater than b^2. That's what I meant by 0 being unintuitive. Nevertheless, it does appear that more than one method is leading to zero as the answer.

Update 5:

OK Jered, but my point still stands. If you pick a random real from 0 to 10, the probability of picking an integer (or rational number) is zero BECAUSE these are point events which have no area under the f(x) curve.

This is NOT the case with a > b^2. For every b, there is a range of a that satisfies the inequality. In fact, for every range of b there is a range of a satisfying it. So it's not a point. It's a definite area with a non-zero probability. However, I agree that as the domain increases, the probability tends to zero.

Update 6:

The same ideas hold for scythian's arctan(a/b) sector. The ratio of the areas tends to zero, but the area itself is never zero. That's what I find unintuitive. I guess "unintuitive" is a subjective word.

Update 7:

But I guess that more or less clears up my doubts. In the case of picking a random real number between 0 and 10, P(integer) = 0 even though it's possible to pick an integer. Here P(a>b^2) --> 0 although it's still possible for it to happen.

Update 8:

Update 9:

Merlyn: The consensus so far, as well as from Remo's question, is that it all depends on how you choose the points on the infinite plane. My feelings are that the answer should not be different if the method is different, otherwise we have a paradox (call it by some other name if you wish).

Some of the answerers have chosen a, b in the domain (-N, N) and taken the limit as N --> ∞. They got 1. Now the other method is using a known infinite distribution and taking σ --> ∞. Again the result is 1.

Why is it a good idea? If nothing else, it confirms the answer using a different method.


Math guy: You could argue equally well that the answer must be more than 50%, because it will always work when a is negative, which is 50% of the time, and it will work for a non-zero proportion of the time when a is positive too.

Whitesox: I don't know where you get your upper limit of 0.875 from. You have four cases each with overall probability 0.25; you give the contributions from two as 0.25 (which could have been made much quicker by noting that obviously if a < 0 then a < b^2). The other two have contributions 0.125 + ???? and ????; I don't see any restriction on the values that would mean the total cannot be 1. You have no cases with a definite contribution where the contribution is less than the probability of the contributing region.

[EDIT: Remo Aviron - Your contention that choosing positive b at random is the same as choosing b^2 at random IS WRONG. Choosing b with a quasi-uniform distribution is not the same as choosing b^2 with a quasi-uniform distribution.]

[EDIT: Dr D: You said "P(a > b^2) = 0 means it cannot happen." You should know better than that! It doesn't mean it can't happen, it means that on a large number of trials the proportion in which it happens will tend to 0 as the number of trials tends to infinity. On a separate note, I don't have my sheet with me - did your figures for the Gaussian match what I posted below (asymptotically like 1 - k/√σ, where k ≈ 0.464)?]

As always, for this sort of thing, it depends on how precisely we intend to "choose numbers a and b randomly and uniformly from the entire set of real numbers", since there's no way to do that with the usual meaning of those terms. ;-) We'll have to take the limit of some finite and well-defined process.

To me, the question implies that we choose a and b in the same way and independently (not choosing b first and then deciding whether a is more or less than b^2, for example). There are many ways of doing this so that the limit process is something akin to what we're looking for, but I doubt very much that they'd give the same answer.

For instance, possibly the simplest case: suppose we pick a and b uniformly on [-x, x] where x > 1. Then the probability is

(1 / 4x^2) ∫(-√x to √x) (b^2+x) db + 2(x-√x) / 2x

= (1 / 4x^2) [b^3/3 + xb] [-√x to √x] + 1 - 1/√x

= 1 - 1 / (3√x)

and the limit is 1 as x -> ∞. This is expected since for any large range of magnitude, the squares occupy a much larger range. This doesn't apply to the unbounded range, of course, so this is not a compelling answer, at least for me.
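That closed form is easy to sanity-check by simulation. A minimal Monte Carlo sketch (trial count and seed are arbitrary):

```python
import math
import random

def mc_uniform_square(x, trials=200_000, seed=2):
    """Estimate P(a < b^2) for independent a, b uniform on [-x, x]."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials)
               if rng.uniform(-x, x) < rng.uniform(-x, x) ** 2)
    return hits / trials

for x in (4, 100, 10_000):
    print(x, round(mc_uniform_square(x), 4),
          round(1 - 1 / (3 * math.sqrt(x)), 4))  # simulated vs 1 - 1/(3*sqrt(x))
```

The simulated values track 1 - 1/(3√x) closely for each x.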

I think that, rather than taking a uniform distribution and making it infinite in the limit, it's more useful to take an infinite distribution and make it uniform in the limit. Unfortunately this is not likely to lead to manageable integrals.

If we choose a and b from a Gaussian distribution with mean 0 and standard deviation σ, we get

∫(-∞ to ∞) 1/(σ√π) e^(-b^2/σ^2) . ∫(-∞ to b^2) 1/(σ√π) e^(-a^2/σ^2) da db

and if you can tell me what the limit of that is as σ -> ∞ analytically, I'm impressed. ;-) But some numerical work indicates that this is asymptotically like 1 - k/√σ, where k ≈ 0.464. So by this approach we also get 1 as a limit.
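For what it's worth, here is one way to do that numerical work (a sketch, not the author's actual computation). Keeping the density convention above, the substitution b = σw collapses the double integral to a single integral that math.erf can handle with a simple midpoint rule; the last printed column estimates the constant k in 1 - k/√σ:

```python
import math

def p_less(sigma, wmax=8.0, n=20_000):
    """Numerically evaluate P(a < b^2) for a, b with density
    exp(-x^2/sigma^2) / (sigma*sqrt(pi)).  Substituting b = sigma*w reduces
    the double integral to
        1/2 + (1/sqrt(pi)) * integral_0^inf exp(-w^2) * erf(sigma*w^2) dw,
    evaluated here by the midpoint rule on [0, wmax]."""
    h = wmax / n
    total = 0.0
    for i in range(n):
        w = (i + 0.5) * h
        total += math.exp(-w * w) * math.erf(sigma * w * w)
    return 0.5 + total * h / math.sqrt(math.pi)

for sigma in (1, 10, 100, 1000):
    p = p_less(sigma)
    print(sigma, round(p, 4), round((1 - p) * math.sqrt(sigma), 3))
```

If I've done the reduction right, the constant settles nearer Γ(3/4)/π ≈ 0.390 under this density convention (replace e^(-w²) by 1 in the integral to get the leading order), so the k ≈ 0.464 quoted may reflect a fit at moderate σ; either way the limit is clearly 1.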

I don't have time to work out any other variants at present, but I feel comfortable supporting an answer of 1, as long as we treat a and b equally and independently. I think it would take a very bizarre finite process to have anything other than 1 in the limit under those conditions.

Of course, if we do not treat a and b equally and independently, all bets are off; you can pretty much rig it to get any answer you like without those restrictions.

[EDIT: I wanted to make this more precise, and it's a good followup to ksoileau as well.]

- ksoileau: if you allow an arbitrary pdf you can probably make the answer anything you like. Uniformity is an important condition here, even if not yet well-defined.

I would propose this:

Let {f_n: n ∈ Z} be a sequence of pdfs defined on the real line with the following properties:

- For any a, b ∈ R there is an N in Z such that f_n(x) > 0 for all n > N and all x ∈ [a, b]. (Thus, there is no part of the real line that is not covered by the pdf with non-zero density eventually.)

- For any a, b ∈ R, lim (n->∞) f_n(a) / f_n(b) = 1. (In other words, the distribution is what you might call asymptotically uniform.) Note that the prior condition is a useful one for building this, in particular knowing that for n large enough f_n(b) ≠ 0.

- For any x ∈ R, lim (n->∞) f_n(x) exists. (And must equal 0 by the second condition, else we can prove f_n is not a valid pdf for some large enough n.) I think this condition is probably redundant, but I thought I'd chuck it in to be on the safe side.

I conjecture that under these restrictions, if we let P(n) be the probability that real numbers a and b chosen independently according to the pdf f_n satisfy a < b^2, we will have lim (n->∞) P(n) = 1.

I'm not going to try to prove it. I do have a day job, after all! But it would be interesting to see if any other restrictions are required - continuity, for instance; I think the uniformity condition should get around that. (Interesting problem: can we prove that the given conditions require f_n to be continuous?) Can we prove that the third condition is redundant? Should I post a separate question (or series of questions) to ask these? ;-)
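As a small empirical check of the conjecture (not a proof), here is one family satisfying the three conditions: Laplace densities f_n(x) = e^(-|x|/n)/(2n). For fixed a and b, f_n(a)/f_n(b) = e^((|b|-|a|)/n) → 1, so the family is asymptotically uniform in the sense above. A Monte Carlo sketch (trial count and seed are arbitrary):

```python
import math
import random

def laplace(rng, scale):
    """Sample from the Laplace density exp(-|x|/scale) / (2*scale):
    an exponential magnitude with a random sign."""
    mag = -scale * math.log(1.0 - rng.random())  # 1-U avoids log(0)
    return mag if rng.random() < 0.5 else -mag

def p_n(n, trials=200_000, seed=3):
    """Estimate P(a < b^2) for independent a, b drawn from f_n."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials)
               if laplace(rng, n) < laplace(rng, n) ** 2)
    return hits / trials

for n in (1, 10, 100, 1000):
    print(n, round(p_n(n), 3))
```

The estimated P(n) climbs towards 1 as n grows, consistent with the conjecture for this particular family.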

• I suspect my argument is not going to be well received. I say the probability is 50%

Let a ∈ ℝ and b ∈ ℝ, and randomly select the values for a and b.

As already noted, for a ≤ 0, P(a < b²) = 1; this is trivial. Only slightly less trivial is the idea that P(a ≤ 0) = 1/2, and thus

P( a < b² | a ≤ 0) = 1

and

P( a < b² ) ≥ 1/2

Now consider what happens when a > 0.

For a > 0, while it is easy to show there is a non-zero probability for any finite b, in the limit the probability is zero.

a < b² is equivalent to saying 0 < a < b² (remember, we are only looking at a > 0).

This is a finite interval on an infinite line. The probability that a is an element of this interval is zero.

P( a < b² | a > 0) = 0

As such we have a total probability

P( a < b² )

= P( a < b² | a ≤ 0) * P(a ≤ 0) + P( a < b² | a > 0) * P(a > 0)

= 1 * 1/2 + 0 * 1/2

= 1/2

Remember, this is because of the infinite sets. No matter what type of interval you draw on paper or on a computer, you will find a finite probability that appears to approach 1. But this is due to the finite random number generators on the computer, and if this question were asked with finite values there would be a solution greater than 50%.

I don't mean to be condescending, but please explain: why is using the Gaussian to approximate a uniform distribution a good idea?

Aren't infinite numbers fun? Cantor went mad working with them! :)

................................

0.75

If a is negative, b² will always be greater. So you can establish the odds at 1.0 for 50% of the cases.

For the moment, lets assume a and b are positive.

Ok, for 0 < a < 10, 0 < b < 10, the odds of a < b² are

= 1 - 2/(3√10) ≈ 0.789

(the simpler estimate 1 - 1/√10 ≈ 0.683 ignores the area under the parabola for b < √10).

Now for 0 < a < 100, 0 < b < 100 the odds of a < b² are

= 1 - 2/(3√100) = 1 - 1/15 ≈ 0.933

And you show that this quickly heads towards a probability of 1.
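A quick simulation of this positive-quadrant setup (a sketch; for a, b uniform on [0, N] the exact value works out to 1 - 2/(3√N), since the area under the parabola for b < √N contributes a term beyond the simpler estimate 1 - 1/√N):

```python
import math
import random

def p_pos(N, trials=200_000, seed=4):
    """Estimate P(a < b^2) for independent a, b uniform on [0, N]."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials)
               if rng.uniform(0, N) < rng.uniform(0, N) ** 2)
    return hits / trials

for N in (10, 100, 10_000):
    print(N, round(p_pos(N), 3),
          round(1 - 2 / (3 * math.sqrt(N)), 3))  # simulated vs exact
```

Either way, the probability heads towards 1 as N grows.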

But is this the best way to count? No!

Remember the axiom that ∞² = ∞

So let's take any positive number b² (forget that square sign, it just messes you up: a and b are independent, and likewise a and b² are independent) and any positive number a. What is the probability that a - b² > 0? Ahh, 0.5.

So the overall odds are .75.

I realize that this is completely counterintuitive, but it is the right result.

.....................

Let me reiterate: this is a how-you-count problem, which seems to be the consensus.

If you indeed have a random, uniform distribution, then a, b and b² are de facto infinite -- which is how I (and several others) derived my answer.

But if you presume some other distribution, such as a geometric or exponential decay or a Gaussian distribution about 0, the results vary. But remember, for 0 < a, b² < 1, the probability favors a > b².

*I note that ksoileau's distribution pattern has the odds of 0 to 1 being the same as the odds of 1 to ∞, and uses the geometric pattern to determine a and b advocated by most of the answerers to this question.

• There is a class of probability problems that call for random sampling of points on the infinite space, and many of them can actually give you different answers depending on how the sampling is done. For example, picking points from a square or a circle, and then letting either one increase infinitely in size in the limiting case. However, in this case, the answer is 0 regardless of what shape we begin with, and letting it increase infinitely large. To understand this "intuitively", consider the parabola y = x² from 0 < x < 1. Then the probability that a > b² is just the area above the parabola within the unit square. If we consider the parabola where 0 < x < k, as k increases above 1, we can rescale the parabola by dividing both x and y by k, so that within the unit square, the area above the parabola becomes that shrinking portion close to where x = 0. It does not matter if we deal with the unit square or the unit circle, or work the probabilities in terms of angular sectors, the fact is, when viewed "from infinity", the parabola becomes just a thin, vanishing sliver, and the probability drops to 0.

Ultimately, the problem is moot, because it depends on how the points are sampled from the infinite plane of ordered pairs (a,b). It's just that for most common-sense means of sampling, you end up with 0 probability that a > b².

Addendum: To make this a little easier to see, let's say that 0 < a,b < 1000. Then almost all of the time a < b². Why should this surprise anybody?

Addendum 2: Let me help you with an "intuitive" view. Consider an ordered pair (a,b) on the infinite plane. We'll ASSUME that the angle it makes, ArcTan(a/b), is uniformly random from 0 to π/2 (for reals only). Consider any such angle, and consider a sector with an infinitely small angular separation, originating from (0,0) and radiating outwards to infinity. The parabola y = x² will intersect this sector. All the points (a,b) in this sector between (0,0) and the parabola will satisfy the inequality a > b², while all those beyond it will satisfy the inequality a < b². It's not hard to "see" that for any angle ArcTan(a/b), the ratio of the areas is 0.

• Thanks to Math Guy I was able to see the original question (I repeat the link for convenience):

The original question is Problem #50 from the book Frederick Mosteller, Fifty Challenging Problems in Probability, Addison-Wesley, 1965 (in fact there are 56 problems in this excellent book), and the answer given is 1, the same as for this question.

The solution in the book is to pick a random point (a, b), uniformly distributed in a square large enough, with center at the origin and side 2x (-x ≤ a, b ≤ x), exactly like what Scarlet Manuka has written above:

"For instance, possibly the simplest case: suppose we pick a and b uniformly on [-x, x] where x > 1. Then the probability is

(1 / 4x^2) ∫(-√x to √x) (b^2+x) db + 2(x-√x) / 2x

= (1 / 4x^2) [b^3/3 + xb] [-√x to √x] + 1 - 1/√x

= 1 - 1 / (3√x)

and the limit is 1 as x -> ∞".

The area ratio of the parabolic segment to that of the square is 1/(3√x) indeed, and the result follows easily.

When I read this book for the first time some 35 years ago, I was curious to see what happens if the random point is picked in a circle with radius R, centered at the origin, instead of a square. Some integration work, followed by the limit as R→∞, yielded the same probability 1. It's intuitively clear that the angle between the two rays connecting the origin with the intersection points decreases as R→∞ (the parabola "shrinks" to its axis, being the only non-degenerate conic with one "infinite point"), so the ratio of the common area of the parabola and the circle to the area of the circle tends to 0.
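The circle version can also be checked without redoing the integration: sample (b, a) uniformly in the disc of radius R by rejection from the bounding square (a Monte Carlo sketch; trial count and seed are arbitrary):

```python
import random

def p_disc(R, trials=200_000, seed=5):
    """Estimate P(a < b^2) for (b, a) uniform in the disc of radius R."""
    rng = random.Random(seed)
    hits = total = 0
    while total < trials:
        b = rng.uniform(-R, R)
        a = rng.uniform(-R, R)
        if b * b + a * a <= R * R:  # keep only points inside the disc
            total += 1
            hits += a < b * b
    return hits / trials

for R in (10, 1000):
    print(R, round(p_disc(R), 3))
```

As with the square, the probability approaches 1 as R grows.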

EDIT (after having read the additional details: in the book there is a picture like this:

http://farm3.static.flickr.com/2062/2321734114_86b...

The random point (b, a) to be uniformly distributed in the square means its density is a constant in the square [-x, x; -x, x] and 0 otherwise. This interpretation leads to the necessity to find the limit of the ratio of both areas as Scarlet Manuka has done above. As far as Your additional question is concerned:

P(a < b²) = 1 - P( not (a < b²)) = 1 - P(a ≥ b²) =

= 1 - P(a > b²), since P(a = b²) = 0 (the parabola itself is a point set with measure 0).

EDIT 2: Dr D, why do You find P(a > b²) = 0 counter intuitive? Think of the ratio of the "areas" of the domain enclosed by the parabola and of the entire plane (of course You understand what I mean by the quotation marks; the thing can be put in a 100% precise manner). Think of the parabola as a boundary case of an ellipse when one of its foci moves infinitely far up the a-axis in the picture, or, using the "infinite elements" approach of analytic geometry, both branches of the parabola tend to "join together in the only infinite point". Presented in this manner, the parabolic domain has "infinite dimension" in one direction only, unlike the entire plane, which is like a circle expanding infinitely in all directions, as, let us recall, they say about our Universe after the Big Bang.

FINAL EDIT: Math Guy is right: we need a consensus on how to understand "randomly and uniformly from the entire (!) set of real numbers". In every source, in every book with similar problems I have read, the general approach has been the following: a random variable (possibly multidimensional), uniformly distributed in a sufficiently large domain, is considered, then the domain is expanded infinitely. I have NEVER encountered another approach like Scarlet Manuka's suggestion above: "I think that, rather than taking a uniform distribution and making it infinite in the limit, it's more useful to take an infinite distribution and make it uniform in the limit." If we agree on the former, standard approach (well, I find it the most natural of all; other approaches have significant drawbacks) and take a square [-x, x; -x, x], then the joint density of the random point (b, a) will be:

F(v, u) = 1/(4x²) if (b, a) belongs to [-x, x; -x, x], or 0 otherwise.

If a is uniformly distributed in [-x, x], its density would be

f(u) = 1/(2x) if -x ≤ a ≤ x, or 0 otherwise.

Similarly for b uniformly distributed in [-x, x]:

g(v) = 1/(2x) if -x ≤ b ≤ x, or 0 otherwise.

By the way F(v, u) = g(v)*f(u), so a and b are independent.

Then in [-x, x; -x, x] we'll have /G - gray domain on the picture/:

P(a > b²) = ∫∫ [G] F(v, u) dv du = 1/(4x²) ∫∫ [G] dv du =

= Area(G)/(4x²) = (4/3) x√x / (4x²) = 1/(3√x),

and, IF WE HAVE CONSENSUS on the subject, for a and b, randomly and uniformly chosen from the ENTIRE set of real numbers:

P(a > b²) = 0.

• I assume that this is to follow up on the question

I stand by my answer that it is 50%: no matter what b^2 is, there are an equal number of choices for a such that a < b^2 as there are for a > b^2, albeit an infinite number of each.

Once you start to restrict the set of real numbers (as you did in your answer), to some set [-n, n] then the probability changes to the respective lengths of the segments [-n, b^2] and [b^2, n].

Food for thought.

EDIT

I've been looking at the probability by choosing b, and then seeing what happens with a, but, while I don't see anything wrong with that approach, the opposite approach leads me to have doubt.

If we choose a, and then look for when b^2 > a, then we find that there are bounds on what b could be.

If a<0, then a<b^2 100% of the time.

If a>0, then for b^2>a, b must not be between -sqrt(a) and sqrt(a). This excludes only a limited section of the number line, so the set of b resulting in a<b^2 looks large, which makes the probability of a<b^2 look rather high. However, based on cardinality, one section of the number line is just as infinite as the entire thing, so it could decrease down towards 50%. But that is only for this case where a>0. When combined with the other case (a<0), if we average the two cases (they are equally likely to occur, right?), then we would get a probability of 75%.

I cannot, however, accept this answer as definitive, because I don't see any flaw with the logic that I first presented. This second bit of logic seems to get bogged down in the details, and makes some rather large jumps, while the first is quite simple (too simple?) and straightforward. If anyone could present a complete argument as to why the answer is anything other than 50%, I would be interested in putting my mind to rest about this problem!

whitesox, you have a very compelling argument.

Edit (again)

I think that scarlet has the key to this. There is no such thing as a uniform distribution on the entire real number line, and so you cannot randomly choose a real number without first choosing what is meant by randomly choosing one. Based on that definition, you can then compute various differing probabilities.

Another comment:

Yes, it is counter-intuitive that P(a>b^2) = 0, because we can see that it is NOT 0. However, the LIMIT of P(a>b^2), as a and b are randomly chosen on the closed interval [-n, n] as n --> infinity, is 0. It isn't zero in any of these closed intervals, but as the intervals get larger, the probability decreases until it is negligible.

So the P(a<b^2) is ENTIRELY dependent upon how you define "randomly and uniformly from the entire set of real numbers" because, as someone else pointed out below, that is not a possible thing to do. (After all, if it were possible, then we could logically deduce that P(a<b^2) = 50%, 75%, and a list of other things, as you can see here.)

GREAT QUESTION!!!

• This problem has no unambiguous solution. We can plot A, B, C in a 3D space, and let a surface be defined where B = 2√(AC). If we limited the range of A, B, C, such as 0 < A, B, C < k, where k is some constant, then we'd have a cubical volume divided into two parts by this surface, the volume under 2√(AC) being 8/9 k³, so that the odds that B > 2√(AC) would be 1/9. However, we would get a different result if we insisted that √(A²+B²+C²) = r for some constant r. There is no "shape to the range of A, B, C" at infinity, so this problem has no answer without some constraint on the range of A, B, C, and that answer would depend on the kind of constraint imposed. Addendum: A problem similar to this is the classic: Three points are picked at random on a plane. What's the probability that they form the vertices of an obtuse triangle? Lewis Carroll suggested a solution, but his solution was later repudiated, and a different answer was offered. It's one of those very rare and interesting cases where a mathematical finding moves into the realm of philosophy and not unambiguous fact. See link for an interesting paper on this, which should be relevant to your problem.
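Assuming the intended surface is B = 2√(AC) (the discriminant boundary B² = 4AC), the 8/9 k³ figure is the uncapped volume under the surface, i.e. k² times the average height E[2√(AC)] = 8k/9. A quick Monte Carlo sketch of that average (note the surface rises above the cube where 2√(AC) > k, so the actual in-cube split differs somewhat from 8/9 vs 1/9):

```python
import math
import random

def avg_height(k, trials=200_000, seed=6):
    """Monte Carlo estimate of E[2*sqrt(A*C)] for A, C uniform on [0, k].
    Multiplied by k^2 this is the uncapped volume under B = 2*sqrt(A*C)."""
    rng = random.Random(seed)
    total = sum(2 * math.sqrt(rng.uniform(0, k) * rng.uniform(0, k))
                for _ in range(trials))
    return total / trials

k = 1.0
print(round(avg_height(k), 3), round(8 * k / 9, 3))  # estimate vs 8k/9
```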

• Anonymous

It's possible to randomly choose a real number, but impossible to do it uniformly, if uniformly means this:

"The probability that the number chosen lies in an interval of length L is proportional to L."

However, it's easy to randomly choose a real number. One way is to take the tangent of Pi/2*(2*U-1), where U is a uniform random variable taking values in (0,1).

Assuming a and b are independent and distributed in this way, I get exactly 3/4 as the probability that a<b^2.
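That 3/4 can be checked by simulation: tan(Pi/2*(2U-1)) is the inverse-CDF transform of a standard Cauchy variable, and P(a < b^2) = 1/2 + E[arctan(b²)]/π, where the symmetry θ ↦ π/2 - θ gives E[arctan(b²)] = π/4 exactly. A Monte Carlo sketch (trial count and seed are arbitrary):

```python
import math
import random

def cauchy(rng):
    """Standard Cauchy variable via the map tan(pi/2 * (2U - 1))."""
    return math.tan(math.pi / 2 * (2 * rng.random() - 1))

def p_cauchy(trials=400_000, seed=7):
    """Estimate P(a < b^2) for independent standard Cauchy a, b."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials) if cauchy(rng) < cauchy(rng) ** 2)
    return hits / trials

print(round(p_cauchy(), 3))  # the claimed exact value is 3/4
```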

jeredwm: I was responding to the OP, pointing out that it's important to carefully define what is meant by choosing numbers "randomly and uniformly". How would you define "uniformly" in this case?

In the general case,

P(a < b^2) = ∫(y = -∞ to ∞) ∫(x = -∞ to y^2) f(x,y) dx dy

where f is the joint density of (a, b).

• I'm going to have to think about this one a little more, but here's what I have so far:

Case 1:

given: a negative, b negative

P(given) = 0.25

P(a < b | given) = 0.5

P(a < b^2 | a<b & given) = 1

Total contribution: 0.25 * 0.5 * 1 = 0.125

P(a > b | given) = 0.5

P(a < b^2 | a>b & given) = 1

Total contribution: 0.25 * 0.5 * 1 = 0.125

Contribution from case 1: 0.25

Case 2:

given: a positive, b positive

P(given) = 0.25

P(a < b | given) = 0.5

P(a < b^2 | a<b & given) = 1

Total contribution: 0.25 * 0.5 * 1 = 0.125

P(a > b | given) = 0.5

P(a < b^2 | a>b & given) = ????

Total contribution: ????

Contribution from case 2: 0.125 + ????

Case 3:

given: a positive, b negative

P(given) = 0.25

P(a < b | given) = 0

Total contribution: 0

P(a > b | given) = 1

P(a < b^2 | a > b & given) = ????

Total contribution: ????

Contribution from case 3: ????

Case 4:

given: a negative, b positive

P(given) = 0.25

P(a < b | given) = 1

P(a < b^2 | a < b & given) = 1

Total contribution: 0.25

P(a > b | given) = 0

Total contribution: 0

Contribution from case 4: 0.25

So far we know that:

0.625 < P(b^2 > a) < 0.875

I'd make an educated guess and say it's somewhere around the middle of that range (incorporating the thought that the missing probabilities are most likely equal to each other) - so about 0.75.
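For a finite range the missing "????" entries can at least be estimated; a Monte Carlo sketch with a, b uniform on [-n, n] (the point being that these contributions depend on n, which is the heart of the problem):

```python
import random

def case_contributions(n, trials=400_000, seed=8):
    """Split P(a < b^2), for independent a, b uniform on [-n, n], into
    the four sign cases used above."""
    rng = random.Random(seed)
    contrib = {"a<0,b<0": 0, "a>0,b>0": 0, "a>0,b<0": 0, "a<0,b>0": 0}
    for _ in range(trials):
        a, b = rng.uniform(-n, n), rng.uniform(-n, n)
        key = ("a<0" if a < 0 else "a>0") + "," + ("b<0" if b < 0 else "b>0")
        contrib[key] += a < b * b  # True counts as 1
    return {k: v / trials for k, v in contrib.items()}

for n in (2, 100):
    print(n, {k: round(v, 3) for k, v in case_contributions(n).items()})
```

The a < 0 cases each contribute 0.25 regardless of n, while the a > 0 contributions creep up towards 0.25 as n grows, pushing the total towards 1.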

------------------

Edit: I've changed the cases around a bit.

------------------

Edit: Scarlet: Yeah....it was really late. It should say:

0.625 < P(b^2 > a) ≤ 1

But now that people are saying you can't choose from the real number line uniformly (which makes sense, given some thought), I'd probably change my answer to "I have no fricken idea".

------------------

Edit: Dr. D, you have a quite a following. I'm really surprised that this question drew more attention than the question that spawned it. I find that other question quite fascinating.

• See my response in

It's utterly nonsensical to speak of a uniform random variable over the entire real line, so the answer may be anything you please.

§

** RESPONSE to Ksoileau: **

I'm not sure whom you were addressing, but notice that the original question states "uniformly," my answer says "uniform," and, so far as I know, every other answer discussed or implied uniformity as well...

*** FOLLOW-UP to Ksoileau: ***

Ah, okay... Which is why I said what I said above. :-)

*** COMMENT to Dr. D: ***

Probability of 0 does NOT mean "never happens." It means the integral of the PDF over the subset in question is 0.

For example, if x is randomly and uniformly distributed over [0,1], the probability of choosing a rational is 0; in fact, the probability of choosing any algebraic number is 0. This doesn't mean, of course, that it would "never" happen...