
do not ignore explicitly given mantissa width #868

Open · wants to merge 2 commits into base: main
Conversation

@mnieper (Contributor) commented Aug 28, 2024

This fixes the issue raised in #866, making mantissa widths meaningful.

@mflatt (Contributor) commented Sep 22, 2024

I haven't yet understood the function completely, but it looks like the rounder function needs to be a little different to defend against very large bit widths. With this patch, 1|100000000000000000000000000000 runs out of memory, but that reads as 1.0 in current Chez Scheme.

@mnieper (Contributor, Author) commented Sep 22, 2024

> I haven't yet understood the function completely, but it looks like the rounder function needs to be a little different to defend against very large bit widths. With this patch, 1|100000000000000000000000000000 runs out of memory, but that reads as 1.0 in current Chez Scheme.

Chez also runs out of memory with #e0.1|10000000000000. That is somewhat unavoidable given the definition of ...|...: that literal denotes the exact rational number that is the best binary floating-point approximation to 0.1 with 100...000 significant binary digits, and that exact number has a huge numerator and denominator. The only way out here would be to raise an &implementation-restriction for large mantissa widths. Maybe that is the best course of action, as mantissa widths in practice are much smaller than these huge numbers.

Your example, on the other hand, is an inexact number, so the final result won't take much space. The question is whether the calculation has to take that much space, especially in the less trivial case 0.1|1000000000000000000000. I would like to get the result of (inexact #e0.1|10000000000000000), which is the best 53-bit approximation to the best 10000000000000-bit approximation to 1/10.

@mflatt (Contributor) commented Sep 22, 2024

> Chez also runs out of memory with #e0.1|10000000000000.

Here's what I'm seeing:

Chez Scheme Version 10.1.0-pre-release.2
Copyright 1984-2024 Cisco Systems, Inc.

> #e0.1|10000000000000
1/10

I don't see why there's inherently a problem here. As long as the number written before the | has a tractable number of digits, any requested additional precision is just zeros, right?

@mnieper (Contributor, Author) commented Sep 23, 2024

(Please excuse the delayed answer; it was night here.)

Have you tried #e0.1|100...000 with the (most recent version of the) patch I provided here? For example, I get

> #e0.1|100
1014120480182583521197362564301/10141204801825835211973625643008

and much larger denominators for higher mantissa widths.

It would be wrong to truncate the precision. The reason is that 1/10 has a period of length 4 in binary representation. In fact,

1/10 = 0.00011001100110011001100...

in binary representation. From this, we can deduce that

#e0.1|1 = 0.001
#e0.1|2 = 0.00011
#e0.1|3 = 0.00011
#e0.1|4 = 0.0001101
#e0.1|5 = 0.0001101
...

in binary representation, where I rounded to even. This means that larger and larger denominators (all powers of two) are needed as the mantissa width grows.
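To make the rounding concrete, here is a minimal sketch (my own illustration, not the code in this patch) of the best p-bit binary approximation to an exact rational, relying on the fact that Scheme's round ties to even on exact numbers:

;; Best p-bit binary approximation to an exact rational q (p >= 1).
;; Scale q by a power of two so that |q|*2^e lies in [2^(p-1), 2^p),
;; round to the nearest integer (ties to even), and scale back.
(define (best-approx q p)
  (if (zero? q)
      0
      (let loop ((e 0))
        (let ((scaled (* (abs q) (expt 2 e))))
          (cond
            ((< scaled (expt 2 (- p 1))) (loop (+ e 1)))
            ((>= scaled (expt 2 p)) (loop (- e 1)))
            (else (/ (round (* q (expt 2 e))) (expt 2 e))))))))

(best-approx 1/10 1)   ; => 1/8, i.e. 0.001 in binary
(best-approx 1/10 4)   ; => 13/128, i.e. 0.0001101 in binary
(best-approx 1/10 100) ; => the fraction with denominator 2^103 shown above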

@mflatt (Contributor) commented Sep 23, 2024

Sorry, I misunderstood what you meant by "Chez also", and I was confused about fractions and binary representations. Thank you for the tutorial! It makes sense that #e0.1|10000000000000 runs out of memory.

It still seems like 0.1|10000000000000 should not run out of memory. Is it a matter of setting a ceiling on precision when working toward an inexact result, or is it more complex than that?

@mflatt (Contributor) commented Sep 23, 2024

Also, the results of #e1|10000000000000 and #e0.25|10000000000000 fit comfortably into memory, so it seems like they should be allowed, too. Is that a matter of detecting a power-of-two denominator, or are the cases when the result is representable more complicated to characterize?

@mnieper (Contributor, Author) commented Sep 23, 2024

> Sorry, I misunderstood what you meant by "Chez also"

Oh, I see. Indeed, what I wrote wasn't very clear.

> It still seems like 0.1|10000000000000 should not run out of memory. Is it a matter of setting a ceiling on precision when working toward an inexact result, or is it more complex than that?

Consider the following number in binary notation (where N* means to repeat the following binary digit N times):

0.1 52*0 1 N*0 1

Let x be its decimal representation.

If N is sufficiently larger than p, and q sufficiently larger than N, then x|p is 0.1 in binary while x|q is 0.1 51*0 1 in binary. In other words, we cannot simply truncate a huge mantissa width like q to some smaller one p without examining the value of x.
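Using the best-approx sketch from above, one can watch this double rounding go wrong even for a small, hypothetical N = 10 (again my illustration, not the patch's code): rounding x directly to 53 bits differs from first rounding it to an intermediate width and then re-rounding to 53 bits.

(define N 10)
;; x = 0.1 52*0 1 N*0 1 in binary
(define x (+ (expt 2 -1) (expt 2 -54) (expt 2 (- (+ 55 N)))))

(best-approx x 53)                   ; => 1/2 + 2^-53: the far-away 1 pushes the rounding up
(best-approx (best-approx x 54) 53)  ; => 1/2: the 54-bit step creates an exact tie,
                                     ;    and ties-to-even then loses the far-away 1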

@mnieper (Contributor, Author) commented Sep 23, 2024

> Also, the results of #e1|10000000000000 and #e0.25|10000000000000 fit comfortably into memory, so it seems like they should be allowed, too. Is that a matter of detecting a power-of-two denominator, or are the cases when the result is representable more complicated to characterize?

I do not have a full characterisation. But a denominator N means that the quotient has a period length of at most N - 1, so one should be able to reduce the case of an arbitrary mantissa width roughly to the case of a mantissa width <= 2*N.
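Concretely, the binary period length of 1/d is the multiplicative order of 2 modulo the odd part of d, which a small sketch can compute (my illustration; period-length is a hypothetical helper, and this brute-force loop is only sensible for small denominators):

;; Binary period of 1/d = multiplicative order of 2 modulo the
;; odd part of d; 0 means the expansion is finite.
(define (period-length d)
  (let ((d* (let loop ((d d)) (if (even? d) (loop (/ d 2)) d))))
    (if (= d* 1)
        0
        (let loop ((k 1) (r (mod 2 d*)))
          (if (= r 1) k (loop (+ k 1) (mod (* 2 r) d*)))))))

(period-length 10) ; => 4, matching the expansion of 1/10 above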

But I wonder whether it makes sense to spend the time getting the details right and writing extra code for huge mantissa widths. In practice, the largest mantissa widths may come from using libraries like GNU MPFR. Do you think someone would use floats that take megabytes of memory?

@mnieper (Contributor, Author) commented Sep 23, 2024

Maybe it is not that complicated to actually implement mantissa truncation.

  1. For exact numbers, this is only possible when the denominator is a power of two, because otherwise the binary representation has an infinite period. When the denominator is a power of two, we can truncate at the log2 of the denominator.
  2. For inexact numbers, we calculate the period length P from the denominator. We then add 53 and the log2 of the numerator, getting some number N. If the mantissa width p is > N + P, we can replace it with p - P.

The estimates may be off by 1 or 2, but one can code conservatively.
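Point 1 can be made concrete with a small sketch (my illustration, not this PR's code; clamp-exact-width is a hypothetical name), using the numerator's bit length, which bounds the needed width exactly:

;; In lowest terms, n/d has a finite binary expansion iff d is a power
;; of two; the numerator n is then odd, so the value is exactly
;; representable with (bitwise-length n) significant bits, and any
;; larger requested width p can be clamped to that.
(define (clamp-exact-width q p)
  (let ((n (abs (numerator q)))
        (d (denominator q)))
    (if (zero? (bitwise-and d (- d 1))) ; d a power of two?
        (min p (max 1 (bitwise-length n)))
        p)))                            ; infinite period: keep p

(clamp-exact-width 1/4 10000000000000) ; => 1, since 1/4 = 0.01 in binary
(clamp-exact-width 1/10 100)           ; => 100, since 1/10 has a period

Point 2 could reuse a period-length computation like the one sketched earlier for P, with 53 plus the numerator's bitwise-length as the bound N.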

@mflatt (Contributor) commented Sep 23, 2024

That sounds really great. As we've established, I'm not clear on the math, but it certainly sounds plausible.

My experience with Scheme numbers is that these details end up being worthwhile, even though it means extra code, and even though the happy cases often end up being complex (e.g., only power-of-two denominators). Unfortunately, my experience with Scheme numbers is also that I have to learn a lot of new things, and then I forget them soon afterward!

@mflatt pushed a commit: "Avoiding running out of memory for a very large precision request when the number with adjusted precision should take about as much memory as the number without an adjustment."
@mflatt (Contributor) commented Sep 29, 2024

Hi @mnieper, I pushed a commit to add precision bounding in (I think) the way you describe. Does it look right? Do I understand correctly that this captures all of the cases where the resulting number can be represented with about the same amount of memory as the number without a precision adjustment?

@mnieper (Contributor, Author) commented Sep 29, 2024

Thanks a lot, Matthew. I am going to take a look at it within the next few days. (I had wanted to come up with some code as well but haven't found the time.)
