I'm proud to announce that the PEP for Decimal Data Type is now published
under the python.org structure:

http://www.python.org/peps/pep-0327.html

This wouldn't have been possible without the help from Alex Martelli, Aahz,
Tim Peters, David Goodger and c.l.p itself.

After the pre-PEP roundups the features are almost established. There is no
agreement yet on how to create a Decimal from a float, in both explicit and
implicit constructions.

I depend on settling that to finish the test cases and actually start to work
on the code.

I'll appreciate any feedback. Thank you all in advance.

. Facundo


•  at Jan 30, 2004 at 3:06 pm ⇧

Facundo Batista writes:
I'm proud to announce that the PEP for Decimal Data Type is now published
http://www.python.org/peps/pep-0327.html
VERY nice work here.

Here's my 2 cents:

(1) You propose conversion from floats via:
Decimal(1.1, 2) == Decimal('1.1')
Decimal(1.1, 16) == Decimal('1.1000000000000001')
Decimal(1.1) == Decimal('110000000000000008881784197001252...e-51')

I think that we'd do even better to omit the second use. People who
really want to convert floats exactly can easily write "Decimal(1.1, 60)". But
hardly anyone wants to convert floats exactly, while lots of newbies would
forget to include the second parameter. I'd say just make Decimal(someFloat)
raise a TypeError with a helpful message about how you need that second
parameter when using floats.
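
For reference, a minimal sketch of the conversions discussed in (1), written against the `decimal` module as it behaves in current Python 3 rather than the draft API in the PEP. The helper name `decimal_from_float` and its `digits` parameter are hypothetical stand-ins for the proposed `Decimal(1.1, 2)` spelling, and reading the second argument as "significant digits" is an assumption.

```python
from decimal import Context, Decimal


def decimal_from_float(value, digits):
    """Hypothetical helper: convert a float, rounded to `digits` significant digits."""
    # Decimal(float) converts exactly in current Python; the context then
    # rounds that exact value to the requested precision.
    return Context(prec=digits).create_decimal(Decimal(value))


exact = Decimal(1.1)                    # every bit of the binary float is preserved
short = decimal_from_float(1.1, 2)      # rounded to 2 significant digits
longer = decimal_from_float(1.1, 17)    # rounded to 17 significant digits

print(exact)
print(short, longer)
```

The exact conversion is the surprising one for newcomers, which is the argument above for requiring the second parameter.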

(2) For adding a Decimal and a float, you write:
I propose to allow the interaction with float, making an exact conversion and
raising ValueError if exceeds the precision in the current context (this is
maybe too tricky, because for example with a precision of 9, Decimal(35) + 1.2
is OK but Decimal(35) + 1.1 raises an error).

I suppose that would be all right, but I think I'm with Aahz on this
one... require explicit conversion. It prevents newbie errors, and non-newbies
can provide the functionality extremely easily. Also, we can always change our
minds to allow addition with floats if we initially release with that raising
an exception. But if we ever release a version of Python where Decimal and
float can be added, we'll be stuck supporting it forever.
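
As a point of comparison, the `decimal` module in current Python 3 takes the explicit-conversion route argued for here. A short sketch of what that looks like in practice (the variable names are only illustrative):

```python
from decimal import Decimal

amount = Decimal(35)

try:
    amount + 1.2                      # mixed Decimal/float arithmetic
    mixed_allowed = True
except TypeError:
    mixed_allowed = False

# The explicit route: the caller decides how the float becomes a Decimal.
explicit_sum = amount + Decimal("1.2")

print(mixed_allowed, explicit_sum)
```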

Really, that's all I came up with. This is great, and I'm looking forward to
using it. I would, though, be interested in a couple more syntax-related
details:
(a) What's the syntax for changing the context? I'd think we'd want
a "pushDecimalContext()" and "popDecimalContext()" sort of approach, since most
well-behaved routines will want to restore their caller's context.
(b) How about querying to determine a thread's current context? I don't
have any use cases, but it would seem peculiar not to provide it.
(c) Given a Decimal object, is there a straightforward way to determine its
coefficient and exponent? Methods named .precision() and .exponent() might do
the trick.
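
One way questions (a) to (c) can be answered with the `decimal` module in current Python 3, shown as a sketch rather than as what the PEP specified at the time of this thread. `localcontext()` plays the role of the push/pop pair, `getcontext()` queries the thread's context, and `as_tuple()` exposes sign, coefficient digits and exponent.

```python
from decimal import Decimal, getcontext, localcontext

outer_precision = getcontext().prec        # (b) query the current thread's context

with localcontext() as ctx:                # (a) temporary context, restored on exit
    ctx.prec = 5
    third = Decimal(1) / Decimal(3)

restored_precision = getcontext().prec

# (c) sign, coefficient digits and exponent of a value
sign, digits, exponent = Decimal("1.80").as_tuple()

print(third, restored_precision == outer_precision, digits, exponent)
```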

-- Michael Chermside
•  at Jan 30, 2004 at 4:03 pm ⇧

On Fri, 30 Jan 2004 09:49:05 -0300, "Batista, Facundo" wrote:

I'll appreciate any feedback. Thank you all in advance.
My concern is that many people will use a decimal type just because it
is there, without any consideration of whether they actually need it.

95% of the time or more, all you need to do to represent money is to
use an integer and select appropriate units (pence rather than pounds,
cents rather than dollars, etc) so that the decimal point is just a
presentation issue when the value is printed/displayed but is never
needed in the internal representation.
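
A minimal sketch of the integer-units approach described above, assuming a currency with 100 minor units per major unit. The function name `format_cents` is made up for illustration.

```python
def format_cents(cents):
    """Render an integer number of cents as a dollars-and-cents string."""
    sign = "-" if cents < 0 else ""
    dollars, remainder = divmod(abs(cents), 100)
    return f"{sign}{dollars}.{remainder:02d}"


balance = 180            # stored as an integer number of cents
balance += 25            # arithmetic stays exact integer arithmetic

print(format_cents(balance))
```

The decimal point only appears in `format_cents`, so it never touches the stored value.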

That said, there are cases where a decimal type would be genuinely
useful. Given that, my only comment on the PEP is that a decimal
literal might be a good idea - identical to float literals but with a
'D' appended, for instance.

I wouldn't mention it now, seeing it as an issue for after the library
itself has matured and been proven, except for the issue of implicit
conversions. Having a decimal literal would add another class of
errors with implicit conversions - it would be very easy to forget the
'D' on the end of a literal, and to get an imprecise float implicitly
converted to decimal rather than the precise decimal literal that was
intended.

I don't know what the solution should be, but I do think it needs to
be considered.

--
Steve Horne

steve at ninereeds dot fsnet dot co dot uk
•  at Jan 30, 2004 at 6:12 pm ⇧

On Fri, 30 Jan 2004 16:03:55 +0000, Stephen Horne wrote: [snip]
That said, there are cases where a decimal type would be genuinely
useful. Given that, my only comment on the PEP is that a decimal
literal might be a good idea - identical to float literals but with a
'D' appended, for instance.
Or, maybe if money is being represented, appending a '$'?

*ducks*

Just-couldn't-resist-ly yours,

--
Christopher
•  at Jan 31, 2004 at 3:01 am ⇧

Stephen Horne wrote:
I don't know what the solution should be, but I do think it needs to
be considered.
The C and C++ people have agreed. The next standards for those
languages, whenever they come out, are supposed to include decimal
floating point as a standard data type. The number of decimal
places required is also profuse, something up around 25-30 places,
more than current hardware, eg IBM mainframes, supports.

If python adds decimal data, it probably ought to be consistent with C
and C++. Otherwise, the C and C++ guys will have a dreadful time
writing emulation code to run on computers built to support python.

Al
•  at Jan 31, 2004 at 3:23 am ⇧

If python adds decimal data, it probably ought to be consistent with C
and C++. Otherwise, the C and C++ guys will have a dreadful time
writing emulation code to run on computers built to support python.
Now that's a "Python will take over the world" statement if I ever heard
one. But seriously, processor manufacturers build processors and
compilers for Fortran, C, and C++. If a manufacturer starts paying
attention to where Python is going (for things other than scripting
their build-process), I'm sure Guido would like to know.

- Josiah
•  at Feb 5, 2004 at 2:09 pm ⇧

In article <401B1A87.639CD971 at easystreet.com>, wrote:
If python adds decimal data, it probably ought to be consistent with C
and C++. Otherwise, the C and C++ guys will have a dreadful time
writing emulation code to run on computers built to support python.
Read the PEP; Python's proposed decimal type is based on the existing
decimal standard. If C/C++ *don't* follow the standard, that's their
problem. BTW, Java uses the standard.
--
Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death." --GvR
•  at Feb 5, 2004 at 2:16 pm ⇧
In article <6ltk10h30riel0lghd18t5unjco2g26spi at 4ax.com>,
Stephen Horne wrote:
On Fri, 30 Jan 2004 09:49:05 -0300, "Batista, Facundo"
wrote:
I'll appreciate any feedback. Thank you all in advance.
My concern is that many people will use a decimal type just because it
is there, without any consideration of whether they actually need it.

95% of the time or more, all you need to do to represent money is to
use an integer and select appropriate units (pence rather than pounds,
cents rather than dollars, etc) so that the decimal point is just a
presentation issue when the value is printed/displayed but is never
needed in the internal representation.
The problem lies precisely in that representation. For starters, a
binary integer is O(n^2) for conversion to decimal printing. Then
there's the question about multi-currency conversions, or interest
rates, or ....
--
Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/
•  at Feb 6, 2004 at 2:49 am ⇧

On 5 Feb 2004 09:16:51 -0500, aahz at pythoncraft.com (Aahz) wrote:
In article <6ltk10h30riel0lghd18t5unjco2g26spi at 4ax.com>,
Stephen Horne wrote:
On Fri, 30 Jan 2004 09:49:05 -0300, "Batista, Facundo"
wrote:
I'll appreciate any feedback. Thank you all in advance.
My concern is that many people will use a decimal type just because it
is there, without any consideration of whether they actually need it.

95% of the time or more, all you need to do to represent money is to
use an integer and select appropriate units (pence rather than pounds,
cents rather than dollars, etc) so that the decimal point is just a
presentation issue when the value is printed/displayed but is never
needed in the internal representation.
The problem lies precisely in that representation. For starters, a
binary integer is O(n^2) for conversion to decimal printing.
In practice, there is an upper limit to the size of number that occurs
in any financial use, and of course we are not talking about tens of
digits let alone hundreds, meaning that the conversion is most
sensibly treated as O(1) for each number converted.

Anyway, speeding up the presentation of results makes little sense if
you slow down all the arithmetic operations to do it.
Then
there's the question about multi-currency conversions, or interest
rates, or ....
Admittedly needing better than penny precision, but still fixed
precision (ie suiting an integer representation with an implicit scale
factor) and the results are rounded.

I work with a company that writes accounting software. We don't need
to worry about currency conversions, but we do need to worry about
interest and other cases where fractional pennies seem to be implied
(rates for taxes, allowances etc) and basically the fractional pennies
are never really an issue - you do have to be careful with the
rounding rules, but that applies whatever representation you use.

--
Steve Horne

steve at ninereeds dot fsnet dot co dot uk
•  at Feb 6, 2004 at 6:56 pm ⇧

On 5 Feb 2004 09:16:51 -0500, aahz at pythoncraft.com (Aahz) wrote:
In article <6ltk10h30riel0lghd18t5unjco2g26spi at 4ax.com>,
Stephen Horne wrote:
On Fri, 30 Jan 2004 09:49:05 -0300, "Batista, Facundo"
wrote:
I'll appreciate any feedback. Thank you all in advance.
My concern is that many people will use a decimal type just because it
is there, without any consideration of whether they actually need it.

95% of the time or more, all you need to do to represent money is to
use an integer and select appropriate units (pence rather than pounds,
cents rather than dollars, etc) so that the decimal point is just a
presentation issue when the value is printed/displayed but is never
needed in the internal representation.
The problem lies precisely in that representation. For starters, a
binary integer is O(n^2) for conversion to decimal printing. Then

Regards,
Bengt Richter
•  at Feb 6, 2004 at 7:50 pm ⇧

On 5 Feb 2004 09:16:51 -0500, aahz at pythoncraft.com (Aahz) wrote:
The problem lies precisely in that representation. For starters, a
binary integer is O(n^2) for conversion to decimal printing. Then
On Fri, Feb 06, 2004 at 06:56:03PM +0000, Bengt Richter wrote:
"n" is the number of digits in the number, in this case.

A standard way to convert to base 10 looks like this:
def base10(i):
    digits = []
    while i:
        i, b = divmod(i, 10)
        digits.append(b)
    digits.reverse()
    return digits
Each divmod() takes from O(n) down to O(1) (O(log i) for each successive
value of i), and the loop runs n times (i is shortened by one digit each
time). This is a typical n^2 algorithm, much like bubble sort where the
outer loop runs n times and an inner loop runs 1-to-n times.

Jeff
•  at Jan 30, 2004 at 6:21 pm ⇧
Stephen Horne wrote:

#- My concern is that many people will use a decimal type just
#- because it
#- is there, without any consideration of whether they actually need it.

Speed considerations are raised. You'll *never* get the performance of using
floats or ints (unless you have a coprocessor that handles this).

#- I don't know what the solution should be, but I do think it needs to
#- be considered.

(In my dreams) I want "float" to be decimal. Always. No more binary.
Maybe in ten years the machines will be as fast as is needed to make this
possible. Or it'll be implemented in hardware.

Anyway, until then I'm happy having decimal floating point as a module.

. Facundo
•  at Jan 30, 2004 at 8:30 pm ⇧

(In my dreams) I want "float" to be decimal. Always. No more binary.
Maybe in ten years the machines will be as fast as is needed to make this
possible. Or it'll be implemented in hardware.

Anyway, until then I'm happy having decimal floating point as a module.

In my dreams, data is optimally represented in base e, and every number
is represented with a roughly equivalent amount of fudge-factor (except
for linear combinations of the powers of e).

Heh, thankfully my dreams haven't come to fruition.

While decimal storage is useful for people and money, it is arbitrarily
limiting. Perhaps a generalized BaseN module is called for. People
could then generate floating point numbers in any base (up to perhaps
base 36, [0-9a-z]). At that point, having a Money version is just a
specific subclass of BaseN floating point.

Of course then you have the same problem with doing math on two
different bases as with doing math on rational numbers. Personally, I
would more favor a generalized BaseN class than just a single Base10 class.

- Josiah
•  at Jan 31, 2004 at 9:01 am ⇧
Josiah Carlson <jcarlson at nospam.uci.edu> wrote in message news:<bvef14$919$1 at news.service.uci.edu>...
(In my dreams) I want "float" to be decimal. Always. No more binary.
I disagree.

My reasons for this have to do with the real-life meaning of figures
with decimal points. I can say that I have $1.80 in change on my
desk, and I can say that I am 1.80 meters tall. But the two 1.80's
have fundamentally different meanings.

For money, it means that I have *exactly* $1.80. This is because
"dollars" are just a notational convention for large numbers of cents.
I can just as accurately say that I have an (integer) 180 cents, and
indeed, that's exactly the way it would be stored in my financial
institution's database. (I know because I used to work there.) So
all you really need here is "int". But I do agree with the idea of
having a class to hide the decimal/integer conversion from the user.

On the other hand, when I say that I am 1.80 m tall, it doesn't imply
that human height comes in discrete packets of 0.01 m. It means that
I'm *somewhere* between 1.795 and 1.805 m tall, depending on my
posture and the time of day, and "1.80" is just a convenient
approximation. And it wouldn't be inaccurate to express my height as
0x1.CC (=1.796875) or (base 12) 1.97 (=1.7986111...) meters, because
these are within the tolerance of the measurement. So number base
doesn't matter here.
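
The two alternative notations for the height can be checked with a couple of lines of integer parsing. This is only a sanity check of the figures quoted above; the variable names are arbitrary.

```python
# "1.CC" in base 16 and "1.97" in base 12, each with two fractional digits:
# parse the digits as an integer, then divide by base**2.
hex_height = int("1CC", 16) / 16 ** 2
duodecimal_height = int("197", 12) / 12 ** 2

print(hex_height, duodecimal_height)

# Both fall inside the measurement tolerance of 1.795 m to 1.805 m.
print(1.795 <= hex_height <= 1.805, 1.795 <= duodecimal_height <= 1.805)
```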

But even if the number base of a measurement doesn't matter, precision
and speed of calculations often does. And on digital computers,
non-binary arithmetic is inherently imprecise and slow. Imprecise
because register bits are limited and decimal storage wastes them.
(For example, representing the integer 999 999 999 requires 36 bits in
BCD but only 30 bits in binary. Also, for floating point, only binary
allows the precision-gaining "hidden bit" trick.) Slow because
decimal requires more complex hardware. (For example, a BCD adder has
more than twice as many gates as a binary adder.)
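
The storage figures in the example above are easy to reproduce. A small sketch, where `bcd_bits` is a hypothetical helper that assumes packed BCD at 4 bits per decimal digit:

```python
def bcd_bits(n):
    """Bits needed to store a non-negative integer as packed BCD (4 bits per digit)."""
    return 4 * len(str(n))


value = 999_999_999
print(bcd_bits(value), value.bit_length())
```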
In my dreams, data is optimally represented in base e, and every number
is represented with a roughly equivalent amount of fudge-factor (except
for linear combinations of the powers of e).

Heh, thankfully my dreams haven't come to fruition.
Perhaps we'll have an efficient implementation within the next
102.1120... years or so ;-)
While decimal storage is useful for...money
Out of curiosity: Is there much demand for decimal floating point in
places that have fractionless currency like Japanese Yen?
Perhaps a generalized BaseN module is called for. People
could then generate floating point numbers in any base (up to perhaps
base 36, [0-9a-z]).
If you're going to allow exact representation of multiples of 1/2,
1/3, 1/4, ..., 1/36, 1/49, 1/64, 1/81, 1/100, 1/121, 1/125, 1/128,
1/144, etc., I see no reason not to have exact representations of
*all* rational numbers. Especially considering that rationals are
much easier to implement. (See below.)
... Of course then you have the same problem with doing math on two
different bases as with doing math on rational numbers.
Actually, the problem is even worse.

Like rationals, BaseN numbers have the problem that there are multiple
representations for the same number (e.g., 1/2=6/12, and 0.1 (2) = 0.6
(12)). But rationals at least have a standardized normalization. We
can agree that 1/2 should be represented as 1/2 and not
-131/-262, but should BaseN('0.1', base=2) + BaseN('0.1', base=4) be
BaseN('0.11', 2) or BaseN('0.3', 4)?

The same potential problem exists with ints, but Python (and afaik,
everything else) avoids it by internally storing everything in binary
and not keeping track of its representation. This is why "print 0x68"
produces the same output as "print 104". BaseN would violate this
separation between numbers and their notation, and imho that would
create a lot more problems than it solves.
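
To make the contrast concrete, here is a sketch using the standard library's `fractions.Fraction` (which postdates this thread) as the rational type. It shows the single normalized form for rationals, and that an int carries no memory of the notation it was written in.

```python
from fractions import Fraction

# Rationals have one agreed normal form: lowest terms, positive denominator.
half = Fraction(-131, -262)
print(half.numerator, half.denominator)

# 0.1 (base 2) + 0.1 (base 4) is simply 1/2 + 1/4; no base needs to be chosen.
total = Fraction(1, 2) + Fraction(1, 4)
print(total)

# Ints are stored without their notation, so both spellings are the same object value.
print(0x68 == 104, int("68", 16))
```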

Including the problem that mixed-based arithmetic will require:
* approximating at least one of the numbers, in which case there's no
* finding a "least common base", but what if that base is greater than
36 (or 62 if lowercase digits are distinguished from uppercase ones)?
•  at Jan 31, 2004 at 10:45 am ⇧

On 31 Jan 2004 01:01:41 -0800, danb_83 at yahoo.com (Dan Bishop) wrote:
I disagree. <snip>
But even if the number base of a measurement doesn't matter, precision
and speed of calculations often does. And on digital computers,
non-binary arithmetic is inherently imprecise and slow. Imprecise
because register bits are limited and decimal storage wastes them.
(For example, representing the integer 999 999 999 requires 36 bits in
BCD but only 30 bits in binary. Also, for floating point, only binary
allows the precision-gaining "hidden bit" trick.) Slow because
decimal requires more complex hardware. (For example, a BCD adder has
more than twice as many gates as a binary adder.)
I think BCD is a slightly unfair comparison. The efficiency of packing
decimal digits into binary integers increases as the size of each
packed group of digits increases. For example, while 8 BCD digits
requires 32 bits those 32 bits can encode 9 decimal digits, and while
16 BCD digits requires 64 bits, those 64 bits can encode 19 decimal
digits.
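
The packing figures can be verified directly. In this sketch `digits_that_fit` is a made-up name for "how many decimal digits can always be stored in a binary field of a given width":

```python
def digits_that_fit(bits):
    """Largest d such that every d-digit decimal number fits in `bits` bits."""
    # 2**bits is never a power of ten for bits >= 1, so this equals
    # floor(log10(2**bits)) without any floating-point rounding.
    return len(str(2 ** bits)) - 1


for bits in (32, 64):
    print(bits, "bits:", bits // 4, "BCD digits vs", digits_that_fit(bits), "packed digits")
```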

The principle is correct, though - binary is 'natural' for computers
where decimal is more natural for people, so decimal representations
will be relatively inefficient even with hardware support. Low
precision because a mantissa with the same number of bits can only
represent a smaller range of values. Slow (or expensive) because of
the relative complexity of handling decimal using binary logic.
Perhaps a generalized BaseN module is called for. People
could then generate floating point numbers in any base (up to perhaps
base 36, [0-9a-z]).
<snip>
... Of course then you have the same problem with doing math on two
different bases as with doing math on rational numbers.
Actually, the problem is even worse.

Like rationals, BaseN numbers have the problem that there are multiple
representations for the same number (e.g., 1/2=6/12, and 0.1 (2) = 0.6
(12)). But rationals at least have a standardized normalization. We
can agree that 1/2 should be represented as 1/2 and not
-131/-262, but should BaseN('0.1', base=2) + BaseN('0.1', base=4) be
BaseN('0.11', 2) or BaseN('0.3', 4)?
I don't see the point of supporting all bases. The main ones are of
course base 2, 8, 10 and 16. And of course base 8 and 16
representations map directly to base 2 representations anyway - that
is why they get used in the first place.

If I were supporting loads of bases (and that is a big 'if') I would
take an approach where each base type directly supported arithmetic
with itself only. Each base would be imported separately and be
implemented using code optimised for that base, so that the base
wouldn't need to be maintained by - for instance - a member of the
class. There would be a way to convert between bases, but that would
be the limit of the interaction.

If I needed more than that, I'd use a rational type - I speak from
experience as I set out to write a base N float library for C++ once
upon a time and ended up writing a rational instead. A rational, BTW,
isn't too bad to get working but that's as far as I got - doing it
well would probably take a lot of work. And if getting Base N floats
working was harder than for rationals, getting them to work well would
probably be an order of magnitude harder - for no real benefit to 99%
or more of users.

Just because a thing can be done, that doesn't make it worth doing.
but what if that base is greater than
36 (or 62 if lowercase digits are distinguished from uppercase ones)?
For theoretical use, converting to a list of integers - one integer
representing each 'digit' - would probably work. If there is a real
application, that is.

--
Steve Horne

steve at ninereeds dot fsnet dot co dot uk
•  at Jan 31, 2004 at 5:35 pm ⇧

If I needed more than that, I'd use a rational type - I speak from
experience as I set out to write a base N float library for C++ once
upon a time and ended up writing a rational instead. A rational, BTW,
isn't too bad to get working but that's as far as I got - doing it
well would probably take a lot of work. And if getting Base N floats
working was harder than for rationals, getting them to work well would
probably be an order of magnitude harder - for no real benefit to 99%
or more of users.
I also wrote a rational type (last summer). It took around 45 minutes.
Floating point takes a bit longer to get right.
Just because a thing can be done, that doesn't make it worth doing.
Indeed :)

- Josiah
•  at Jan 31, 2004 at 7:33 pm ⇧

On Sat, 31 Jan 2004 09:35:09 -0800, Josiah Carlson wrote:
If I needed more than that, I'd use a rational type - I speak from
experience as I set out to write a base N float library for C++ once
upon a time and ended up writing a rational instead. A rational, BTW,
isn't too bad to get working but that's as far as I got - doing it
well would probably take a lot of work. And if getting Base N floats
working was harder than for rationals, getting them to work well would
probably be an order of magnitude harder - for no real benefit to 99%
or more of users.
I also wrote a rational type (last summer). It took around 45 minutes.
Floating point takes a bit longer to get right.
Was your implementation the 'not too bad to get working' or the 'doing
it well'?

For instance, there is the greatest common divisor that you need for
normalising the rationals.

I used the Euclidean algorithm for the GCD. Not too bad, certainly
better than using prime factorisation, but as I understand it doing
the job well means using a better algorithm for this - though I never
did bother looking up the details.

Actually, as far as I remember, just doing the arbitrary length
integer division functions took me more than your 45 minutes. The long
division algorithm is simple in principle, but I seem to remember
messing up the decision of how many bits to shift the divisor after a
subtraction. Of course in Python, that's already done.

Maybe I was just having a bad day. Maybe I remember it worse than it
really was. Still, 45 minutes doesn't seem too realistic in my memory,
even for the 'not too bad to get working' case.

--
Steve Horne

steve at ninereeds dot fsnet dot co dot uk
•  at Feb 1, 2004 at 7:10 pm ⇧

Was your implementation the 'not too bad to get working' or the 'doing
it well'?
I thought it did pretty well. But then again, I didn't really much
worry about it or use it much. I merely tested to make sure it did the
right thing and forgot about it.
For instance, there is the greatest common divisor that you need for
normalising the rationals.

I used the Euclidean algorithm for the GCD. Not too bad, certainly
better than using prime factorisation, but as I understand it doing
the job well means using a better algorithm for this - though I never
did bother looking up the details.
I also used Euclid's GCD, but last time I checked, it is a pretty
reasonable algorithm. Runs in log(n) time, where n is the maximum of
either value. Technically, it runs linear in the amount of space that
it takes up, which is about as well as you can do.
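
For readers who have not seen it, the Euclidean algorithm mentioned here fits in a few lines. This is a generic textbook version, not the code either poster wrote:

```python
def euclid_gcd(a, b):
    """Greatest common divisor by the Euclidean algorithm."""
    a, b = abs(a), abs(b)
    while b:
        # Replace (a, b) with (b, a mod b) until the remainder is zero.
        a, b = b, a % b
    return a


print(euclid_gcd(48, 18), euclid_gcd(17, 5))
```

The number of loop iterations grows with the number of digits of the smaller argument, which is where the logarithmic bound comes from.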
Actually, as far as I remember, just doing the arbitrary length
integer division functions took me more than your 45 minutes. The long
division algorithm is simple in principle, but I seem to remember
messing up the decision of how many bits to shift the divisor after a
subtraction. Of course in Python, that's already done.
Ahh, integer division. I solved a related problem with long integers
for Python in a programming competition my senior year of college
(everyone else was using Java, the suckers) in about 15 minutes. We
were to calculate 1/n, for some arbitrarily large n (where 1/n was a
fraction that could be represented by base-10 integer division). Aside
from I/O, it was 9 lines.

Honestly, I never implemented integer division in my rational type. For
casts to floats,
float(self.numerator)/float(self.denominator)+self.whole seemed just
fine (I was using rationals with denominators in the range of 2-100 and
total value < 1000).

Thinking about it now, it wouldn't be very difficult to pull out my 1/n
code and adapt it to the general integer division problem. Perhaps
something to do later.
Maybe I was just having a bad day. Maybe I remember it worse than it
really was. Still, 45 minutes doesn't seem too realistic in my memory,
even for the 'not too bad to get working' case.
For all the standard operations on a rational type, all you need is to
make sure all you have is two pairs of numerators and denominators, then
all the numeric manipulation is trivial:
a.n = a.numerator + a.whole*a.denominator
a.d = a.denominator
b.n = b.numerator + b.whole*b.denominator
b.d = b.denominator

a + b = rational(a.n*b.d + b.n*a.d, a.d*b.d)
a - b = rational(a.n*b.d - b.n*a.d, a.d*b.d)
a * b = rational(a.n*b.n, a.d*b.d)
a / b = rational(a.n*b.d, a.d*b.n)
a ** b, b is an integer >= 1 (binary exponentiation)

One must remember to normalize on initialization, but that's not
difficult. Functionally that's how my rational turned out. It wasn't
terribly full featured, but it worked well for what I was doing.
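
A runnable sketch of the scheme outlined above, under some simplifying assumptions: there is no separate whole part, the class name `Rational` and attribute names `n` and `d` are illustrative, and `math.gcd` stands in for a hand-written Euclid. It is not the poster's actual code.

```python
from math import gcd


class Rational:
    """Toy rational number, normalised once in the constructor."""

    def __init__(self, n, d=1):
        if d == 0:
            raise ZeroDivisionError("denominator must be non-zero")
        if d < 0:                      # keep the sign on the numerator
            n, d = -n, -d
        g = gcd(n, d)                  # g >= 1 because d != 0
        self.n, self.d = n // g, d // g

    def __add__(self, other):
        return Rational(self.n * other.d + other.n * self.d, self.d * other.d)

    def __sub__(self, other):
        return Rational(self.n * other.d - other.n * self.d, self.d * other.d)

    def __mul__(self, other):
        return Rational(self.n * other.n, self.d * other.d)

    def __truediv__(self, other):
        return Rational(self.n * other.d, self.d * other.n)

    def __eq__(self, other):
        if not isinstance(other, Rational):
            return NotImplemented
        return (self.n, self.d) == (other.n, other.d)

    def __repr__(self):
        return f"Rational({self.n}, {self.d})"


print(Rational(1, 2) + Rational(1, 3), Rational(1, 2) / Rational(-3, 4))
```

Because every operation builds its result through the constructor, each result is reduced exactly once.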

- Josiah
•  at Feb 2, 2004 at 5:52 pm ⇧

Josiah Carlson <jcarlson at nospam.uci.edu> writes:

One must remember to normalize on initialization, but that's not
difficult. Functionally that's how my rational turned out. It wasn't
terribly full featured, but it worked well for what I was doing.
Straightforward rational implementations *are* easy. But when you
start to look at some of the more subtle numerical issues, life
rapidly gets hard.

The key point (easy enough with Python, but bear with me) is that the
numerator and denominator *must* be infinite-precision integers.
Otherwise, rationals have as many rounding and representational issues
as floating point numbers, and the characteristics of the problems
differ in ways that make them *less* usable without specialist
knowledge, not more.

With Python, this isn't an onerous requirement, as Python Longs fit
the bill nicely. But the next decision you have to make is how often
to normalise. You imply (in your comment above) that you should only
normalise on initialisation, but if you do that, your representation
rapidly blows up, in terms of space used. Sure,
8761348763287654786543876543/17522697526575309573087753086 is the same
as 1/2, but the former uses a lot more space, and is going to be
slower to compute with.

But if you normalise every time, some theoretically simple operations
can become relatively very expensive in terms of time. (Basically,
things like addition, which suddenly require a GCD calculation).

So you have to work out a good tradeoff, which isn't easy.
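
A small experiment that illustrates the space side of this tradeoff: summing 1/1 through 1/20 while never reducing, then comparing against the reduced result. `fractions.Fraction` is used here only as a convenient reducer, and the choice of 20 terms is arbitrary.

```python
from fractions import Fraction

# Sum 1/1 + 1/2 + ... + 1/20 as a raw (numerator, denominator) pair, never reducing.
raw_n, raw_d = 0, 1
for k in range(1, 21):
    # n/d + 1/k == (n*k + d) / (d*k)
    raw_n, raw_d = raw_n * k + raw_d, raw_d * k

reduced = Fraction(raw_n, raw_d)   # Fraction reduces via a GCD on construction

print(len(str(raw_d)), "digits unreduced vs", len(str(reduced.denominator)), "reduced")
```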

There are other issues to consider, but that should be enough to
demonstrate the sort of issues an "industrial strength" rational

Of course, this isn't to say that every implementation *needs* to be
industrial-strength. Only the user can say what's good enough for his
needs.

Paul.
--
This signature intentionally left blank
•  at Feb 2, 2004 at 9:55 pm ⇧

But if you normalise every time, some theoretically simple operations
can become relatively very expensive in terms of time. (Basically,
things like addition, which suddenly require a GCD calculation).
If we are to take cues from standard Python numeric types, any
mathematical calculation results in a new immutable object. Thusly,
only normalizing on initialization is sufficient. Since that is the
only time you ever get anything new, doing GCD on initialization is the
minimum and maximum requirement.

- Josiah
•  at Feb 3, 2004 at 4:02 pm ⇧
In article <bvmh58$4hc$1 at news.service.uci.edu>,
Josiah Carlson wrote:
But if you normalise every time, some theoretically simple operations
can become relatively very expensive in terms of time. (Basically,
things like addition, which suddenly require a GCD calculation).
If we are to take cues from standard Python numeric types, any
mathematical calculation results in a new immutable object. Thusly,
only normalizing on initialization is sufficient. Since that is the
only time you ever get anything new, doing GCD on initialization is the
minimum and maximum requirement.
I agree, but that means we do a lot of initializations,
so the performance in doing a computation would be about the
same.

I tried a decimal floating-point package just lately, for
fun, based on long mantissas and int exponents. I used this
approach to normalization, because I think it's natural, but
I've been scared to benchmark the package. I should, I
guess.

Regards. Mel.
•  at Feb 7, 2004 at 1:53 am ⇧
Josiah Carlson <jcarlson at nospam.uci.edu> wrote in message news:<bvjj49$c94$1 at news.service.uci.edu>...
Was your implementation [of rationals] the 'not too bad to get working' or
the 'doing it well'?
...
For all the standard operations on a rational type, all you need is to
make sure all you have is two pairs of numerators and denominators, then
all the numeric manipulation is trivial: ...
a + b = rational(a.n*b.d + b.n*a.d, a.d*b.d)
a - b = rational(a.n*b.d - b.n*a.d, a.d*b.d)
a * b = rational(a.n*b.n, a.d*b.d)
a / b = rational(a.n*b.d, a.d*b.n)
Also,

floor(a) = a.n // a.d
a // b = floor(a / b)
a ** b, b is an integer >= 1 (binary exponentiation)
It's even more trivial when b=0: The result is 1.

And when b < 0, a ** b can be calculated as (1 / a) ** (-b)
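
Putting the three cases together (positive, zero and negative integer exponents), a sketch of binary exponentiation over `fractions.Fraction`. The function name `rational_pow` is hypothetical, and `Fraction` already supports `**` natively; this only spells out the steps.

```python
from fractions import Fraction


def rational_pow(base, exponent):
    """Raise a Fraction to an integer power by repeated squaring."""
    if exponent < 0:
        # a ** b == (1 / a) ** (-b) for negative b
        base, exponent = 1 / base, -exponent
    result = Fraction(1)               # also the answer when exponent == 0
    while exponent:
        if exponent & 1:               # this bit of the exponent is set
            result *= base
        base *= base
        exponent >>= 1
    return result


print(rational_pow(Fraction(2, 3), 3), rational_pow(Fraction(2, 3), -2))
```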
•  at Feb 5, 2004 at 2:18 pm ⇧
Dan Bishop wrote:
For money, it means that I have *exactly* $1.80. This is because
"dollars" are just a notational convention for large numbers of cents.
I can just as accurately say that I have an (integer) 180 cents, and
indeed, that's exactly the way it would be stored in my financial
institution's database. (I know because I used to work there.) So
all you really need here is "int". But I do agree with the idea of
having a class to hide the decimal/integer conversion from the user.
to deal with any form of fractional pennies?
--
Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/
•  at Feb 6, 2004 at 1:51 am ⇧

On 5 Feb 2004 09:18:12 -0500, aahz at pythoncraft.com (Aahz) wrote:
Dan Bishop wrote:
For money, it means that I have *exactly* $1.80. This is because
"dollars" are just a notational convention for large numbers of cents.
I can just as accurately say that I have an (integer) 180 cents, and
indeed, that's exactly the way it would be stored in my financial
institution's database. (I know because I used to work there.) So
all you really need here is "int". But I do agree with the idea of
having a class to hide the decimal/integer conversion from the user.
to deal with any form of fractional pennies?
Does it really matter if they did? They may not deal in whole pennies,
but I seriously doubt that they need infinite precision - integers
with a predefined scaling factor (ie fixed point arithmetic) will, I
suspect, handle those few jobs that counting in pennies can't.

For instance, while certainly exchange rates involve fractional
amounts (specified to a fixed number of places), the converted amounts
will be rounded as account balances are recorded to the nearest penny,
unless I'm very badly mistaken. The same applies to interest - the
results get rounded before the balance is affected.

So if the exchange rate is 1.83779 dollars to the uk pound, who can't
cope with the following code?

exchange_rate = 183779

result = pounds * exchange_rate / 100000

Assuming that rounding matches the programming languages default
behaviour, of course, and that the width of the integers is
sufficient.
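
A slightly fuller Python 3 sketch of the same scaled-integer idea, with the rounding made explicit instead of left to the language default. The function name pence_to_cents, the constant names, and the choice of round-half-up are assumptions for illustration only.

```python
RATE_SCALE = 100000        # the rate carries five decimal places
EXCHANGE_RATE = 183779     # 1.83779 dollars per pound, scaled by RATE_SCALE


def pence_to_cents(pence):
    """Convert a non-negative amount in pence to cents, rounding half up."""
    scaled = pence * EXCHANGE_RATE
    # Adding half the scale before floor division rounds to the nearest cent.
    return (scaled + RATE_SCALE // 2) // RATE_SCALE


print(pence_to_cents(1000))   # 10.00 pounds is 18.3779 dollars -> 1838 cents
```

Python integers do not overflow, so the "width of the integers" caveat only matters when the same arithmetic is ported to a fixed-width type.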

That said, as I understand it, a lot of financial institutions have a
lot of COBOL code. And from what I remember of programming in COBOL,
the typical representation of numbers in both files and working
storage uses decimal digits stored in a character string - at least
that's what the picture strings specify in the source code. Given that
the compiler knows the precision of every number, and assuming that
there is no conversion to a more convenient representation internally,
it shouldn't make much difference whether the number has a point or
not.

Personally, I wouldn't want to contradict Dan Bishop's claims - he has
the experience in a financial institution, not me - but I suspect
there is a fair amount of code used in many financial institutions
that does in fact use a decimal representation, if only because of old
COBOL code.

--
Steve Horne

steve at ninereeds dot fsnet dot co dot uk
•  at Feb 6, 2004 at 3:39 am ⇧
In article <qeq520pv7kbd1s3ojmn3idetjuljhtk5md at 4ax.com>,
Stephen Horne wrote:
On 5 Feb 2004 09:18:12 -0500, aahz at pythoncraft.com (Aahz) wrote:
Dan Bishop wrote:
For money, it means that I have *exactly* $1.80. This is because
"dollars" are just a notational convention for large numbers of cents.
I can just as accurately say that I have an (integer) 180 cents, and
indeed, that's exactly the way it would be stored in my financial
institution's database. (I know because I used to work there.) So
all you really need here is "int". But I do agree with the idea of
having a class to hide the decimal/integer conversion from the user.
to deal with any form of fractional pennies?
Does it really matter if they did? They may not deal in whole pennies,
but I seriously doubt that they need infinite precision - integers
with a predefined scaling factor (ie fixed point arithmetic) will, I
suspect, handle those few jobs that counting in pennies can't.
That's mostly true (witness Tim Peters's FixedPoint.py). If you really
want to debate this issue, read Cowlishaw first:
http://www2.hursley.ibm.com/decimal/decarith.html
--
Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death." --GvR
•  at Feb 7, 2004 at 1:38 am ⇧
Stephen Horne <steve at ninereeds.fsnet.co.uk> wrote in message news:<qeq520pv7kbd1s3ojmn3idetjuljhtk5md at 4ax.com>...
On 5 Feb 2004 09:18:12 -0500, aahz at pythoncraft.com (Aahz) wrote:

Dan Bishop wrote:
For money, it means that I have *exactly* $1.80. This is because
"dollars" are just a notational convention for large numbers of cents.
I can just as accurately say that I have an (integer) 180 cents, and
indeed, that's exactly the way it would be stored in my financial
institution's database. (I know because I used to work there.) So
all you really need here is "int". But I do agree with the idea of
having a class to hide the decimal/integer conversion from the user.
to deal with any form of fractional pennies?
Does it really matter if they did? They may not deal in whole pennies,
but I seriously doubt that they need infinite precision - integers
with a predefined scaling factor (ie fixed point arithmetic) will, I
suspect, handle those few jobs that counting in pennies can't.
And you would be right. For example, interest rates were always
stored in thousandths of a percent.

The only problem was that some of the third-party software we used
made this scaling completely visible to the user. Our employees would
occasionally forget the scaling factor, and this resulted in mistakes
like having one of our CD's pay 445% interest instead of 4.45%.
That said, as I understand it, a lot of financial institutions have a
lot of COBOL code. And from what I remember of programming in COBOL,
the typical representation of numbers in both files and working
storage uses decimal digits stored in a character string - at least
that's what the picture strings specify in the source code.
We had a lot of numbers in EBCDIC signed decimal. Even though our
mainframe used ASCII.
•  at Feb 11, 2004 at 3:09 am ⇧
Dan Bishop wrote:
Stephen Horne <steve at ninereeds.fsnet.co.uk> wrote in message news:<qeq520pv7kbd1s3ojmn3idetjuljhtk5md at 4ax.com>...
On 5 Feb 2004 09:18:12 -0500, aahz at pythoncraft.com (Aahz) wrote:
Dan Bishop wrote:
For money, it means that I have *exactly* $1.80. This is because
"dollars" are just a notational convention for large numbers of cents.
I can just as accurately say that I have an (integer) 180 cents, and
indeed, that's exactly the way it would be stored in my financial
institution's database. (I know because I used to work there.) So
all you really need here is "int". But I do agree with the idea of
having a class to hide the decimal/integer conversion from the user.
to deal with any form of fractional pennies?
Does it really matter if they did? They may not deal in whole pennies,
but I seriously doubt that they need infinite precision - integers
with a predefined scaling factor (ie fixed point arithmetic) will, I
suspect, handle those few jobs that counting in pennies can't.
And you would be right. For example, interest rates were always
stored in thousandths of a percent.

The only problem was that some of the third-party software we used
made this scaling completely visible to the user. Our employees would
occasionally forget the scaling factor, and this resulted in mistakes
like having one of our CD's pay 445% interest instead of 4.45%.
...and that's a good argument for having a built-in type that handles
the conversions automatically. Another issue is the different kinds of
rounding. All in all, there are many kinds of already-solved problems
that are taken care of by using the decimal float standard.
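
As a rough illustration of both points, here is how the decimal module that eventually shipped with Python handles them (Python 3 syntax). The balance, the rate and the tie value are made-up numbers, and the variable names are assumptions for the example.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

CENT = Decimal("0.01")

# The rate is written the way a person reads it, so there is no hidden
# "thousandths of a percent" scaling factor to forget.
rate_percent = Decimal("4.45")
balance = Decimal("1000.50")

interest = balance * rate_percent / 100        # exactly 44.52225
print(interest.quantize(CENT, rounding=ROUND_HALF_EVEN))

# Rounding modes only disagree on exact ties, such as 2.665.
tie = Decimal("2.665")
print(tie.quantize(CENT, rounding=ROUND_HALF_UP))     # 2.67
print(tie.quantize(CENT, rounding=ROUND_HALF_EVEN))   # 2.66
```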
--
Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death." --GvR
•  at Jan 30, 2004 at 10:32 pm ⇧

On Fri, 30 Jan 2004 07:06:21 -0800, Michael Chermside wrote:

Facundo Batista writes:
I'm proud to announce that the PEP for Decimal Data Type is now published
http://www.python.org/peps/pep-0327.html
VERY nice work here.

Here's my 2 cents:

(1) You propose conversion from floats via:
Decimal(1.1, 2) == Decimal('1.1')
Decimal(1.1, 16) == Decimal('1.1000000000000001')
Decimal(1.1) == Decimal('110000000000000008881784197001252...e-51')

I think that we'd do even better to omit the second use. People who
really want to convert floats exactly can easily write "Decimal(1.1, 60)". But
hardly anyone wants to convert floats exactly, while lots of newbies would
forget to include the second parameter. I'd say just make Decimal(someFloat)
raise a TypeError with a helpful message about how you need that second
parameter when using floats.
Good point. A 'DecimalExact' or similar function could perhaps be
provided to replace the simple conversion when people have really
thought about it and do really want it.
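
For comparison, a sketch of how these three behaviours can be obtained with the decimal module as it exists in current Python 3. This is not the API proposed in the PEP draft under discussion: the two-argument Decimal(1.1, 2) form was never adopted, and the calls below are the present-day library equivalents as far as I know them.

```python
from decimal import Context, Decimal, FloatOperation, localcontext

# Exact conversion: every digit of the binary double is kept.
exact = Decimal(1.1)
print(exact)

# Rounding while converting, close in spirit to the proposed Decimal(1.1, 2).
rounded = Context(prec=2).create_decimal_from_float(1.1)
print(rounded)

# Opting in to an error when a float is passed without an explicit conversion.
with localcontext() as ctx:
    ctx.traps[FloatOperation] = True
    try:
        Decimal(1.1)
        refused = False
    except FloatOperation:
        refused = True
print(refused)
```

With the trap enabled, Decimal.from_float(1.1) still works, which plays roughly the role of the 'DecimalExact' function suggested here.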

--
Steve Horne

steve at ninereeds dot fsnet dot co dot uk
•  at Feb 2, 2004 at 2:45 pm ⇧
danb_83 wrote:

#- On the other hand, when I say that I am 1.80 m tall, it doesn't imply
#- that humans height comes in discrete packets of 0.01 m. It
#- means that
#- I'm *somewhere* between 1.795 and 1.805 m tall, depending on my
#- posture and the time of day, and "1.80" is just a convenient
#- approximation. And it wouldn't be inaccurate to express my height as
#- 0x1.CC (=1.796875) or (base 12) 1.97 (=1.7986111...) meters, because
#- these are within the tolerance of the measurement. So number base
#- doesn't matter here.

Are you saying that it's ok to store your number imprecisely because you
don't take well measures?

#- But even if the number base of a measurement doesn't matter,
#- precision
#- and speed of calculations often does. And on digital computers,
#- non-binary arithmetic is inherently imprecise and slow. Imprecise
#- because register bits are limited and decimal storage wastes them.
#- (For example, representing the integer 999 999 999 requires
#- 36 bits in
#- BCD but only 30 bits in binary. Also, for floating point,
#- only binary
#- allows the precision-gaining "hidden bit" trick.) Slow because
#- decimal requires more complex hardware. (For example, a BCD adder requires
#- more than twice as many gates as a binary adder.)

In my dreams, speed and storage are both infinite, :p

. Facundo
•  at Feb 2, 2004 at 10:07 pm ⇧

At some point, "Batista, Facundo" wrote:

danb_83 wrote:

#- On the other hand, when I say that I am 1.80 m tall, it doesn't imply
#- that humans height comes in discrete packets of 0.01 m. It
#- means that
#- I'm *somewhere* between 1.795 and 1.805 m tall, depending on my
#- posture and the time of day, and "1.80" is just a convenient
#- approximation. And it wouldn't be inaccurate to express my height as
#- 0x1.CC (=1.796875) or (base 12) 1.97 (=1.7986111...) meters, because
#- these are within the tolerance of the measurement. So number base
#- doesn't matter here.

Are you saying that it's ok to store your number imprecisely because you
don't take well measures?
What we need for this is an interval type. 1.80 m shouldn't be stored
as '1.80', but as '1.80 +/- 0.005', and operations such as addition
and multiplication should propagate the intervals.

How to do that is another question: for addition, do you add the
magnitudes of the intervals, or use the square root of the sums of the
squares, or something else? It greatly depends on what _type_ of error
0.005 measures (is it the width of a Gaussian distribution? a uniform
distribution? something skewed that's not representable by one
number?).
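
The two rules mentioned for addition can be written down in a few lines of Python 3. The function names are made up for this sketch, and which rule is appropriate depends on the error model, exactly as the question above says.

```python
import math


def add_worst_case(x, dx, y, dy):
    """Add two values; uncertainties add linearly (hard bounds)."""
    return x + y, dx + dy


def add_in_quadrature(x, dx, y, dy):
    """Add two values; uncertainties combine as the root of the sum of squares,
    which assumes independent errors given as standard deviations."""
    return x + y, math.hypot(dx, dy)


print(add_worst_case(1.80, 0.005, 0.25, 0.005))
print(add_in_quadrature(1.80, 0.005, 0.25, 0.005))
```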

My 0.0438126 Argentina pesos [1]

[1] $0.02 Canadian, which highlights the other problem with any
representation of a number without units -- decimal or otherwise.

--
\/|<
/--------------------------------------------------------------------------\
David M. Cooke
cookedm(at)physics(dot)mcmaster(dot)ca
•  at Feb 4, 2004 at 1:59 am ⇧

On Mon, 02 Feb 2004 17:07:52 -0500, cookedm+news at physics.mcmaster.ca (David M. Cooke) wrote:
At some point, "Batista, Facundo" wrote:

danb_83 wrote:

#- On the other hand, when I say that I am 1.80 m tall, it doesn't imply
#- that humans height comes in discrete packets of 0.01 m. It
#- means that
#- I'm *somewhere* between 1.795 and 1.805 m tall, depending on my
#- posture and the time of day, and "1.80" is just a convenient
#- approximation. And it wouldn't be inaccurate to express my height as
#- 0x1.CC (=1.796875) or (base 12) 1.97 (=1.7986111...) meters, because
#- these are within the tolerance of the measurement. So number base
#- doesn't matter here.

Are you saying that it's ok to store your number imprecisely because you
don't take well measures?
What we need for this is an interval type. 1.80 m shouldn't be stored
as '1.80', but as '1.80 +/- 0.005', and operations such as addition
and multiplication should propagate the intervals.
I disagree with this, not because it is a bad idea to keep track of
precision, but because this should not be a part of the float type or
of basic arithmetic operations.

When you write a value with its precision specified in the form of an
interval, that interval is a second number. The value with the
precision is a compound representation, built up using simpler
components. It doesn't mean that the components no longer have uses
outside of the compound. In Python, the same should apply - a numeric
type that can track precision sounds useful, but it shouldn't replace
the existing float.

One good reason is simply that knowledge of the precision is only
sometimes useful. As an obvious example, what would the point be of
keeping track of the precision of the calculations in a 3D game -
there is no point as the information about precision has no bearing on
the rendering of the image.

Besides this, there is a much more fundamental problem.

The whole point of using an imprecise representation is because
manipulating a perfect representation is impractical - mainly slow.

It is true that in general the source is inherently approximate too,
meaning that floats are quite a good match for the physical
measurements they are often used to represent, but still if it were
practical to do perfect arithmetic on those approximate values it
would give slightly more precise answers as the arithmetic would not

Having an approximate representation with an interval sounds good, but
remember that one error source is the arithmetic itself - e.g. 1.0 /
3.0 cannot be finitely represented in either binary or decimal without
error (except as a rational, of course).

How to do that is another question: for addition, do you add the
magnitudes of the intervals, or use the square root of the sums of the
squares, or something else? It greatly depends on what _type_ of error
0.005 measures (is it the width of a Gaussian distribution? a uniform
distribution? something skewed that's not representable by one
number?).
None of these is sufficient - they may track the errors resulting from
measurement issues (if you choose the appropriate method for your
application) but none of them takes into account errors resulting from the
imprecision of the arithmetic. Furthermore, to keep track of such
imprecision precisely means you need an infinitely precise numeric
representation for your interval - and if it was practical to do that,
it would be far better to just use that representation for the value
itself.

This doesn't mean that tracking precision is a bad idea. It just means
that when it is done, the error interval itself should be imprecise.
You should have the guarantee that the real value is never going to be
outside of the given bounds, but not the guarantee that the bounds are
as close together as possible - the bounds should be allowed to get a
little further apart to allow for imprecision in the calculation of
the interval.
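
A minimal Python 3 sketch of that "allow the bounds to drift apart" idea, using float endpoints that are pushed one representable value outwards after each addition. It needs math.nextafter (Python 3.9 or later); the Interval class and its method names are assumptions for illustration, and only addition is covered.

```python
import math
from fractions import Fraction


class Interval:
    """Closed interval with float endpoints and outward-rounded addition."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    @classmethod
    def around(cls, x):
        """Enclose the number a decimal literal stands for: the float nearest
        to it is x, so the true value lies between x's two neighbours."""
        return cls(math.nextafter(x, -math.inf), math.nextafter(x, math.inf))

    def __add__(self, other):
        # A float sum is off by at most half a unit in the last place, so
        # moving each bound one float outwards keeps the guarantee.
        return Interval(math.nextafter(self.lo + other.lo, -math.inf),
                        math.nextafter(self.hi + other.hi, math.inf))

    def contains(self, exact):
        """Exact membership test using rationals, not floats."""
        return Fraction(self.lo) <= exact <= Fraction(self.hi)


total = Interval.around(0.1) + Interval.around(0.2)
print(total.lo, total.hi, total.contains(Fraction(3, 10)))
```

The bounds are wider than strictly necessary, which is the trade-off described above: the enclosure stays guaranteed, but it is not the tightest possible.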

And if the error interval is itself an approximation, why track it on
every single arithmetic operation? Unless you have a specific good
reason to do so, it makes much more sense to handle the precision
tracking at a higher level. And as those higher level operations are
often going to be application specific, having a single library for it
(ie not tailored to some particular type of task) is IMO unlikely to
work.

For instance, consider calculating and applying a 3D rotation matrix
to a vector. If you track errors on every float value, that is 9
values in the matrix with error values (due to limited precision trig
functions etc) and 3 values in the vector, a dozen for the
intermediate results in the matrix multiplication, and 3 error
intervals for the 3 dimensions of the output vector. But the odds are
that all you want is a single float value - the maximum distance
between the real point and the point represented by the output vector,
and you can probably get a good value for that by multiplying the
length of the input vector by some 'potential error from rotation'
constant.

Incidentally, it would not always be appropriate to include arithmetic
errors in error intervals. For instance, some statistical interval
types do not guarantee that all values are within the interval range.
They may guarantee that 95% of values are within the interval, for
instance - _and_ that 5% of values are outside the interval. The 5%
outside is as important as the 95% inside, so there is no acceptable
direction to move the bounds a little 'just to be safe'.

In some cases, you might even want to track the error interval (from
arithmetic error) for your error interval value. I can certainly
imagine a result with the form...

The average widginess of a blodgit is 9.5 +/- 0.2
95% differ from the average by less than 2.7 +/- 0.03

Thus I can say that this randomly chosen blodgit has a
widginess of (9.5 +/- 0.2) +/- (2.7 +/- 0.03) with 95% confidence.

You might even get results like that if you had estimated the average
and distribution of widginess from a sample of the blodgits - in which
case, you may still need to account for the arithmetic error which
requires potentially another four values ;-)

--
Steve Horne

steve at ninereeds dot fsnet dot co dot uk
•  at Feb 4, 2004 at 7:52 pm ⇧

At some point, Stephen Horne wrote:

On Mon, 02 Feb 2004 17:07:52 -0500, cookedm+news at physics.mcmaster.ca
(David M. Cooke) wrote:
At some point, "Batista, Facundo" wrote:

danb_83 wrote:

#- On the other hand, when I say that I am 1.80 m tall, it doesn't imply
#- that humans height comes in discrete packets of 0.01 m. It
#- means that
#- I'm *somewhere* between 1.795 and 1.805 m tall, depending on my
#- posture and the time of day, and "1.80" is just a convenient
#- approximation. And it wouldn't be inaccurate to express my height as
#- 0x1.CC (=1.796875) or (base 12) 1.97 (=1.7986111...) meters, because
#- these are within the tolerance of the measurement. So number base
#- doesn't matter here.

Are you saying that it's ok to store your number imprecisely because you
don't take well measures?
What we need for this is an interval type. 1.80 m shouldn't be stored
as '1.80', but as '1.80 +/- 0.005', and operations such as addition
and multiplication should propagate the intervals.
I disagree with this, not because it is a bad idea to keep track of
precision, but because this should not be a part of the float type or
of basic arithmetic operations.
I was being a bit facetious :-) This is certainly something that can
be done without being builtin, like this:
http://pedro.dnp.fmph.uniba.sk/~stanys/Uncertainities.py
Having an approximate representation with an interval sounds good, but
remember that one error source is the arithmetic itself - e.g. 1.0 /
3.0 cannot be finitely represented in either binary or decimal without
error (except as a rational, of course).
Hey, if my measurement error is so small that arithmetic error becomes
significant, I'm happy.

--
\/|<
/--------------------------------------------------------------------------\
David M. Cooke
cookedm(at)physics(dot)mcmaster(dot)ca
•  at Feb 4, 2004 at 9:01 pm ⇧

On Wed, 04 Feb 2004 14:52:42 -0500, cookedm+news at physics.mcmaster.ca (David M. Cooke) wrote:

I was being a bit facetious :-)
Ah - sorry for taking it the wrong way.

--
Steve Horne

steve at ninereeds dot fsnet dot co dot uk
•  at Feb 6, 2004 at 3:58 pm ⇧
On Wed, 04 Feb 2004 01:59:41 +0000, Stephen Horne wrote:
[...]
A bunch of stuff including stuff about intervals which probably could
benefit from revision in the light of, e.g.,

http://www.americanscientist.org/template/AssetDetail/assetid/28331;jsessionid=aaa41kNy_Uu1-c

or the whole in "printer-friendly" format

http://www.americanscientist.org/template/AssetDetail/assetid/28331/page/3?&print=yes

or the .pdf (nicer) at

http://www.americanscientist.org/template/PDFDetail/assetid/28315;jsessionid=aaa41kNy_Uu1-c

http://www.cs.utep.edu/interval-comp/

Regards,
Bengt Richter
•  at Feb 3, 2004 at 12:33 pm ⇧
cookedm wrote:

#- What we need for this is an interval type. 1.80 m shouldn't be stored
#- as '1.80', but as '1.80 +/- 0.005', and operations such as addition
#- and multiplication should propagate the intervals.

I think this kind of math is beyond a pure numeric data type. 1.80 is to be
represented as a numeric data type. And also 0.005.

But '1.80 +/- 0.005' should be worked in another object. Hey! These are the
benefits of OOP!

. Facundo
•  at Feb 6, 2004 at 5:03 pm ⇧

On Tue, 3 Feb 2004 09:33:26 -0300, "Batista, Facundo" wrote:
cookedm wrote:

#- What we need for this is an interval type. 1.80 m shouldn't be stored
#- as '1.80', but as '1.80 +/- 0.005', and operations such as addition
#- and multiplication should propagate the intervals.

I think this kind of math is beyond a pure numeric data type. 1.80 is to be
represented as a numeric data type. And also 0.005.

But '1.80 +/- 0.005' should be worked in another object. Hey! These are the
benefits of OOP!
The key concern is _exactly_ representing the limits of an interval that is
_guaranteed to contain_ the exact value of interest. One hopes to represent
very narrow intervals, but the principle is the same irrespective of the
computer states available to represent the end points.

E.g., integer intervals can reliably enclose 1.8 and 0.005
(with [1,2] and [0,1] respectively). Of course, [1,2] +- [0,1]
=> [0,3] gets you something less than useful for 1.8+-0.005

But choosing from available IEEE-754 floating point double states
gets you some really narrow intervals, where e.g. 1.8 can be guaranteed to
be in the closed interval including the two nearest available
exactly-representable floating point numbers, namely

[1.79999999999999982236431605997495353221893310546875,
1.8000000000000000444089209850062616169452667236328125]

I'll leave it as an exercise to work out the exactly representable value
interval limits for 0.005 and 1.8+-0.005 ;-)
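
One way to check endpoints like these in current Python 3 (3.9 or later for math.nextafter) is to ask for the neighbouring double and print both values exactly through Decimal. This is only a sketch; the variable names are made up.

```python
import math
from decimal import Decimal

nearest = 1.8                      # the double closest to the real number 1.8

# Decide which side of the true 1.8 the nearest double fell on, then take
# the adjacent double on the other side.
if Decimal(nearest) > Decimal("1.8"):
    lower, upper = math.nextafter(nearest, -math.inf), nearest
else:
    lower, upper = nearest, math.nextafter(nearest, math.inf)

# Decimal(float) converts exactly, so these print every digit of each double.
print(Decimal(lower))
print(Decimal(upper))
```

The same few lines with 0.005 in place of 1.8 answer the first half of the exercise.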

The _meaning_ of numbers that are guaranteed to fall into known exact intervals
in terms of representing measurements, measurement errors, statistics of the
errors, etc. is a separate matter from keeping track of exact intervals during
computation. These concerns should not be confused, IMO, though they inevitably
arise together in thinking about computing with real-life measurement values.

Regards,
Bengt Richter
•  at Feb 6, 2004 at 7:25 pm ⇧

On 6 Feb 2004 17:03:57 GMT, bokr at oz.net (Bengt Richter) wrote:
The _meaning_ of numbers that are guaranteed to fall into known exact intervals
in terms of representing measurements, measurement errors, statistics of the
errors, etc. is a separate matter from keeping track of exact intervals during
computation. These concerns should not be confused, IMO, though they inevitably
arise together in thinking about computing with real-life measurement values.
(Warning, naive hobbyist input, practicality: undefined)

One possible option would be to provide for some kind of random
rounding routine for some of the least significant bits of a floating
point value. The advantage would be that this would also be usable for
DSP-like computations that are used in music programming (volume
adjustments) or in digital video (image rotation).

I agree with the idea that exact interval tracking is important, but
perhaps this exact interval tracking should be used only during
testing and development of the code.

It could be that it would be possible to produce code with a fixed
number of least significant bits that are randomly rounded each time
some specific operation makes this necessary (not *all* computations!)
and that the floating point data would stay accurate enough for long
enough to be useable in 99.9 percent of the use cases.

Maybe we need a DSP-float instead of a decimal data type? Decimals
could be used for testing DSP-float implementations.

Anton
•  at Feb 8, 2004 at 6:55 am ⇧

anton at vredegoor.doge.nl (Anton Vredegoor) wrote:
One possible option would be to provide for some kind of random
rounding routine for some of the least significant bits of a floating
point value.
answer with 15 decimal places, but now you have non-determinism. The real
answer, I think, is getting people to understand how much of their
real-world measurements are garbage.
The advantage would be that this would also be usable for
DSP-like computations that are used in music programming (volume
adjustments) or in digital video (image rotation).
Interesting. I know you were kind of talking off the top of your head, but
can you tell me what leads you to thinking that some low-order randomness
would be helpful in those particular applications?
Maybe we need a DSP-float instead of a decimal data type? Decimals
could be used for testing DSP-float implementations.
Can you describe what you mean by DSP-float? I'm not sure why a DSP should
treat floats any differently than an ordinary processor.
--
- Tim Roberts, timr at probo.com
Providenza & Boekelheide, Inc.
•  at Feb 9, 2004 at 12:43 pm ⇧

Tim Roberts wrote:
anton at vredegoor.doge.nl (Anton Vredegoor) wrote:
One possible option would be to provide for some kind of random
rounding routine for some of the least significant bits of a floating
point value.
answer with 15 decimal places, but now you have non-determinism. The real
answer, I think, is getting people to understand how much of their
real-world measurements are garbage.
Yes, but this is not a simple matter. There is some kind of order long
after strict methods become unwieldy. An intelligent rounding scheme
could harness some of this partial order to keep the computations more
accurate over a wider range of manipulations on real world data.

I'm providing some code below to show that there is order beyond
determinism. It's not very helpful in an explicit way, but it should
serve to prove the point for someone wanting to look at it for long
enough and willing to check the code for some exact deterministic
explanation, and being unable to formalize it :-)

Also it's not bad to look at even for those not wanting to
investigate, so it might help to prevent possible tension in this
discussion a bit.
The advantage would be that this would also be usable for
DSP-like computations that are used in music programming (volume
adjustments) or in digital video (image rotation).
Interesting. I know you were kind of talking off the top of your head, but
can you tell me what leads you to thinking that some low-order randomness
would be helpful in those particular applications?
There are high end digital mixers that use some kind of random
rounding to the least significant bits of their sample data in order
to make the sounds "survive" more manipulations before the effect of
the manipulations becomes audible.

In digital video with image rotation there is the problem of
determining where an object exactly is after it is rotated, because
all of its coordinate points have been rounded. A statistical approach
seems to work well here.

On a more cosmic scale the universe seems to use the same trick of
indeterminism, at least according to quantum theory and the Heisenberg
uncertainty principle. Some think that because of that the universe
itself must be a computer simulation :-) I guess I'd better stop here
before someone mentions Douglas Adams ...
Maybe we need a DSP-float instead of a decimal data type? Decimals
could be used for testing DSP-float implementations.
Can you describe what you mean by DSP-float? I'm not sure why a DSP should
treat floats any differently than an ordinary processor.
You are right, a DSP is just like an ordinary processor, except that
it is specialized for digital signal processing operations. I guess I
got a bit carried away by thinking about a datatype that has builtin
random rounding for the least significant bits. For example by using
the Mersenne Twister random generator, it could compute a lot of
rounding bytes at once and just use them up as needed. This way it
would not slow down the computations too much.
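
A tiny Python 3 sketch of what such random rounding might look like when rounding to whole numbers. The function name stochastic_round is made up for this example, and rounding to an integer stands in for rounding away the least significant bits; Python's random module happens to use the Mersenne Twister.

```python
import math
import random


def stochastic_round(x, rng=random):
    """Round x to an integer, rounding up with probability equal to the
    fractional part, so that the average of many roundings approaches x."""
    low = math.floor(x)
    return low + (rng.random() < x - low)


rng = random.Random(0)             # fixed seed so runs are repeatable
samples = [stochastic_round(0.3, rng) for _ in range(20000)]
print(sum(samples) / len(samples))   # close to 0.3, whereas round(0.3) is 0
```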

Anton

from __future__ import division
from Tkinter import *
from random import random,choice

class Scaler:

    def __init__(self, world, viewport):
        (a,b,c,d), (e,f,g,h) = world, viewport
        xf,yf = self.xf,self.yf = (g-e)/(c-a),(h-f)/(d-b)
        wxc,wyc = (a+c)/2, (b+d)/2
        vxc,vyc = (e+g)/2, (f+h)/2
        self.xc,self.yc = vxc-xf*wxc,vyc-yf*wyc

    def scalepoint(self, a, b):
        xf,yf,xc,yc = self.xf,self.yf,self.xc,self.yc
        return xf*a+xc,yf*b+yc

    def scalerect(self, a, b, c, d):
        xf,yf,xc,yc = self.xf,self.yf,self.xc,self.yc
        return xf*a+xc,yf*b+yc,xf*c+xc,yf*d+yc

class RandomDot:

    def __init__(self, master, n):
        self.master = master
        self.n = n
        self.world = (0,0,1,1)
        c = self.canvas = Canvas(master, bg = 'black',
            width = 380, height = 380)
        c.pack(fill = BOTH, expand = YES)
        master.bind("<Configure>", self.configure)
        master.bind("<Escape>", lambda
            event ='ignored', m=master: m.destroy())
        self.canvas.bind("<Button-1>", self.click)
        self.colorfuncs = {'red':(min,min),'green':(min,max),
            'blue':(max,min), 'white':(max,max)}
        self.polling = False

    def poll(self):
        self.wriggle()
        self.master.after(10, self.poll)

    def click(self, event):
        self.draw()

    def configure(self,event):
        self.scale = Scaler(self.world, self.getviewport())
        self.draw()
        if not self.polling:
            self.polling = True
            self.poll()

    def draw(self):
        c,sp = self.canvas,self.scale.scalepoint
        c.delete('all')
        funcs = self.colorfuncs
        colors = funcs.keys()
        for i in xrange(1000):
            color = choice(colors)
            a,b = sp(random(), random())
            c.create_oval(a,b,a+5,b+5,fill=color,
                outline = '')

    def wriggle(self):
        c,sp = self.canvas,self.scale.scalepoint
        funcs = self.colorfuncs
        x = choice(c.find_all())
        color = c.itemcget(x,"fill")
        f1,f2 = funcs[color]
        a = f1([random() for i in xrange(self.n)])
        b = f2([random() for i in xrange(self.n)])
        a,b = sp(a,b)
        c.coords(x,a,b,a+5,b+5)

    def getviewport(self):
        c = self.canvas
        return (0, 0, c.winfo_width(),c.winfo_height())

if __name__=='__main__':
    root = Tk()
    root.title('randomdot')
    app = RandomDot(root,3)
    root.mainloop()
•  at Feb 9, 2004 at 5:47 pm ⇧

On Fri, 06 Feb 2004 20:25:21 +0100, anton at vredegoor.doge.nl (Anton Vredegoor) wrote:
On 6 Feb 2004 17:03:57 GMT, bokr at oz.net (Bengt Richter) wrote:

The _meaning_ of numbers that are guaranteed to fall into known exact intervals
in terms of representing measurements, measurement errors, statistics of the
errors, etc. is a separate matter from keeping track of exact intervals during
computation. These concerns should not be confused, IMO, though they inevitably
arise together in thinking about computing with real-life measurement values.
(Warning, naive hobbyist input, practicality: undefined)

One possible option would be to provide for some kind of random
rounding routine for some of the least significant bits of a floating
point value. The advantage would be that this would also be usable for
DSP-like computations that are used in music programming (volume
adjustments) or in digital video (image rotation).
I can't spend a lot of time on this right now, but this reminds me of
a time when I tried (successfully IMO) to explain why feeding a simulation
system with very low noise data got more accurate results than feeding it
exact data.

The reason has to do with quantization (which was part of the system being
simulated, and which could be fed with highly accurate world-sim values plus
noise). I.e., measurements are always represented digitally with some least
significant bit representing some defined amount of a measured quantity.
This means measurement information below that is lost (or at least one bit
below that, depending on the device).

The result is that a statistical mean (or other integrating process) of samples
will not be affected by the bits lost in quantizing. In the case of feeding a
simulator with accurate values multiple times, this results in the identical
biased quantized values, whereas if you add a small amount of noise, you will
get a few neighboring quantized values in some proportion, and the mean will
be a better estimate of the true (unquantized) value than a mean of quantized
values with no noise -- where all the quantized values are exactly equal and
all biased. The effect can be amplified if the input is feeding a sensitive
calculation such as the inversion of a near-singular matrix, and can make the
difference between usable and useless results.

An example using int as the quantization function:
>>> import random
>>> def simval(val, noise=1.0):
...     return val + noise*random.random()
...
>>> def simulator(val, noise, trials=1000):
...     return sum([int(simval(val, noise)) for i in xrange(trials)])/float(trials)
...
>>> for i in xrange(10): print simulator(1.3, 0.0),
...
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
>>> for i in xrange(10): print simulator(1.3, 1.0),
...
1.295 1.293 1.284 1.307 1.3 1.292 1.322 1.291 1.322 1.315
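
A short worked derivation of why the noisy runs cluster around 1.3 while the noise-free runs stay at 1.0. It assumes a non-negative value v (so that int truncation equals floor) and uniform noise spanning exactly one quantization step, as in the noise=1.0 call; the symbols n, f, U and N are introduced here for illustration only.

```latex
v = n + f, \qquad n = \lfloor v \rfloor, \qquad 0 \le f < 1, \qquad U \sim \mathrm{Uniform}[0, 1)

\operatorname{int}(v + U) =
\begin{cases}
n + 1 & \text{if } U \ge 1 - f \quad (\text{probability } f) \\
n     & \text{otherwise} \quad (\text{probability } 1 - f)
\end{cases}

\mathbb{E}\bigl[\operatorname{int}(v + U)\bigr] = n(1 - f) + (n + 1) f = n + f = v

\sigma_{\text{mean}} = \sqrt{\frac{f(1 - f)}{N}}
\quad\Rightarrow\quad
\sqrt{\frac{0.3 \times 0.7}{1000}} \approx 0.0145 \ \text{for } v = 1.3,\ N = 1000
```

Without noise every sample is n, so the mean is off by f (here 0.3) no matter how many samples are taken. With the noise, the mean is unbiased and the scatter of about 0.0145 is in line with the spread of the ten printed values.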

I suspect that the ear integrates/averages some when presented with 44.1k samples/sec,
so if uniform noise is added in below the quantization lsb of a CD, that may enhance
the perceived output sound, but some audiophile can provide the straight scoop on that.
I agree with the idea that exact interval tracking is important, but
perhaps this exact interval tracking should be used only during
testing and development of the code.

It could be that it would be possible to produce code with a fixed
number of least significant bits that are randomly rounded each time
some specific operation makes this necessary (not *all* computations!)
and that the floating point data would stay accurate enough for long
enough to be useable in 99.9 percent of the use cases.
I think you have to be careful when you do your rounding, and note
the effect on values vs populations of values and how that feeds the
next stage of processing or use.
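
To make the random-rounding idea concrete, here is a minimal sketch of one way it could look. It is an assumption about what is meant, not anyone's actual proposal: the names stochastic_round and mean_of_rounded, and the step parameter standing in for "the least significant bits", are made up for illustration. It is written for Python 3, unlike the Python 2 session earlier in this post.

```python
import math
import random


def stochastic_round(x, step=1.0, rng=random):
    """Round x to a multiple of step (hypothetical helper).

    The upper neighbour is chosen with probability equal to the
    fractional position of x between its two neighbours, so the
    expected value of the result equals x.
    """
    scaled = x / step
    lower = math.floor(scaled)
    frac = scaled - lower  # fractional part, in [0, 1)
    if rng.random() < frac:
        lower += 1
    return lower * step


def mean_of_rounded(x, step=1.0, trials=10000, seed=0):
    """Average many stochastically rounded copies of x (illustration only)."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total = sum(stochastic_round(x, step, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # A single value is still off by up to one step...
    print(stochastic_round(1.3))
    # ...but a population of them averages close to the true value,
    # whereas plain round(1.3) gives 1 every time.
    print(mean_of_rounded(1.3))
```

This shows the value-versus-population point: each individual result is no more accurate than ordinary rounding, and only an averaging or integrating stage downstream benefits.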
Maybe we need a DSP-float instead of a decimal data type? Decimals
could be used for testing DSP-float implementations.
I'm not sure what DSP-float really means yet ;-)
HTH, gotta go.

Regards,
Bengt Richter

Discussion Overview
 group python-list categories python posted Jan 30, '04 at 12:49p active Feb 11, '04 at 3:09a posts 40 users 15 website python.org
