FAQ
I'm aggravated by this behavior in python:

x = "4"
print x < 7 # prints False

The issue, of course, is comparisons of incompatible types. In most
languages this throws an error (in Perl the types are converted
silently). In Python this comparison fails silently. The
documentation says: "objects of different types *always* compare
unequal, and are ordered consistently but arbitrarily."

I can't imagine why this design decision was made. I've been bitten by
this several times (reading data from a file and not converting the
numbers before comparison). Can I get this to throw an error instead
of failing silently?

Thanks,
-Tom

Search Discussions

  • Peter Otten at Dec 6, 2010 at 5:04 pm

    TomF wrote:

    I'm aggravated by this behavior in python:

    x = "4"
    print x < 7 # prints False

    The issue, of course, is comparisons of incompatible types. In most
    languages this throws an error (in Perl the types are converted
    silently). In Python this comparison fails silently. The
    documentation says: "objects of different types *always* compare
    unequal, and are ordered consistently but arbitrarily."

    I can't imagine why this design decision was made. I've been bitten by
    this several times (reading data from a file and not converting the
    numbers before comparison). Can I get this to throw an error instead
    of failing silently?
    This change would break a lot of code, so it could not be made within the
    2.x series. However:

    Python 3.1.1+ (r311:74480, Nov 2 2009, 15:45:00)
    [GCC 4.4.1] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    "4" < 7
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    TypeError: unorderable types: str() < int()

    Peter
  • TomF at Dec 6, 2010 at 5:16 pm

    On 2010-12-06 09:04:00 -0800, Peter Otten said:

    TomF wrote:
    I'm aggravated by this behavior in python:

    x = "4"
    print x < 7 # prints False

    The issue, of course, is comparisons of incompatible types. In most
    languages this throws an error (in Perl the types are converted
    silently). In Python this comparison fails silently. The
    documentation says: "objects of different types *always* compare
    unequal, and are ordered consistently but arbitrarily."

    I can't imagine why this design decision was made. I've been bitten by
    this several times (reading data from a file and not converting the
    numbers before comparison). Can I get this to throw an error instead
    of failing silently?
    This change would break a lot of code, so it could not be made within the
    2.x series. However:

    Python 3.1.1+ (r311:74480, Nov 2 2009, 15:45:00)
    [GCC 4.4.1] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    "4" < 7
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    TypeError: unorderable types: str() < int()
    Thanks. I was hoping there was something I could do for 2.x but I
    suppose this will have to do.

    But I'm mystified by your statement, "this change would break a lot of
    code". Given that the semantics are virtually random, how could code
    depend on this?

    -Tom
  • Robert Kern at Dec 6, 2010 at 5:46 pm

    On 12/6/10 11:16 AM, TomF wrote:
    On 2010-12-06 09:04:00 -0800, Peter Otten said:
    This change would break a lot of code, so it could not be made within the
    2.x series. However:

    Python 3.1.1+ (r311:74480, Nov 2 2009, 15:45:00)
    [GCC 4.4.1] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    "4" < 7
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    TypeError: unorderable types: str() < int()
    Thanks. I was hoping there was something I could do for 2.x but I suppose this
    will have to do.

    But I'm mystified by your statement, "this change would break a lot of code".
    Given that the semantics are virtually random, how could code depend on this?
    There are cases where you don't particularly care *what* order is given as long
    as it is consistent. Let's say you want to make sure that two lists have the
    same contents (which may mix types), but you don't care about the order. You
    could just sort each list and then compare the sorted lists. Before sets were
    added to the language, this was a fairly common approach.

    --
    Robert Kern

    "I have come to believe that the whole world is an enigma, a harmless enigma
    that is made terrible by our own mad attempt to interpret it as though it had
    an underlying truth."
    -- Umberto Eco
  • Terry Reedy at Dec 6, 2010 at 7:34 pm

    On 12/6/2010 12:46 PM, Robert Kern wrote:
    On 12/6/10 11:16 AM, TomF wrote:

    Given that the semantics are virtually random, how could code depend
    on this?
    There are cases where you don't particularly care *what* order is given
    as long as it is consistent. Let's say you want to make sure that two
    lists have the same contents (which may mix types), but you don't care
    about the order. You could just sort each list and then compare the
    sorted lists. Before sets were added to the language, this was a fairly
    common approach.
    And indeed, code like this that has not been updated does break in 3.x.
    to some people's annoyance. We really really cannot please everyone ;-).

    --
    Terry Jan Reedy
  • Mark Wooding at Dec 6, 2010 at 9:31 pm

    Terry Reedy <tjreedy at udel.edu> writes:

    And indeed, code like this that has not been updated does break in
    3.x. to some people's annoyance. We really really cannot please
    everyone ;-).
    The problem is that there are too many useful properties that one might
    expect from comparison operators. For example, it's frequently nice to
    have a total ordering on all objects. For real numbers, it's nice that
    the ordering obey the usual ordered-field axioms; but the complex
    numbers don't have an ordering compatible with the field operators, and
    imposing a default ordering (e.g., degree-lexicographic) is probably
    asking for trouble.

    I agree that the Python 3 behaviour is an improvement, by the way.

    -- [mdw]
  • Tim Golden at Dec 6, 2010 at 5:04 pm

    On 06/12/2010 16:59, TomF wrote:
    I'm aggravated by this behavior in python:

    x = "4"
    print x < 7 # prints False

    The issue, of course, is comparisons of incompatible types. In most
    languages this throws an error (in Perl the types are converted
    silently). In Python this comparison fails silently. The documentation
    says: "objects of different types *always* compare unequal, and are
    ordered consistently but arbitrarily."

    I can't imagine why this design decision was made. I've been bitten by
    this several times (reading data from a file and not converting the
    numbers before comparison). Can I get this to throw an error instead of
    failing silently?
    Yes: switch to python 3 where this does throw an exception:

    Python 3.1.2 (r312:79149, Mar 21 2010, 00:41:52
    Type "help", "copyright", "credits" or "license
    7 < "eight"
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    TypeError: unorderable types: int() < str()
    >>>

    TJG
  • Steven D'Aprano at Dec 7, 2010 at 12:17 am

    On Mon, 06 Dec 2010 08:59:12 -0800, TomF wrote:

    I'm aggravated by this behavior in python:

    x = "4"
    print x < 7 # prints False
    I can't imagine why this design decision was made.
    You've never needed to deal with an heterogeneous list?

    data = ["Fred", "Barney", 2, 1, None]
    data.sort()

    Nevertheless, I agree that in hindsight, the ability to sort such lists
    is not as important as the consistency of comparisons.


    --
    Steven
  • John Nagle at Dec 7, 2010 at 11:08 pm

    On 12/6/2010 4:17 PM, Steven D'Aprano wrote:
    On Mon, 06 Dec 2010 08:59:12 -0800, TomF wrote:

    I'm aggravated by this behavior in python:

    x = "4"
    print x< 7 # prints False
    I can't imagine why this design decision was made.
    You've never needed to deal with an heterogeneous list?

    data = ["Fred", "Barney", 2, 1, None]
    data.sort()

    Nevertheless, I agree that in hindsight, the ability to sort such lists
    is not as important as the consistency of comparisons.
    If you're thinking hard about this, I recommend viewing Alexander
    Stepanov's talk at Stanford last month:

    http://www.stanford.edu/class/ee380/Abstracts/101103.html

    He makes the point that, for generic programs to work right, the
    basic operations must have certain well-defined semantics. Then
    the same algorithms will work right across a wide variety of
    objects.

    This is consistent with Python's "duck typing", but inconsistent
    with the current semantics of some operators.

    For example, "+" as concatenation makes "+" non-commutative.
    In other words,

    a + b

    is not always equal to

    b + a

    which is not good.

    Important properties to have across all types:

    a + b == b + a

    Exactly one of

    a > b
    a = b
    a < b

    is true, or an type exception must be raised.

    The basic Boolean identities

    (a or b) == (b or a)
    not (a or b) == (not a) and (not b)
    not (not a) == a

    should all hold, or an type exception should be raised.
    With Python accepting both "True" and "1" as sort of
    equivalent, there are cases where those don't hold.

    John Nagle
  • Carl Banks at Dec 7, 2010 at 11:31 pm

    On Dec 7, 3:08?pm, John Nagle wrote:
    The basic Boolean identities

    ? ? ? ? (a or b) == (b or a)
    ? ? ? ? not (a or b) == (not a) and (not b)
    ? ? ? ? not (not a) == a

    should all hold, or an type exception should be raised.
    With Python accepting both "True" and "1" as sort of
    equivalent, there are cases where those don't hold.
    For better or worse (and I say worse, but YMMV) "and" and "or" are not
    boolean operators in Python but special-form expressions that resemble
    boolean semantics in some instances, but not (as you mention above) in
    others.

    Likewise, the comparison operators <, >, >=, and <= aren't well-
    ordered; sets use these operators to indicate topological ordering.

    IMO having the operators adhere to defined properties would be a good
    thing. It would improve code reusability since the operators could be
    expected to act in consistent ways, but Python isn't that language.
    So you might as well use the operators for whatever seems like it
    works, + for concatenation, > for superset, and so on.


    Carl Banks
  • Mark Wooding at Dec 7, 2010 at 11:59 pm
    John Nagle <nagle at animats.com> writes:

    [Stepanov]
    makes the point that, for generic programs to work right, the basic
    operations must have certain well-defined semantics. Then the same
    algorithms will work right across a wide variety of objects.

    This is consistent with Python's "duck typing", but inconsistent with
    the current semantics of some operators.
    This isn't a disaster. You should check that the arguments define the
    necessary operations and obey the necessary axioms. Python is already
    dynamically typed: this kind of proof-obligation is already endemic in
    Python programming, so you've not lost anything significant.
    For example, "+" as concatenation makes "+" non-commutative. In other
    words,

    a + b

    is not always equal to

    b + a

    which is not good.
    I think I probably agree with this. Concatenation yields a nonabelian
    monoid (usually with identity); `+' is pretty much universally an
    abelian group operator (exception: natural numbers, where it's used in
    an abelian monoid which extends to a group in a relatively obvious way).
    But then we'd need another operator symbol for concatenation.
    Nonnegative integers act on strings properly, but the action doesn't
    distribute over concatenation, which is also a shame. i.e.,

    n*(a + b) != n*a + n*b

    But it's a familiar notation, by no means peculiar to Python, and
    changing it would be difficult.
    Exactly one of

    a > b
    a = b
    a < b

    is true, or an type exception must be raised.
    This will get the numerical people screaming. Non-signalling NaNs are
    useful, and they don't obey these axioms.

    I think, more generally, that requiring a full total order (rather than
    either a preorder or a partial order) is unnecessarily proscriptive.
    Sorting only requires a preorder, for example, i.e., { (a, b) | a <= b
    <= a } is an equivalence relation, and the preorder naturally induces a
    total order on the equivalence classes. Topological sorting requires
    only a partial order, and makes good use of the notation. As an
    example, sets use `<=' to denote subsetness, which is well known to be a
    partial order.

    (I presume you weren't going to deny

    a <= b iff a < b or a == b

    or

    a < b iff b > a

    because that really would be bad.)
    The basic Boolean identities

    (a or b) == (b or a)
    not (a or b) == (not a) and (not b)
    not (not a) == a

    should all hold, or an type exception should be raised.
    The first of these contradicts the axiom

    x => x or _|_ == x

    which is probably more useful. The last can't usefully be true since
    `not' is lossy. But I think that, for all values a, b,

    not (a or b) == not (b or a) == (not a) and (not b)
    not (not (not a)) == not a

    which is probably good enough. (The application of `not' applies a
    boolean coercion, which canonifies adequately.)

    -- [mdw]
  • John Nagle at Dec 8, 2010 at 9:01 pm

    On 12/7/2010 3:59 PM, Mark Wooding wrote:
    Exactly one of
    a> b
    a = b
    a< b

    is true, or an type exception must be raised.
    This will get the numerical people screaming. Non-signalling NaNs are
    useful, and they don't obey these axioms.
    As a sometime numerical person, I've been screaming at this from
    the other side. The problem with comparing non-signalling NaNs is that
    eventually, the program has to make a control flow decision, and it
    may not make it correctly.

    I used to do dynamic simulation engines for animation. I was
    probably the first person to get ragdoll physics to work right,
    back in 1996-1997. In hard collisions, the program would get
    floating point overflows, and I had to abort the interation, back
    up, cut the time step down, and go forward again, until the time
    step was small enough to allow stable integration. This was
    under Windows on x86, where it's possible, in a Windows-dependent
    way, to catch signalling NaNs and turn the hardware exception into
    a C++ exception. If the computation just plowed ahead with
    non-signalling NaNs, with a check at the end, it could go wrong
    and produce bad results, because incorrect branches would be taken
    and the final bogus results might not contain NaNs.

    I personally think that comparing NaN with numbers or other
    NaNs should raise an exception. There's no valid result for
    such comparisons.

    John Nagle
  • Steven D'Aprano at Dec 9, 2010 at 3:30 am

    On Wed, 08 Dec 2010 13:01:28 -0800, John Nagle wrote:
    On 12/7/2010 3:59 PM, Mark Wooding wrote:
    Exactly one of
    a> b
    a = b
    a< b

    is true, or an type exception must be raised.
    This will get the numerical people screaming. Non-signalling NaNs are
    useful, and they don't obey these axioms.
    As a sometime numerical person, I've been screaming at this from
    the other side. The problem with comparing non-signalling NaNs is that
    eventually, the program has to make a control flow decision, and it may
    not make it correctly.
    Then use signalling NANs. Nobody is suggesting that quiet NANs should be
    compulsory, or are the solution for all problems. But they're a solution
    for some problems, which is why people use them.

    [...]
    I personally think that comparing NaN with numbers or other
    NaNs should raise an exception. There's no valid result for such
    comparisons.
    If NAN and 1 are unordered, then NAN is not less or equal to 1, nor is it
    larger than 1. Hence both NAN <= 1 and NAN >= 1 are false. The problem
    only comes when the caller mistakenly thinks that floats are real
    numbers, and tries to reason about floats like they would reason about
    real numbers.



    --
    Steven
  • Geremy condra at Dec 9, 2010 at 3:56 am

    On Wed, Dec 8, 2010 at 1:01 PM, John Nagle wrote:
    On 12/7/2010 3:59 PM, Mark Wooding wrote:

    Exactly one of
    ? ? ? a> ?b
    ? ? ? a = b
    ? ? ? a< ?b

    ?is true, or an type exception must be raised.
    This will get the numerical people screaming. ?Non-signalling NaNs are
    useful, and they don't obey these axioms.
    ? As a sometime numerical person, I've been screaming at this from
    the other side. ? The problem with comparing non-signalling NaNs is that
    eventually, the program has to make a control flow decision, and it
    may not make it correctly.

    ? I used to do dynamic simulation engines for animation. ?I was
    probably the first person to get ragdoll physics to work right,
    back in 1996-1997. ?In hard collisions, the program would get
    floating point overflows, and I had to abort the interation, back
    up, cut the time step down, and go forward again, until the time
    step was small enough to allow stable integration. ?This was
    under Windows on x86, where it's possible, in a Windows-dependent
    way, to catch signalling NaNs and turn the hardware exception into
    a C++ exception. ?If the computation just plowed ahead with
    non-signalling NaNs, with a check at the end, it could go wrong
    and produce bad results, because incorrect branches would be taken
    and the final bogus results might not contain NaNs.

    ? I personally think that comparing NaN with numbers or other
    NaNs should raise an exception. ?There's no valid result for
    such comparisons.
    This, in big letters.

    Geremy Condra
  • John Nagle at Dec 9, 2010 at 4:16 am

    On 12/8/2010 7:56 PM, geremy condra wrote:
    On Wed, Dec 8, 2010 at 1:01 PM, John Naglewrote:
    On 12/7/2010 3:59 PM, Mark Wooding wrote:

    Exactly one of
    a > b
    a = b
    a < b

    is true, or an type exception must be raised.
    Here's an example where this issue produces invalid results in Python.
    NaN = float("nan")
    arr = [1.0, 4.0, 3.0, 2.0, 5.0, NaN, 6.0, 3.0, NaN, 0.0, 1.0, 4.0,
    3.0, 2.0, 5.0, NaN, 6.0, 3.0, NaN, 0.0]
    sorted(arr)
    [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 4.0, 5.0, nan, 5.0,
    6.0, nan, 4.0, nan, 6.0, nan]

    The sorted numerical values aren't in order. Note the 4.0 near the
    end, after the 6.0. "sort" has failed because it assumes that
    a < b and b < c implies a < c. But that's not a valid assumption here.

    It's not good to break trichotomy.

    John Nagle
  • Steven D'Aprano at Dec 9, 2010 at 7:58 am

    On Wed, 08 Dec 2010 20:16:57 -0800, John Nagle wrote:

    Here's an example where this issue produces invalid results in
    Python.
    NaN = float("nan")
    arr = [1.0, 4.0, 3.0, 2.0, 5.0, NaN, 6.0, 3.0, NaN, 0.0, 1.0, 4.0,
    3.0, 2.0, 5.0, NaN, 6.0, 3.0, NaN, 0.0]
    sorted(arr)
    [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 4.0, 5.0, nan, 5.0,
    6.0, nan, 4.0, nan, 6.0, nan]

    The sorted numerical values aren't in order. Note the 4.0 near the end,
    after the 6.0. "sort" has failed because it assumes that a < b and b <
    c implies a < c. But that's not a valid assumption here.

    It's not good to break trichotomy.
    It's perfectly fine to break trichotomy. People do it all the time --
    they prefer chicken to pizza, pizza to steak, but prefer steak to
    chicken. (Modulo whatever foods you prefer. Or sports, or politicians, or
    any one of many things which don't make up an equivalence relationship.)
    Equivalence relationships make up only a tiny portion of relationships,
    and it is both useful and conventional to use the same operators in both
    situations.

    What's not good is to assume trichotomy when it doesn't apply, but that's
    no different from any other faulty assumption, e.g. a coder who assumes
    multiplication is always commutative may be puzzled why his matrix
    calculations are frequently wrong.

    Python's sort assumes trichotomy, even when sorting floats. Perhaps it
    shouldn't. But one way or another, that's an issue with sort, not with
    the idea that there are data types where trichotomy doesn't apply. And
    like it or not, floats are one of those data types, just as neither
    commutativity nor associativity applies to floats -- even excluding NANs
    and INFs.



    --
    Steven
  • Terry Reedy at Dec 9, 2010 at 8:36 am

    On 12/9/2010 2:58 AM, Steven D'Aprano wrote:
    On Wed, 08 Dec 2010 20:16:57 -0800, John Nagle wrote:

    Here's an example where this issue produces invalid results in
    Python.
    NaN = float("nan")
    arr = [1.0, 4.0, 3.0, 2.0, 5.0, NaN, 6.0, 3.0, NaN, 0.0, 1.0, 4.0,
    3.0, 2.0, 5.0, NaN, 6.0, 3.0, NaN, 0.0]
    sorted(arr)
    [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 4.0, 5.0, nan, 5.0,
    6.0, nan, 4.0, nan, 6.0, nan]

    The sorted numerical values aren't in order. Note the 4.0 near the end,
    after the 6.0. "sort" has failed because it assumes that a< b and b<
    c implies a< c. But that's not a valid assumption here.
    This is transitivity.
    It's not good to break trichotomy.
    I believe that is that exactly one of <,=.> are true.



    --
    Terry Jan Reedy
  • John Nagle at Dec 9, 2010 at 5:34 pm

    On 12/9/2010 12:36 AM, Terry Reedy wrote:
    On 12/9/2010 2:58 AM, Steven D'Aprano wrote:
    On Wed, 08 Dec 2010 20:16:57 -0800, John Nagle wrote:
    I believe that is that exactly one of <,=.> are true.
    Not for NaNs.
    NaN = float('nan')
    NaN == NaN
    False
    NaN > NaN
    False
    NaN < NaN
    False

    That's IEEE 754 compliant. NaN is not equal to NaN.
    That's by design. But it really messes up sorting.

    Python "dict" types, however, treat "NaN" as a unique value,
    because they're hash based on the underlying representation.

    (The best coverage of this whole topic was the Apple Numerics Manual
    for the original Mac. Apple hired the floating point expert from
    Berkeley to get this right. Then Apple went from the M68xxx series
    to the IBM PowerPC, and 80-bit floats to 64-bit floats, breaking all
    the engineering applications, most of which were never ported to the
    PowerPC.)

    John Nagle
  • Steven D'Aprano at Dec 9, 2010 at 11:23 pm

    On Thu, 09 Dec 2010 09:34:19 -0800, John Nagle wrote:

    (The best coverage of this whole topic was the Apple Numerics Manual
    for the original Mac. Apple hired the floating point expert from
    Berkeley to get this right. Then Apple went from the M68xxx series to
    the IBM PowerPC, and 80-bit floats to 64-bit floats, breaking all the
    engineering applications, most of which were never ported to the
    PowerPC.)
    I second John's recommendation re the Apple Numerics Manual. Even if the
    Apple specific stuff is obsolete, it's a great resource for understanding
    floating point issues.


    --
    Steven
  • Mark Wooding at Dec 9, 2010 at 12:21 pm

    John Nagle <nagle at animats.com> writes:

    NaN = float("nan")
    arr = [1.0, 4.0, 3.0, 2.0, 5.0, NaN, 6.0, 3.0, NaN, 0.0, 1.0, 4.0,
    3.0, 2.0, 5.0, NaN, 6.0, 3.0, NaN, 0.0]
    sorted(arr)
    [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 4.0, 5.0, nan, 5.0, 6.0,
    nan, 4.0, nan, 6.0, nan]

    The sorted numerical values aren't in order.
    Indeed. You failed to provide a valid ordering to `sorted'. By failing
    to satisfy its precondition, you relieved it of its obligation to
    satisfy its postcondition.
    "sort" has failed because it assumes that a < b and b < c implies a <
    c. But that's not a valid assumption here.

    It's not good to break trichotomy.
    You're confused. The property

    a < b and b < c => a < c

    is called `transitivity'. But the `float' ordering /is/ transitive.
    Note that a < b implies that neither a nor b is a NaN.

    Also, trichotomy is unnecessary for sorting, and Python's `sort' method
    doesn't require it; it does require a total preorder, though. What
    properties does a total preorder require?

    1. Transitivity: a <= b and b <= c => a <= c

    2. Totality: a <= b or b <= a

    The above list sorting goes wrong because the `float' ordering isn't
    total. In particular, x </= NaN and NaN </= x for all x (including x =
    NaN!).

    So, your last remark is in the right direction (though strict trichotomy
    is unnecessary, as I've mentioned), but apparently only by accident
    since the preceding discussion is completely wrong.

    -- [mdw]
  • Steven D'Aprano at Dec 9, 2010 at 11:11 pm

    On Thu, 09 Dec 2010 12:21:45 +0000, Mark Wooding wrote:

    John Nagle <nagle at animats.com> writes:
    "sort" has failed because it assumes that a < b and b < c implies a <
    c. But that's not a valid assumption here.

    It's not good to break trichotomy.
    You're confused. The property

    a < b and b < c => a < c

    is called `transitivity'.
    Yes, but I believe that John is referring to the trichotomy (like a
    dichotomy, only there are three options instead of two) that exactly one
    of these relations are true:

    a < b
    a == b
    a > b

    It is true for real numbers, but not for IEEE floats, since none of the
    three are true if either a or b are a NAN. (Non-IEEE floats could do
    anything...)


    [...]
    2. Totality: a <= b or b <= a

    The above list sorting goes wrong because the `float' ordering isn't
    total. In particular, x </= NaN and NaN </= x for all x (including x =
    NaN!).
    I believe this is equivalent to trichotomy.

    So, your last remark is in the right direction (though strict trichotomy
    is unnecessary, as I've mentioned), but apparently only by accident
    since the preceding discussion is completely wrong.
    "Completely" is surely a tad strong -- John might not be using the exact
    same terminology as you, but he's clearly talking about the equivalent
    concepts. He wants and expects all data types to either meet a total
    order, or raise an exception to any of the < > <= and >= operators.


    --
    Steven
  • Mark Wooding at Dec 10, 2010 at 1:55 pm

    Steven D'Aprano <steve+comp.lang.python at pearwood.info> writes:
    On Thu, 09 Dec 2010 12:21:45 +0000, Mark Wooding wrote:

    John Nagle <nagle at animats.com> writes:
    "sort" has failed because it assumes that a < b and b < c implies a <
    c. But that's not a valid assumption here.

    It's not good to break trichotomy.
    You're confused. The property

    a < b and b < c => a < c

    is called `transitivity'.
    Yes, but I believe that John is referring to the trichotomy (like a
    dichotomy,
    Then why did he say `it assumes that a < b and b < c implies a < c'?
    This assumption is transitivity, not trichotomy.
    2. Totality: a <= b or b <= a

    The above list sorting goes wrong because the `float' ordering isn't
    total. In particular, x </= NaN and NaN </= x for all x (including x =
    NaN!).
    I believe this is equivalent to trichotomy.
    No, it isn't. In particular, the definition of totality I gave above
    allows a <= b and b <= a and a /= b. You gave a perfectly good
    definition of trichotomy, which I have moved out of its original order:
    exactly one of these relations are true:

    a < b
    a == b
    a > b
    A total preorder (as defined above) doesn't exhibit this property -- but
    can be described in terms of a total order (which /does/ exhibit proper
    trichotomy) applied to a set of equivalence classes.
    So, your last remark is in the right direction (though strict trichotomy
    is unnecessary, as I've mentioned), but apparently only by accident
    since the preceding discussion is completely wrong.
    "Completely" is surely a tad strong -- John might not be using the
    exact same terminology as you, but he's clearly talking about the
    equivalent concepts. He wants and expects all data types to either
    meet a total order, or raise an exception to any of the < > <= and >=
    operators.
    No. He was hopelessly confused, describing the problem in terms of a
    failure of transitivity (which doesn't, in fact, fail), but using the
    word `trichotomy', apparently more by luck than judgement.

    I don't want to insist on a total order. Efficient sorting requires a
    total preorder, and that's all I want.

    -- [mdw]
  • Steven D'Aprano at Dec 8, 2010 at 10:29 am

    On Tue, 07 Dec 2010 15:08:23 -0800, John Nagle wrote:

    If you're thinking hard about this, I recommend viewing Alexander
    Stepanov's talk at Stanford last month:

    http://www.stanford.edu/class/ee380/Abstracts/101103.html

    He makes the point that, for generic programs to work right, the basic
    operations must have certain well-defined semantics. Then the same
    algorithms will work right across a wide variety of objects.
    But they already work right across a wide variety of objects, so long as
    you limit yourself to the subset of objects where the basic operations
    have the same semantics.

    I think that insisting that all operators must always have the same
    semantics is as impractical and unnecessary as insisting that all
    functions and methods with the same name must always have the same
    semantics. We wouldn't expect

    pencil.draw
    six_shooter.draw
    game.draw

    to all have the same semantics, or

    math.sin
    priest.sin

    despite the inconvenience it makes to duck-typing. Why should we expect
    more from operators than we expect from named functions, when there are
    so many more named functions and so few useful symbols for operators?

    To my mind, it is foolish for us to expect x*y to always have the same
    semantics when even mathematicians don't expect that. In pure
    mathematics, x*y != y*x for any of the following:

    matrices
    quaternions
    octonions

    and probably many others I don't know about.

    This is consistent with Python's "duck typing", but inconsistent
    with the current semantics of some operators.

    For example, "+" as concatenation makes "+" non-commutative.
    No, it only makes + non-commutative for those types where + is non-
    commutative.

    In other words,

    a + b

    is not always equal to

    b + a

    which is not good.
    I don't see why. It seems to me that it's only a bad thing if you hope to
    reason about the meaning of a+b without knowing what a and b actually are.

    Personally, I don't consider that a particularly useful trait.

    Important properties to have across all types:

    a + b == b + a

    Exactly one of

    a > b
    a = b
    a < b

    is true, or an type exception must be raised.
    As Mark Wooding has already pointed out, that would make numeric
    programmers mad, as it eliminates NANs, which are far more important to
    them. And me.

    It also would make it impossible to use > and < to talk about rankings in
    natural hierarchies, such as (say) pecking orders. Using > to mean "out-
    ranks", you might have a pecking order among five hens like this:

    A > B > C > D > E

    but

    D > B

    Not all comparisons are equivalence relations, and it would be a crying
    shame to lose the ability to use > and < to discuss (e.g.) non-transitive
    comparisons.



    --
    Steven
  • Carl Banks at Dec 7, 2010 at 11:20 pm

    On Dec 6, 4:17?pm, Steven D'Aprano <steve +comp.lang.pyt... at pearwood.info> wrote:
    On Mon, 06 Dec 2010 08:59:12 -0800, TomF wrote:
    I'm aggravated by this behavior in python:
    x = "4"
    print x < 7 ? ?# prints False
    I can't imagine why this design decision was made.
    You've never needed to deal with an heterogeneous list?

    data = ["Fred", "Barney", 2, 1, None]
    data.sort()
    Not once, ever.

    Nevertheless, I agree that in hindsight, the ability to sort such lists
    is not as important as the consistency of comparisons.
    I think that feeling the need to sort non-homogenous lists is
    indictative of bad design.

    If the order of the items doesn't matter, then there must be some
    small bit of homogeneity to exploit to use as a sort criterion. In
    that case you should use key= parameter or DSU.


    Carl Banks
  • Mark Wooding at Dec 8, 2010 at 12:09 am

    Carl Banks <pavlovevidence at gmail.com> writes:

    I think that feeling the need to sort non-homogenous lists is
    indictative of bad design.
    Here's a reason you might want to.

    You're given an object, and you want to compute a hash of it. (Maybe
    you want to see whether someone else's object is the same as yours, but
    don't want to disclose the actual object, say.) To hash it, you'll need
    to serialize it somehow. But here's a problem: objects like
    dictionaries and sets don't impose an ordering on their elements. For
    example, the set { 1, 'two' } is the same as the set { 'two', 1 } -- but
    iterating the two might well yield the elements in a different order.
    (The internal details of a hash table tend to reflect the history of
    operations on the hash table as well as its current contents.)

    The obvious answer is to apply a canonical ordering to unordered objects
    like sets and dictionaries. A set can be serialized with its elements
    in ascending order; a dictionary can be serialized as key/value pairs
    with the keys in ascending order. But to do this, you need an
    (arbitrary, total) order on all objects which might be set elements or
    dictionary keys. The order also needs to be dependent only on the
    objects' serializable values, and not on any incidental facts such as
    memory addresses or whatever.

    -- [mdw]
  • TomF at Dec 8, 2010 at 3:39 am

    On 2010-12-07 16:09:17 -0800, Mark Wooding said:

    Carl Banks <pavlovevidence at gmail.com> writes:
    I think that feeling the need to sort non-homogenous lists is
    indictative of bad design.
    Here's a reason you might want to.

    You're given an object, and you want to compute a hash of it. (Maybe
    you want to see whether someone else's object is the same as yours, but
    don't want to disclose the actual object, say.) To hash it, you'll need
    to serialize it somehow. But here's a problem: objects like
    dictionaries and sets don't impose an ordering on their elements. For
    example, the set { 1, 'two' } is the same as the set { 'two', 1 } -- but
    iterating the two might well yield the elements in a different order.
    (The internal details of a hash table tend to reflect the history of
    operations on the hash table as well as its current contents.)

    The obvious answer is to apply a canonical ordering to unordered objects
    like sets and dictionaries. A set can be serialized with its elements
    in ascending order; a dictionary can be serialized as key/value pairs
    with the keys in ascending order. But to do this, you need an
    (arbitrary, total) order on all objects which might be set elements or
    dictionary keys. The order also needs to be dependent only on the
    objects' serializable values, and not on any incidental facts such as
    memory addresses or whatever.
    I have no argument that there might be an extra-logical use for such an
    ordering which you might find convenient. This is the point you're
    making. sort() and sorted() both take a cmp argument for this sort of
    thing. My complaint is with Python adopting nonsensical semantics
    ("shoe" < 7) to accomodate it.

    By analogy, I often find it convenient to have division by zero return
    0 to the caller for use in calculations. But if Python defined 0/0==0
    I'd consider it broken.

    -Tom
  • Ben Finney at Dec 8, 2010 at 12:58 am

    Carl Banks <pavlovevidence at gmail.com> writes:

    On Dec 6, 4:17?pm, Steven D'Aprano <steve
    +comp.lang.pyt... at pearwood.info> wrote:
    Nevertheless, I agree that in hindsight, the ability to sort such
    lists is not as important as the consistency of comparisons.
    I think that feeling the need to sort non-homogenous lists is
    indictative of bad design.
    It can also be indicative of code written for a Python that doesn't have
    sets.

    Comparing two list objects to see whether they have the same items in
    any sequence can be done with::

    set(foolist) == set(barlist)

    but, if there is no ?set? type, it's fine to write::

    sorted(foolist) == sorted(barlist)

    So there's no design error in wanting heterogenerous sequences to sort;
    it can be quite Pythonic (until the advent of the ?set? type).

    And, of course, that code needs to continue to work in Python 2.x;
    hence, the comparison of incompatible types cannot be fixed without
    breaking backward compatibility. Hence it's not fixed until Python 3.x.

    --
    \ ?All opinions are not equal. Some are a very great deal more |
    `\ robust, sophisticated and well supported in logic and argument |
    _o__) than others.? ?Douglas Adams |
    Ben Finney
  • Paul Rubin at Dec 8, 2010 at 1:04 am

    Ben Finney <ben+python at benfinney.id.au> writes:
    but, if there is no ?set? type, it's fine to write::
    sorted(foolist) == sorted(barlist)
    what about dictionaries?
  • Ben Finney at Dec 8, 2010 at 1:11 am

    Paul Rubin <no.email at nospam.invalid> writes:

    Ben Finney <ben+python at benfinney.id.au> writes:
    but, if there is no ?set? type, it's fine to write::
    sorted(foolist) == sorted(barlist)
    what about dictionaries?
    Creating a needless dict for each list would make the code even less
    clear, I'd think. (We're talking here about design, which is why clarity
    of communicating semantic intent is a concern.)

    Of course, the ?set? type is an even clearer way of communicating the
    intent; so, the above doesn't need to work in Python 3, which can break
    code written before the advent of the ?set? type.

    --
    \ ?But Marge, what if we chose the wrong religion? Each week we |
    `\ just make God madder and madder.? ?Homer, _The Simpsons_ |
    _o__) |
    Ben Finney
  • Steven D'Aprano at Dec 8, 2010 at 9:47 am

    On Wed, 08 Dec 2010 11:58:03 +1100, Ben Finney wrote:

    Carl Banks <pavlovevidence at gmail.com> writes:
    On Dec 6, 4:17?pm, Steven D'Aprano <steve
    +comp.lang.pyt... at pearwood.info> wrote:
    Nevertheless, I agree that in hindsight, the ability to sort such
    lists is not as important as the consistency of comparisons.
    I think that feeling the need to sort non-homogenous lists is
    indictative of bad design.
    It can also be indicative of code written for a Python that doesn't have
    sets.
    Or a list that contains unhashable objects.

    Or a list that needs to be presented to a human reader in some arbitrary
    but consistent order.

    Or a doctest that needs to show the keys in a dict:
    d = myfunction()
    sorted(d.keys())
    ['ham', 'spam', 42, None]

    (although that case is probably the weakest of the three).


    So there's no design error in wanting heterogenerous sequences to sort;
    it can be quite Pythonic (until the advent of the ?set? type).
    Agreed, but in hindsight I think it would be better if there was a
    separate lexicographic sort function, that guaranteed to sort anything
    (including such unorderable values as complex numbers!), without relying
    on the vagaries of the standard comparison operators.

    Or at least anything printable, in which case sorted() with a key
    function of lambda obj: (repr(type(obj)), repr(obj)) might work, I
    suppose...

    Then at least we could limit our arguments to how this hypothetical
    lexicographic sort function was broken, instead of how all comparison
    operators are broken :)



    --
    Steven
  • Tim Chase at Dec 8, 2010 at 12:29 pm

    On 12/08/2010 03:47 AM, Steven D'Aprano wrote:
    Or a list that needs to be presented to a human reader in some arbitrary
    but consistent order.

    Or a doctest that needs to show the keys in a dict:
    d = myfunction()
    sorted(d.keys())
    ['ham', 'spam', 42, None] [snip]
    Agreed, but in hindsight I think it would be better if there was a
    separate lexicographic sort function, that guaranteed to sort anything
    (including such unorderable values as complex numbers!), without relying
    on the vagaries of the standard comparison operators.
    wouldn't that be something like

    sorted(mixedstuff, key=str)

    or if all you need is a stable order regardless of what that
    order is, one could even get away with:

    sorted(mixedstuff, key=id)

    -tkc
  • John Nagle at Dec 8, 2010 at 10:49 pm

    On 12/8/2010 1:47 AM, Steven D'Aprano wrote:
    On Wed, 08 Dec 2010 11:58:03 +1100, Ben Finney wrote:

    Carl Banks<pavlovevidence at gmail.com> writes:
    On Dec 6, 4:17 pm, Steven D'Aprano<steve
    +comp.lang.pyt... at pearwood.info> wrote:
    Nevertheless, I agree that in hindsight, the ability to sort such
    lists is not as important as the consistency of comparisons.
    I think that feeling the need to sort non-homogenous lists is
    indicative of bad design.
    It can also be indicative of code written for a Python that doesn't have
    sets.
    Or a list that contains unhashable objects.
    If you can't hash it, and it doesn't have some definition of
    comparison associated with the object, you probably can't order
    it properly, either.

    "<" can't be some random function. For sorting to work,

    a < b and b < c implies a < c

    must hold.

    John Nagle
  • BartC at Dec 8, 2010 at 1:24 am
    "Carl Banks" <pavlovevidence at gmail.com> wrote in message
    news:bf4be9a7-a079-4454-9969-60e9be305a03 at k14g2000pre.googlegroups.com...
    On Dec 6, 4:17 pm, Steven D'Aprano <steve
    +comp.lang.pyt... at pearwood.info> wrote:
    On Mon, 06 Dec 2010 08:59:12 -0800, TomF wrote:
    I'm aggravated by this behavior in python:
    x = "4"
    print x < 7 # prints False
    I can't imagine why this design decision was made.
    You've never needed to deal with an heterogeneous list?

    data = ["Fred", "Barney", 2, 1, None]
    data.sort()
    Not once, ever.

    Nevertheless, I agree that in hindsight, the ability to sort such lists
    is not as important as the consistency of comparisons.
    I think that feeling the need to sort non-homogenous lists is
    indictative of bad design.
    Using a simple "<" comparison, perhaps. But can't a list be sorted by other
    criteria? For example, by comparing the string representations of each
    element.

    So some sorts will make sense, and others (such as "<" or ">") won't.

    --
    Bartc

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppython-list @
categoriespython
postedDec 6, '10 at 4:59p
activeDec 10, '10 at 1:55p
posts33
users14
websitepython.org

People

Translate

site design / logo © 2022 Grokbase