[m-dev.] [m-users.] Comparison in the reference manual
Julien Fischer
2015-11-15 22:38:28 UTC
Hi Mark,
Moved to developers list...
While answering a question on IRC I tried to find the relevant part in the
reference manual for comparison. Specifically that the order of functors in
a type declaration is significant. But I could not find it.
Regarding the more general question, the standard orderings for char
and string are documented as system-dependent in the corresponding
library modules.
I'm quite sure ints and floats were intended to have the mathematical
ordering, with the usual caveats about floating point equality; I know
I have written code that relies on this. But it's undocumented and
thus undefined, strictly speaking.
Tuples are undocumented. Comparison for pred and func types can't be
implemented, so the point is moot.
I'm wondering, can the language define a bit more than it has thus
far?
I think that's a good idea; I suspect most users have been assuming more
than the language actually guarantees anyway.
The idea would be to provide some useful theorems, while being
careful not to constrain implementations too much. Here are some
int: Define it as having the mathematical ordering, and require
implementations to disallow overflowing literals. The latter is so you
don't have 9223372036854775808 < 0 being true. (You may still have
9223372036854775807 + 1 < 0, since the + operation is undefined on
overflow.)
float: We can at least say that < and > never contradict the
mathematical ordering, even if = can due to lack of precision. Can we
assume implementations may use +/- infinity?
If we are assuming IEEE 754 floats, yes. For all of our current
backends that is definitely yes.
Otherwise a caveat on overflowing float literals may be needed.
(Incidentally, for an overflowing float literal I get an error from
the C compiler about infinity being undeclared.)
Do you mean along the lines of bug #146? (The intended fix is that
the compiler should replace overflowing float literals with calls to
float.infinity/0 or -float.infinity/0.)
char: Are these officially unicode now?
Yes.
We could define them as having the code point ordering.
string: There are probably good reasons to keep this ordering
undefined. But maybe we could say something about the ascii subset?
tuples: These exist as a convenience to the user. Would defining them
as having the obvious ordering be likely ever to cause problems for
implementations?
No, I can't see that it would cause any problems.
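As an illustration only (in Python rather than Mercury), the "obvious" ordering for tuples is presumably the lexicographic one: compare the first components, and look at later components only on a tie. A minimal sketch under that assumption; `compare_tuples` is a hypothetical helper, and nothing here reflects how any Mercury backend actually implements tuple comparison:

```python
def compare_tuples(a, b):
    """Lexicographic three-way comparison of two tuples of equal arity.

    Returns "<", "=" or ">", echoing the result names of Mercury's
    compare/3. The helper itself is hypothetical.
    """
    if len(a) != len(b):
        raise ValueError("tuples must have the same arity")
    for x, y in zip(a, b):
        if x < y:
            return "<"
        if x > y:
            return ">"
    return "="


print(compare_tuples((1, "b"), (1, "c")))  # decided by the second component
print(compare_tuples((2, "a"), (1, "z")))  # decided by the first component
print(compare_tuples((1, "b"), (1, "b")))  # all components equal
```

Because each component is compared with its own ordering, a tuple ordering defined this way is only as well specified as the orderings of its component types.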

Julien.
Mark Brown
2015-11-16 13:21:13 UTC
Post by Julien Fischer
Hi Mark,
Moved to developers list...
I'm wondering, can the language define a bit more than it has thus
far?
I think that's a good idea; I suspect most users have been assuming more
than the language actually guarantees anyway.
Attached diff is for review. Preferably by everyone, since it changes
the language and is a small diff.

The actual changes are still up for discussion.
Post by Julien Fischer
The idea would be to provide some useful theorems, while being
careful not to constrain implementations too much. Here are some
int: Define it as having the mathematical ordering, and require
implementations to disallow overflowing literals. The latter is so you
don't have 9223372036854775808 < 0 being true. (You may still have
9223372036854775807 + 1 < 0, since the + operation is undefined on
overflow.)
float: We can at least say that < and > never contradict the
mathematical ordering, even if = can due to lack of precision. Can we
assume implementations may use +/- infinity?
If we are assuming IEEE 754 floats, yes. For all of our current
backends that is definitely yes.
For ints and floats I've decided to keep it conservative, and just say
that implementations are allowed to be inconsistent if literals are
out-of-range.
Post by Julien Fischer
Otherwise a caveat on overflowing float literals may be needed.
(Incidentally, for an overflowing float literal I get an error from
the C compiler about infinity being undeclared.)
Do you mean along the lines of bug #146? (The intended fix is that
the compiler should replace overflowing float literals with calls to
float.infinity/0 or -float.infinity/0.)
Yes.
Post by Julien Fischer
char: Are these officially unicode now?
Yes.
We could define them as having the code point ordering.
string: There are probably good reasons to keep this ordering
undefined. But maybe we could say something about the ascii subset?
tuples: These exist as a convenience to the user. Would defining them
as having the obvious ordering be likely ever to cause problems for
implementations?
No, I can't see that it would cause any problems.
Likewise for discriminated union types where both values have the same
principal constructor, I suppose. I've documented that too, but I'm
not so sure about that one. If anyone objects, please speak up.

Mark
Paul Bone
2015-11-16 23:57:15 UTC
Post by Mark Brown
Post by Julien Fischer
Hi Mark,
Moved to developers list...
I'm wondering, can the language define a bit more than it has thus
far?
I think that's a good idea; I suspect most users have been assuming more
than the language actually guarantees anyway.
Attached diff is for review. Preferably by everyone, since it changes
the language and is a small diff.
The actual changes are still up for discussion.
This makes sense to me. I see neither significant issues nor trivial
errors (spelling). I'd be happy with this, but since it's a language
definition thing I'd suggest getting 1-2 more votes before committing it.

Cheers.
--
Paul Bone
Julien Fischer
2015-11-23 00:10:08 UTC
Hi Mark,
Post by Mark Brown
Post by Julien Fischer
float: We can at least say that < and > never contradict the
mathematical ordering, even if = can due to lack of precision. Can we
assume implementations may use +/- infinity?
If we are assuming IEEE 754 floats, yes. For all of our current
backends that is definitely yes.
For ints and floats I've decided to keep it conservative, and just say
that implementations are allowed to be inconsistent if literals are
out-of-range.
I think that's too conservative. The implementation should just reject
int literals that will overflow (i.e. what it does now). For systems
with IEEE floats we should replace overflows with +/- infinity. What to
do on systems that don't provide IEEE fp is a more open question, but we
don't currently have any of those.
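
A rough sketch of the two policies described above, written in Python for illustration only: an out-of-range int literal is rejected, while an out-of-range float literal becomes the infinity of the same sign. The 64-bit range and the helper names `check_int_literal` and `read_float_literal` are assumptions made for this example; they are not Mercury compiler functions.

```python
import math

# Assumption: ints are 64-bit two's complement.
INT_MIN = -(2 ** 63)
INT_MAX = 2 ** 63 - 1


def check_int_literal(text):
    """Reject an int literal that does not fit in the assumed 64-bit range."""
    value = int(text)
    if not INT_MIN <= value <= INT_MAX:
        raise OverflowError(f"integer literal out of range: {text}")
    return value


def read_float_literal(text):
    """Read a float literal; Python's float() already maps an
    overflowing decimal literal to the infinity of the same sign."""
    return float(text)


print(check_int_literal("9223372036854775807"))     # largest accepted value
print(read_float_literal("1e999") == math.inf)      # True
print(read_float_literal("-1e999") == -math.inf)    # True
```

With the int policy, `check_int_literal("9223372036854775808")` raises instead of silently wrapping to a negative value.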

Question: should we just require the use of IEEE floats for Mercury?
Post by Mark Brown
diff --git a/doc/reference_manual.texi b/doc/reference_manual.texi
index f5c7e51..971f938 100644
--- a/doc/reference_manual.texi
+++ b/doc/reference_manual.texi
...
Post by Mark Brown
+the standard ordering approximates the usual numerical ordering.
I see what you did there!
Post by Mark Brown
+then this relation holds in the numerical ordering,
Ignoring the matter of NaNs for a moment, the order for floats should be
specified as it is in, for example, Java and C#:

-infinity < -max < ... < -min < -0.0 == +0.0 < +min < ... < +max < +infinity
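
The chain above can be checked against IEEE 754 doubles. A small Python sketch, assuming Python floats are IEEE 754 binary64 (true on all common platforms); `sys.float_info.min` is the smallest positive *normal* value and is used here as a stand-in for "min":

```python
import math
import sys

max_f = sys.float_info.max
min_f = sys.float_info.min  # smallest positive normal double

chain = [-math.inf, -max_f, -min_f, -0.0, 0.0, min_f, max_f, math.inf]

# Every adjacent pair is strictly increasing, except the two zeros,
# which compare equal.
for a, b in zip(chain, chain[1:]):
    if a == 0.0 and b == 0.0:
        assert a == b
    else:
        assert a < b

# The zeros compare equal even though their sign bits differ.
print(-0.0 == 0.0)               # True
print(math.copysign(1.0, -0.0))  # -1.0
```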


Not-a-number (NaN) values complicate matters. The current implementation of
builtin.compare/3 is broken; for example, the following evaluates to true:

NaN = 0.0 * infinity,
compare((=), NaN, 0.0)

but:

NaN = 0.0 * infinity,
unify(NaN, 0.0)

evaluates to false.
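
For reference, IEEE 754 treats NaN as unordered: each of <, = and > is false when either operand is NaN, so a three-way comparison has no consistent answer to give. A Python illustration, with the NaN built as 0.0 * infinity as in the example above. The `naive_compare` helper is hypothetical; it shows one plausible way such an inconsistency can arise and is not a claim about how builtin.compare/3 is actually implemented:

```python
import math

nan = 0.0 * math.inf  # same construction as above; yields NaN

print(math.isnan(nan))  # True
print(nan == 0.0)       # False
print(nan < 0.0)        # False
print(nan > 0.0)        # False
print(nan == nan)       # False: NaN is not even equal to itself


def naive_compare(x, y):
    """Hypothetical three-way compare that falls through to "="."""
    if x < y:
        return "<"
    if x > y:
        return ">"
    return "="


# Reports "=" even though nan == 0.0 is False.
print(naive_compare(nan, 0.0))
```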

The IEEE 754-2008 standard (section 5.10) defines a total ordering on float
values, but I don't think that will work for us. Java's Double.compareTo
method treats NaN values as equal to themselves and greater than all other
float values (including +infinity), mainly so that you can use them in data
structures. IIRC, C# does something similar.

My inclination is that we should simply make builtin.compare/3 throw an
exception for NaN values on the basis that they are not ordered w.r.t. other
float values (or indeed themselves).
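
The two options discussed above, sketched in Python with hypothetical helper names. `compare_or_throw` raises on NaN, in the spirit of the proposal for builtin.compare/3; `java_style_compare` follows the Java convention of treating NaN as equal to itself and greater than every other float, including +infinity. Java's Double.compareTo also orders -0.0 below +0.0; that detail is deliberately left out here.

```python
import math


def compare_or_throw(x, y):
    """Three-way compare that refuses to order NaN values."""
    if math.isnan(x) or math.isnan(y):
        raise ValueError("cannot compare NaN")
    return "<" if x < y else ">" if x > y else "="


def java_style_compare(x, y):
    """Total order: NaN equals itself and sorts above every other float."""
    x_nan, y_nan = math.isnan(x), math.isnan(y)
    if x_nan or y_nan:
        return "=" if x_nan and y_nan else ">" if x_nan else "<"
    return "<" if x < y else ">" if x > y else "="


nan = math.nan
print(java_style_compare(nan, math.inf))  # ">"
print(java_style_compare(nan, nan))       # "="
print(compare_or_throw(-0.0, 0.0))        # "="
```

The total-order variant keeps sorted containers usable when a NaN slips in; the throwing variant surfaces the NaN at the point of comparison instead.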

...
Post by Mark Brown
+the numerical ordering of the Unicode code point values.
For completeness, you should say something about strings here too (e.g.
what's currently in the comment at the head of the string module.)

Julien.
Mark Brown
2015-11-23 11:56:46 UTC
Post by Julien Fischer
Hi Mark,
Post by Mark Brown
Post by Julien Fischer
float: We can at least say that < and > never contradict the
mathematical ordering, even if = can due to lack of precision. Can we
assume implementations may use +/- infinity?
If we are assuming IEEE 754 floats, yes. For all of our current
backends that is definitely yes.
For ints and floats I've decided to keep it conservative, and just say
that implementations are allowed to be inconsistent if literals are
out-of-range.
I think that's too conservative. The implementation should just reject
int literals that will overflow (i.e. what it does now). For systems
with IEEE floats we should replace overflows with +/- infinity. What to
do on systems that don't provide IEEE fp is a more open question, but we
don't currently have any of those.
Okay. Note that I have used "should", and not "may" or "must". Just
checking that this is what you meant.
Post by Julien Fischer
Ignoring the matter of NaNs for a moment, the order for floats should be
specified as it is in, for example, Java and C#:
-infinity < -max < ... < -min < -0.0 == +0.0 < +min < ... < +max < +infinity
Done.
Post by Julien Fischer
Not-a-number (NaN) values complicate matters. The current implementation of
builtin.compare/3 is broken; for example, the following evaluates to true:
NaN = 0.0 * infinity,
compare((=), NaN, 0.0)
but:
NaN = 0.0 * infinity,
unify(NaN, 0.0)
evaluates to false.
The IEEE 754-2008 standard (section 5.10) defines a total ordering on float
values, but I don't think that will work for us. Java's Double.compareTo
method treats NaN values as equal to themselves and greater than all other
float values (including +infinity), mainly so that you can use them in data
structures. IIRC, C# does something similar.
My inclination is that we should simply make builtin.compare/3 throw an
exception for NaN values on the basis that they are not ordered w.r.t other
float values (or indeed themselves).
Fine by me.
Post by Julien Fischer
For completeness, you should say something about strings here too (e.g.
what's currently in the comment at the head of the string module.)
Done.

Additional diff is attached. I'll commit the change soon if there are
no further comments.

Thanks,
Mark
Julien Fischer
2015-11-28 11:23:57 UTC
Hi Mark,
Post by Mark Brown
Post by Julien Fischer
Not-a-number (NaN) values complicate matters. The current implementation of
builtin.compare/3 is broken; for example, the following evaluates to true:
NaN = 0.0 * infinity,
compare((=), NaN, 0.0)
but:
NaN = 0.0 * infinity,
unify(NaN, 0.0)
evaluates to false.
The IEEE 754-2008 standard (section 5.10) defines a total ordering on float
values, but I don't think that will work for us. Java's Double.compareTo
method treats NaN values as equal to themselves and greater than all other
float values (including +infinity), mainly so that you can use them in data
structures. IIRC, C# does something similar.
My inclination is that we should simply make builtin.compare/3 throw an
exception for NaN values on the basis that they are not ordered w.r.t other
float values (or indeed themselves).
Fine by me.
The handling of NaN values in Mercury is spectacularly broken at the moment,
but that's another matter.
Post by Mark Brown
Post by Julien Fischer
For completeness, you should say something about strings here too (e.g.
what's currently in the comment at the head of the string module.)
Done.
Additional diff is attached. I'll commit the change soon if there are
no further comments.
...
Post by Mark Brown
diff --git a/doc/reference_manual.texi b/doc/reference_manual.texi
index 64a72e3..ba43b40 100644
--- a/doc/reference_manual.texi
+++ b/doc/reference_manual.texi
@@ -2543,20 +2543,32 @@ As such, the standard ordering for most types is not fully defined.
the standard ordering is the usual numerical ordering.
-Implementations are permitted to give inconsistent results
-for overflowing literals.
+Implementations should reject code containing overflowing integer literals.
the standard ordering approximates the usual numerical ordering.
then this relation holds in the numerical ordering,
-Implementations are permitted to give inconsistent results
-for overflowing literals.
+In the standard ordering, ``negative'' and ``positive'' zero values are equal.
+Implementations should replace overflowing literals
+with the infinity of the same sign;
+in the standard ordering positive infinity is greater than all finite values
+and negative infinity is less than all finite values.
+Implementations must throw an exception when comparing
+a ``not a number'' (NaN) value.
the numerical ordering of the Unicode code point values.
+the standard ordering is implementation dependent.
+The current implementation performs string comparison using
+when compiling to C, Java and C#, respectively.
That list (and, for that matter, the library module) omits the Erlang backend.

It's probably also worth noting that the implementation is not required to
normalize strings before comparing them.
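
To see why normalization matters: the same visible string can have more than one code point sequence, and a plain code-point comparison treats those sequences as different strings. A Python sketch using the standard `unicodedata` module (Python compares `str` values code point by code point); it illustrates the general Unicode issue and says nothing about what any particular Mercury backend does:

```python
import unicodedata

composed = "\u00e9"     # "é" as a single code point
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT

# Without normalization the two spellings are different strings.
print(composed == decomposed)  # False

# After normalizing both to NFC they compare equal.
print(unicodedata.normalize("NFC", decomposed) == composed)  # True

# In code point order the decomposed form sorts first: U+0065 < U+00E9.
print(decomposed < composed)  # True
```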

Julien.
