Discussion:
[m-dev.] proposal: remove hlc_nest and hl_nest grades
Julien Fischer
2017-08-31 03:25:57 UTC
Hi,

Are there any objections to removing support for the hlc_nest* and
hl_nest* grades? (i.e. high-level C with GCC nested functions).

While they were (presumably) useful in the past for evaluating various
design decisions in the MLDS backend, they haven't been used (or even
documented) for quite a while now.

Additionally, GCC's support for nested functions was a bit flakey last I
looked, so I can't really see anyone ever wanting to use these grades in
anger.

Julien.
Zoltan Somogyi
2017-08-31 03:48:06 UTC
Post by Julien Fischer
Are there any objections to removing support for the hlc_nest* and
hl_nest* grades? (i.e. high-level C with GCC nested functions).
While they were (presumably) useful in the past for evaluating various
design decisions in the MLDS backend, they haven't been used (or even
documented) for quite a while now.
Additionally, GCC's support for nested functions was a bit flakey last I
looked, so I can't really see anyone ever wanting to use these grades in
anger.
I haven't used them in anger at all; when I did use them, it was just to check
they still worked (a long time ago).

However, if they do work right now, it would be best to see how fast they are
compared to their non-nested versions. IF they are faster, and IF there is still
some prospect of gcc resolving the flakiness (about which I know nothing),
I would prefer not to delete them just yet. Otherwise ...

By the way, does anyone have a machine that is good for benchmarking?
I would do the test above, but my laptop does thermal throttling in a way
that is bad for timing repeatability. I have seen two successive runs of
the exact same task (in tools/speedtest) return times as different as 20s and 30s.
This means that to get any information, I need a lot of runs (to average out
the variability), and even then, the results I get are not very precise.
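
The run-to-run variability described above can be quantified before trusting any comparison. The following is a minimal sketch in C, not part of tools/speedtest; the names `timing_summary` and `summarize_timings` are hypothetical, and the caller is assumed to have already collected wall-clock times in seconds.

```c
#include <assert.h>
#include <stddef.h>

/* Summary of repeated timings of the same task (all values in seconds,
 * except rel_spread, which is a ratio). Hypothetical helper type. */
struct timing_summary {
    double mean;
    double min;
    double max;
    double rel_spread;  /* (max - min) / mean */
};

/* Compute mean, min, max and relative spread over n timings. */
struct timing_summary summarize_timings(const double *secs, size_t n)
{
    assert(secs != NULL && n > 0);
    struct timing_summary s = { 0.0, secs[0], secs[0], 0.0 };
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += secs[i];
        if (secs[i] < s.min) s.min = secs[i];
        if (secs[i] > s.max) s.max = secs[i];
    }
    s.mean = sum / (double)n;
    /* Guard against division by zero if every timing is 0. */
    s.rel_spread = (s.mean > 0.0) ? (s.max - s.min) / s.mean : 0.0;
    return s;
}
```

For the two runs mentioned above (20s and 30s), this gives a mean of 25s and a relative spread of 0.4. A spread that large would swamp any difference of a few percent between grades, which is why many runs are needed.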

Do the Java and C# backends use nested functions? If not, and if we *do* delete
the hlc_nest grades, then we should simplify the MLDS backend. At the moment,
we always generate MLDS code with nested functions, which ml_elim_nested.m
then hoists out. Having the code generator generate the functions in their non-nested
form directly would be simpler and faster, partly because it wouldn't have to cater
to ml_elim_nested.m's limitations.
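
For readers unfamiliar with the transformation, here is a minimal C sketch of what hoisting a nested function involves in general. It illustrates the technique only and is not the code that ml_elim_nested.m actually emits; the names `sum_env`, `add_hoisted` and `sum_scaled` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * With GCC's nested-function extension (not standard C), a helper can
 * read and write the enclosing function's locals directly:
 *
 *   int sum_scaled(const int *xs, size_t n, int scale) {
 *       int total = 0;
 *       void add(int x) { total += x * scale; }   // nested function
 *       for (size_t i = 0; i < n; i++) add(xs[i]);
 *       return total;
 *   }
 *
 * The flattened form below is portable C: the locals the helper needs
 * move into an explicit environment struct, and the helper becomes a
 * top-level function that takes a pointer to that struct.
 */

/* Hypothetical environment holding the captured locals. */
struct sum_env {
    int total;
    int scale;
};

/* The hoisted helper: formerly nested inside sum_scaled. */
static void add_hoisted(struct sum_env *env, int x)
{
    env->total += x * env->scale;
}

int sum_scaled(const int *xs, size_t n, int scale)
{
    assert(xs != NULL || n == 0);
    struct sum_env env = { .total = 0, .scale = scale };
    for (size_t i = 0; i < n; i++) {
        add_hoisted(&env, xs[i]);
    }
    return env.total;
}
```

Generating the second form directly means the code generator itself has to decide which locals live in the environment, rather than leaving that to a later pass.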

Zoltan.
Julien Fischer
2017-08-31 04:13:25 UTC
Hi Zoltan,
Post by Zoltan Somogyi
Post by Julien Fischer
Are there any objections to removing support for the hlc_nest* and
hl_nest* grades? (i.e. high-level C with GCC nested functions).
While they were (presumably) useful in the past for evaluating various
design decisions in the MLDS backend, they haven't been used (or even
documented) for quite a while now.
Additionally, GCC's support for nested functions was a bit flakey last I
looked, so I can't really see anyone ever wanting to use these grades in
anger.
I haven't used them in anger at all; when I did use them, it was just to check
they still worked (a long time ago).
However, if they do work right now, it would be best to see how fast they are
compared to their non-nested versions. IF they are faster, and IF there is still
some prospect of gcc resolving the flakiness (about which I know nothing),
Back (at uni) we used to test them regularly; there was a series of
problems with those grades and the early GCC 4.x releases. (I have no
idea what the current state is ...)
Post by Zoltan Somogyi
I would prefer not to delete them just yet. Otherwise ...
By the way, does anyone have a machine that is good for benchmarking?
I'll discuss this with you off-list.
Post by Zoltan Somogyi
I would do the test above, but my laptop does thermal throttling in a way
that is bad for timing repeatability. I have seen two successive runs of
the exact same task (in tools/speedtest) return times as different as 20s and 30s.
This means that to get any information, I need a lot of runs (to average out
the variability), and even then, the results I get are not very precise.
Do the Java and C# backends use nested functions?
No. (The internal option that enables nested functions is named
gcc_nested_functions with good reason.)
Post by Zoltan Somogyi
If not, and if we *do* delete the hlc_nest grades, then we should
simplify the MLDS backend. At the moment, we always generate MLDS code
with nested functions, which ml_elim_nested.m then hoists out. Having
the code generator generate the functions in their non-nested form
directly would be simpler and faster, partly because it wouldn't have
to cater to ml_elim_nested.m's limitations.
That's one of the reasons I suggested removing them. In fact, IIRC
ml_elim_nested.m is actually a bit of a performance bottleneck on
large programs.

Julien.
Zoltan Somogyi
2017-11-14 06:11:32 UTC
Post by Julien Fischer
Post by Zoltan Somogyi
If not, and if we *do* delete the hlc_nest grades, then we should
simplify the MLDS backend. At the moment, we always generate MLDS code
with nested functions, which ml_elim_nested.m then hoists out. Having
the code generator generate the functions in their non-nested form
directly would be simpler and faster, partly because it wouldn't have
to cater to ml_elim_nested.m's limitations.
That's one of the reasons I suggested removing them. In fact, IIRC
ml_elim_nested.m is actually a bit of a performance bottleneck on
large programs.
I have been thinking about generating flattened MLDS code directly, not
via ml_elim_nested.m. It looks to be far from easy, and I just measured
ml_elim_nested as taking about 1% of the compiler's runtime on tools/speedtest.
At that rate, that change does not seem to be all that worthwhile. Do you,
or anyone else, have some test cases for which the compiler takes substantially
longer in ml_elim_nested.m?

Zoltan.
Julien Fischer
2017-11-14 06:29:04 UTC
Hi Zoltan,
Post by Zoltan Somogyi
Post by Julien Fischer
Post by Zoltan Somogyi
If not, and if we *do* delete the hlc_nest grades, then we should
simplify the MLDS backend. At the moment, we always generate MLDS code
with nested functions, which ml_elim_nested.m then hoists out. Having
the code generator generate the functions in their non-nested form
directly would be simpler and faster, partly because it wouldn't have
to cater to ml_elim_nested.m's limitations.
That's one of the reasons I suggested removing them. In fact, IIRC
ml_elim_nested.m is actually a bit of a performance bottleneck on
large programs.
I have been thinking about generating flattened MLDS code directly, not
via ml_elim_nested.m. It looks to be far from easy, and I just measured
ml_elim_nested as taking about 1% of the compiler's runtime on tools/speedtest.
At that rate, that change does not seem to be all that worthwhile. Do you,
or anyone else, have some test cases for which the compiler takes substantially
longer in ml_elim_nested.m?
Not currently from the looks of it. What I was referring to above was
based on the compilation times of code generated by the Zinc compiler a
number of years ago. The current version of the Zinc compiler now
generates different code and doesn't appear to have any issues with the
elimination of nested functions. (I will have a dig about and see if I
can find something, but don't hold your breath on that ...)

Julien.
Zoltan Somogyi
2017-11-14 06:44:26 UTC
Post by Julien Fischer
Hi Zoltan,
Post by Zoltan Somogyi
Post by Julien Fischer
Post by Zoltan Somogyi
If not, and if we *do* delete the hlc_nest grades, then we should
simplify the MLDS backend. At the moment, we always generate MLDS code
with nested functions, which ml_elim_nested.m then hoists out. Having
the code generator generate the functions in their non-nested form
directly would be simpler and faster, partly because it wouldn't have
to cater to ml_elim_nested.m's limitations.
That's one of the reasons I suggested removing them. In fact, IIRC
ml_elim_nested.m is actually a bit of a performance bottleneck on
large programs.
I have been thinking about generating flattened MLDS code directly, not
via ml_elim_nested.m. It looks to be far from easy, and I just measured
ml_elim_nested as taking about 1% of the compiler's runtime on tools/speedtest.
At that rate, that change does not seem to be all that worthwhile. Do you,
or anyone else, have some test cases for which the compiler takes substantially
longer in ml_elim_nested.m?
Not currently from the looks of it. What I was referring to above was
based on the compilation times of code generated by the Zinc compiler a
number of years ago.
Yes, I remember that. I made a significant number of changes to eliminate
all the bottlenecks that made the compiler so slow in compiling Mercury code
generated from Zinc.
Post by Julien Fischer
The current version of the Zinc compiler now
generates different code and doesn't appear to have any issues with the
elimination of nested functions.
Given all those changes about 8-10 years ago, the former is not *necessarily*
the cause of the latter :-)

Zoltan.

Zoltan Somogyi
2017-09-06 12:54:04 UTC
Post by Zoltan Somogyi
However, if they do work right now, it would be best to see how fast they are
compared to their non-nested versions. IF they are faster, and IF there is still
some prospect of gcc resolving the flakiness (about which I know nothing),
I would prefer not to delete them just yet. Otherwise ...
I have just checked on the hlc_nest.gc grade.

It does work; it fails only the test cases hlc.gc itself fails (the tailrec warning
cases). On my laptop, it is about 0.9% slower than hlc.gc; on a machine
in the cloud Julien gave me access to, it is 1.7% slower. The cloud machine
gave far more consistent individual times (about 1 second difference
between the slowest and fastest hlc_nest.gc times, vs 5+ seconds),
so the latter is very likely closer to the truth.

On that basis, I have no objection to removing the "nest" grade component.

Julien, do you want to do this, or should I?
Post by Zoltan Somogyi
Having the code generator generate the functions in their non-nested
form directly would be simpler and faster, partly because it wouldn't have to cater
to ml_elim_nested.m's limitations.
Again: Julien, do you want to do this, or should I?

Zoltan.
Julien Fischer
2017-09-06 13:49:17 UTC
Hi Zoltan,
Post by Zoltan Somogyi
Post by Zoltan Somogyi
However, if they do work right now, it would be best to see how fast they are
compared to their non-nested versions. IF they are faster, and IF there is still
some prospect of gcc resolving the flakiness (about which I know nothing),
I would prefer not to delete them just yet. Otherwise ...
I have just checked on the hlc_nest.gc grade.
It does work; it fails only the test cases hlc.gc itself fails (the tailrec warning
cases). On my laptop, it is about 0.9% slower than hlc.gc; on a machine
in the cloud Julien gave me access to, it is 1.7% slower. The cloud machine
gave far more consistent individual times (about 1 second difference
between the slowest and fastest hlc_nest.gc times, vs 5+ seconds),
so the latter is very likely closer to the truth.
On that basis, I have no objection to removing the "nest" grade component.
Julien, do you want to do this, or should I?
Post by Zoltan Somogyi
Having the code generator generate the functions in their non-nested
form directly would be simpler and faster, partly because it wouldn't have to cater
to ml_elim_nested.m's limitations.
Again: Julien, do you want to do this, or should I?
If you want to go ahead and do both, feel free. Given the bits of the
compiler you've been working on recently you are almost certainly better
placed to make the second change than I am anyway.

Julien.
Zoltan Somogyi
2017-09-06 13:51:59 UTC
Post by Julien Fischer
Post by Zoltan Somogyi
On that basis, I have no objection to removing the "nest" grade component.
Julien, do you want to do this, or should I?
Post by Zoltan Somogyi
Having the code generator generate the functions in their non-nested
form directly would be simpler and faster, partly because it wouldn't have to cater
to ml_elim_nested.m's limitations.
Again: Julien, do you want to do this, or should I?
If you want to go ahead and do both, feel free.
Ok, I will go ahead and do both.

Zoltan.