[m-dev.] Determining which grades an installed Mercury compiler supports
Keri Harris
2018-07-18 16:28:48 UTC
Hi,

I've run into a couple of interesting issues in how Mercury determines
which grades it supports. For C grades, the compiler checks for the
existence of mercury/modules/$grade/mer_std.c, for non-C grades just the
mercury/modules/$grade directory.

The erlang grade can have problems with this approach since init files
can be installed into the mercury/modules/erlang directory. In
particular, a problem can arise when a 3rd-party library installs an
erlang init file, and then Mercury is later reinstalled without support
for the erlang grade; these residual 3rd-party init files can confuse
Mercury into believing that it still does support the erlang grade. The
following sequence of events leads to this:

* compile and install Mercury package such that grade erlang is installed
  - /usr/lib/mercury/modules/erlang/mer_std.init is created

* compile and install mercury_fixed library from mercury/extras/fixed
  - /usr/lib/mercury/modules/erlang/mercury_fixed.init is created

* recompile and install Mercury such that grade erlang is no longer
  installed
  - /usr/lib/mercury/modules/erlang/ still exists
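
The sequence above can be reproduced with a small simulation. This is only a sketch: the function names are hypothetical, the directory-only test stands in for the check described for non-C grades, and it is not the compiler's actual code.

```python
import tempfile
from pathlib import Path


def grade_looks_installed(libdir: Path, grade: str) -> bool:
    """Directory-only check, as described above for non-C grades."""
    return (libdir / "modules" / grade).is_dir()


def simulate_reinstall_without_erlang(libdir: Path) -> bool:
    erlang_dir = libdir / "modules" / "erlang"
    erlang_dir.mkdir(parents=True)
    # Step 1: Mercury installs the erlang standard library init file.
    (erlang_dir / "mer_std.init").touch()
    # Step 2: a third-party library installs its own init file.
    (erlang_dir / "mercury_fixed.init").touch()
    # Step 3: Mercury is reinstalled without erlang, so only its own
    # file is removed; the directory survives because it is not empty.
    (erlang_dir / "mer_std.init").unlink()
    return grade_looks_installed(libdir, "erlang")


with tempfile.TemporaryDirectory() as tmp:
    still_reported = simulate_reinstall_without_erlang(Path(tmp))
    print("erlang still reported:", still_reported)
```

The leftover mercury_fixed.init keeps the directory alive, so the directory-only check still reports the grade.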

Obviously this problem will only exist if the mercury std libraries and
3rd party libraries are installed into the same MERCURY_LIBDIR location.
Likewise, this problem will only exist if the erlang std libraries are
physically removed as part of the Mercury reinstall. However, both these
conditions will tend to be satisfied whenever Mercury is
installed/reinstalled via a package manager. For example, on Gentoo Linux:

% install Mercury with erlang grade
$ USE="erlang" emerge mercury
<snip>

$ mmc --output-libgrades
hlc.gc
erlang
reg.gc

% install libraries from mercury-extras
$ emerge mercury-extras
<snip>

$ ls /usr/lib/mercury/modules/erlang/
mercury_fixed.init mer_std.init

% reinstall Mercury _without_ erlang grade
$ USE="-erlang" emerge mercury
<snip>

$ mmc --output-libgrades
hlc.gc
erlang
reg.gc

$ ls /usr/lib/mercury/modules/erlang/
mercury_fixed.init

% attempt to reinstall mercury_fixed library from mercury-extras
$ emerge mercury-extras
<snip>
Installing grade erlang
Making Mercury/erlang/x86_64-pc-linux-gnu/Mercury/erls/fixed.erl
** Error: file
`Mercury/erlang/x86_64-pc-linux-gnu/Mercury/hrls/mercury__array.hrl' not
found: file `mercury__array.hrl' not found
gmake[2]: *** [Makefile:18: install] Error 1



The modules/java and modules/csharp directories seem immune to the above
problem as these grades don't appear to ever create init files. However,
even these grades can cause problems with package managers whenever
multiple packages share the same MERCURY_LIBDIR location. The
modules/java and modules/csharp directories will then be 'owned' by
multiple packages, and package managers will tend to delete empty
directories belonging to a package whenever that package is uninstalled.
Thus it's possible for the mercury compiler to lose track of which
grades it supports. For example on Gentoo Linux:

% install Mercury with java grade
$ USE="java" emerge mercury
<snip>
/usr/lib/mercury/modules/java/
<snip>

$ mmc --output-libgrades
hlc.gc
java
reg.gc

% install libraries from mercury-extras
$ emerge mercury-extras
<snip>
--- /usr/lib/mercury/modules/java/
<snip>

(Now both the 'mercury' and 'mercury-extras' packages own
/usr/lib/mercury/modules/java/)

% uninstall libraries from mercury-extras
$ emerge -C mercury-extras
<snip>
<<< /usr/lib/mercury/modules/java/
<snip>

$ mmc --output-libgrades
hlc.gc
reg.gc


I can work around the empty-directory problem easily enough by adding
zero-byte files into the empty modules/$grade directories when the
mercury compiler is installed. Then uninstalling any other package that
also owns the modules/$grade directories won't result in these
directories being deleted.


Strictly speaking, these issues are more related to bundling Mercury
with a package manager than being inherent to Mercury itself. But both of
these issues make me wonder if there is a more resilient way of
determining which grades the compiler supports.



Thanks

Keri
Zoltan Somogyi
2018-07-18 18:53:06 UTC
Post by Keri Harris
I've run into a couple of interesting issues in how Mercury determines
which grades it supports.
The Mercury compiler can generate code for any of its target languages
without any outside help, so I presume you are talking about how
it determines which grades have their libraries installed in the locations
in which it expects, i.e. how it figures what to report when given
--output-libgrades.
Post by Keri Harris
The erlang grade can have problems with this approach since init files
can be installed into the mercury/modules/erlang directory. In
particular, a problem can arise when a 3rd-party library installs an
erlang init file, and then Mercury is later reinstalled without support
for the erlang grade; these residual 3rd-party init files can confuse
Mercury into believing that it still does support the erlang grade. The
* compile and install Mercury package such that grade erlang is installed
- /usr/lib/mercury/modules/erlang/mer_std.init is created
* compile and install mercury_fixed library from mercury/extras/fixed
- /usr/lib/mercury/modules/erlang/mercury_fixed.init is created
* recompile and install Mercury such that grade erlang is no longer
installed
- /usr/lib/mercury/modules/erlang/ still exists
Why does the package manager allow the installation of something that is
*already* installed, without *uninstalling* first?

From what I can see, if one installation overwrites another, the results
are inherently fragile. Mercury installs are tested *only* in the context of a fresh,
clean install; using any other kind of install will work only by luck. One could
get around this by expending specific engineering effort to design (and later test)
an install scheme that could survive specific kinds of partial installs, but
in Mercury's case (and in most other cases) I don't think this would be
worthwhile.
Post by Keri Harris
I can work around the empty-directory problem easily enough by adding
zero-byte files into the empty modules/$grade directories when the
mercury compiler is installed. Then uninstalling any other package that
also owns the modules/$grade directories won't result in these
directories being deleted.
That would fix this particular instance of this problem, but not other instances.
I don't think anything would be guaranteed to work short of

- checking that every single file that is supposed to be installed is present,
and has the expected cryptographic hash, and
- checking that *only* the expected files are present in the install directories.
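
As a rough illustration of what those two checks would involve, here is a minimal sketch. The manifest format (relative path mapped to a SHA-256 digest) and the function names are assumptions made for the example; nothing like this exists in the Mercury install scripts.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def verify_install(root: Path, manifest: dict) -> list:
    """Return a list of problems; an empty list means the tree matches."""
    problems = []
    present = {
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    }
    # Check 1: every expected file is present with the expected hash.
    for rel, expected in sorted(manifest.items()):
        if rel not in present:
            problems.append(f"missing: {rel}")
        elif sha256_of(root / rel) != expected:
            problems.append(f"modified: {rel}")
    # Check 2: only the expected files are present.
    for rel in sorted(present - set(manifest)):
        problems.append(f"unexpected: {rel}")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    grade_dir = root / "modules" / "erlang"
    grade_dir.mkdir(parents=True)
    (grade_dir / "mer_std.init").write_text("stdlib\n")
    manifest = {
        "modules/erlang/mer_std.init": sha256_of(grade_dir / "mer_std.init")
    }
    clean = verify_install(root, manifest)
    # A third-party library drops its own file into the same directory.
    (grade_dir / "mercury_fixed.init").write_text("extras\n")
    dirty = verify_install(root, manifest)
```

Note that the second check would flag every third-party library installed into the same directory, which is part of why a scheme this strict is probably not practical.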
Post by Keri Harris
Strictly speaking, these issues are more related to bundling Mercury
with package manager than being inherent to Mercury itself. But both of
these issues make me wonder if there is a more resilient way of
determining which grades the compiler supports.
We could bake into the compiler the list of the grades that were configured
to be installed at the time the compiler executable was itself created.
However, that would still be vulnerable to parts of the install directory
being overwritten later.

Zoltan.
Julien Fischer
2018-07-19 00:54:44 UTC
Post by Zoltan Somogyi
Post by Keri Harris
Strictly speaking, these issues are more related to bundling Mercury
with package manager than being inherent to Mercury itself. But both of
these issues make me wonder if there is a more resilient way of
determining which grades the compiler supports.
We could bake into the compiler the list of the grades that were configured
to be installed at the time the compiler executable was itself created.
However, that would still be vulnerable to parts of the install directory
being overwritten later.
I spent a fair amount of time *removing* such "baking in" around a
decade ago; I'm not particularly inclined to add it back. (The idea
back then was that we could have a tool that would compile and install
additional library grades on demand.)

Julien.
Zoltan Somogyi
2018-07-19 01:52:08 UTC
Post by Julien Fischer
Post by Zoltan Somogyi
We could bake into the compiler the list of the grades that were configured
to be installed at the time the compiler executable was itself created.
However, that would still be vulnerable to parts of the install directory
being overwritten later.
I spent a fair amount of time *removing* such "baking in" around a
decade ago; I'm not particularly inclined to add it back. (The idea
back then, was that we could have tool that would compile and install
additional library grades on demand.)
For the reason mentioned in my second sentence above, I am not
advocating for it either.

Zoltan.
Keri Harris
2018-07-19 07:00:07 UTC
Post by Zoltan Somogyi
Post by Keri Harris
I've run into a couple of interesting issues in how Mercury determines
which grades it supports.
The Mercury compiler can generate code for any of its target languages
without any outside help, so I presume you are talking about how
it determines which grades have their libraries installed in the locations
in which it expects, i.e. how it figures what to report when given
--output-libgrades.
Yes, that's correct.
Post by Zoltan Somogyi
Post by Keri Harris
The erlang grade can have problems with this approach since init files
can be installed into the mercury/modules/erlang directory. In
particular, a problem can arise when a 3rd-party library installs an
erlang init file, and then Mercury is later reinstalled without support
for the erlang grade; these residual 3rd-party init files can confuse
Mercury into believing that it still does support the erlang grade. The
* compile and install Mercury package such that grade erlang is installed
- /usr/lib/mercury/modules/erlang/mer_std.init is created
* compile and install mercury_fixed library from mercury/extras/fixed
- /usr/lib/mercury/modules/erlang/mercury_fixed.init is created
* recompile and install Mercury such that grade erlang is no longer
installed
- /usr/lib/mercury/modules/erlang/ still exists
Why does the package manager allow the installation of something that is
*already* installed, without *uninstalling* first?
From what I can see, if one installation overwrites another, the results
are inherently fragile. Mercury installs are tested *only* in the context of a fresh,
clean install; using any other kind of install will work only by luck. One could
get around this by expending specific engineering effort to design (and later test)
an install scheme that could survive specific kinds of partial installs, but
in Mercury's case (and in most other cases) I don't think this would be
worthwhile.
Package managers do uninstall old instances of packages whenever a
package is upgraded/reinstalled. Paradoxically, the uninstall phase
of an old package always occurs *after* the install phase of the new
package.

Gentoo Linux is a bit particular in that it is a source distribution, so
in addition to installing the package it must first compile the package.
Installing mercury on Gentoo Linux involves the following steps:

1. compile mercury in a sandbox. If anything during the compile attempts
to write anywhere outside the sandbox, the compile is aborted.

2. install mercury in a sandbox. The install uses DESTDIR so that files
are installed into a safe location still within the sandbox. If the
install attempts to write anywhere outside the sandbox, the install is
aborted.

(At this stage the old instance of mercury still resides on the live
filesystem and is still fully functional; the new instance of mercury
now resides at DESTDIR in the sandbox).

3. package manager copies mercury installation from sandbox to live
filesystem.

(At this stage the live filesystem contains a mix of both old and new
versions. Many files from the old instance will have been overwritten by
copies from the new instance; some cruft from the old instance may still
remain).

4. package manager deletes any stray files from old instance of package
that have not been overwritten by newer versions. All files within the
sandbox are then deleted.

(At this stage just the new instance of mercury exists on the live
filesystem; all files from the old instance have either been overwritten
in step #3 or deleted in step #4).

Package managers that are not source based (rpm, dpkg etc) essentially
just run steps #2, #3 and #4 where they extract files into a sandbox,
copy them to the live filesystem, and then clean up old files.

Installing the new instance of a package before uninstalling the old
instance is done for a number of reasons:

* there is (obviously) a gap between uninstalling a package and then
reinstalling that package. During this gap files from the package no
longer exist on the filesystem. If you are upgrading an important
package (e.g. some system package like coreutils) then uninstalling the
package before the new version is installed will likely leave the system
in a non-functional state.

* if steps #1 or #2 of a package upgrade fail, you still have the old
version of the package in a functional state. (This is particularly
important for source-based distributions since compile-time failures are
possible).

So in the case of upgrading mercury via a package manager (or possibly
just reinstalling mercury to change the supported std library grades)
you're never left with non-directory files from an old instance of
mercury. Thus an install of mercury using a package manager gives you
the same result as if you had performed a clean install. (And behind the
scenes you actually are performing a clean install - into a safe
location in a sandbox).

The problem with the erlang grade is simply that when mercury is
reinstalled without the erlang grade, the package manager may decide not
to delete the mercury/modules/erlang directory since other packages
(e.g. mercury-extras) have placed their own files into this directory.
And since the mere existence of mercury/modules/erlang currently implies
that the erlang std library is installed we can eventually get into
trouble when compiling Mercury code.
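
To make the ownership issue concrete, here is a toy model of steps #3 and #4. It is a sketch under simplifying assumptions: the live filesystem is a dict from file path to owning package, a directory is treated as existing only while some file remains under it, and the names `upgrade` and `existing_dirs` are invented for the example.

```python
def upgrade(live: dict, package: str, new_files: set) -> None:
    """Install new_files for package, then drop its stale old files."""
    old_files = {path for path, owner in live.items() if owner == package}
    # Step 3: copy the staged files from the sandbox to the live filesystem.
    for path in new_files:
        live[path] = package
    # Step 4: delete files from the old instance that were not overwritten.
    for path in old_files - new_files:
        del live[path]


def existing_dirs(live: dict) -> set:
    """In this model a directory exists while any file remains in it."""
    return {path.rsplit("/", 1)[0] for path in live}


live = {}
upgrade(live, "mercury", {"modules/erlang/mer_std.init",
                          "modules/hlc.gc/mer_std.init"})
upgrade(live, "mercury-extras", {"modules/erlang/mercury_fixed.init"})
# Reinstall mercury without the erlang grade.
upgrade(live, "mercury", {"modules/hlc.gc/mer_std.init"})

print(sorted(existing_dirs(live)))
```

After the last step mer_std.init for erlang is gone, but modules/erlang remains because mercury-extras still owns a file in it.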

I'm in agreement that Mercury shouldn't be expected to re-engineer
anything in order to follow some special install scheme. Handling cruft
from old installs is already part of the package manager's
responsibility. As an aside, the existing Mercury build system is very
good. Two strengths in particular stand out:

1. it uses sane defaults for things (CFLAGS, install locations etc)
2. so many things can be overridden via environment variables or
mmc/mmake arguments. (This is important since each OS or distribution
tends to have their own special policies that need to be followed).
Post by Zoltan Somogyi
Post by Keri Harris
I can work around the empty-directory problem easily enough by adding
zero-byte files into the empty modules/$grade directories when the
mercury compiler is installed. Then uninstalling any other package that
also owns the modules/$grade directories won't result in these
directories being deleted.
That would fix this particular instance of this problem, but not other instances.
I don't think anything would be guaranteed to work short of
- checking that every single file that is supposed to be installed is present,
and has the expected cryptographic hash, and
- checking that *only* the expected files are present in the install directories.
Package managers do keep track of which installed packages own which
files, and package managers prevent different packages from attempting
to install the same (non-directory) file. So upgrading or reinstalling
packages tends to go remarkably smoothly. Handling empty directories on
package uninstall is a corner case that some package managers have
trouble with.
Post by Zoltan Somogyi
Post by Keri Harris
Strictly speaking, these issues are more related to bundling Mercury
with package manager than being inherent to Mercury itself. But both of
these issues make me wonder if there is a more resilient way of
determining which grades the compiler supports.
We could bake into the compiler the list of the grades that were configured
to be installed at the time the compiler executable was itself created.
However, that would still be vulnerable to parts of the install directory
being overwritten later.
That would certainly work. Another option would be to follow the same
logic as that used for C grades - look for a uniquely identifiable file
belonging to the std library for each non-C grade.


Thanks

Keri
Julien Fischer
2018-07-19 07:04:59 UTC
Hi Keri,
Post by Zoltan Somogyi
Post by Keri Harris
Strictly speaking, these issues are more related to bundling Mercury
with package manager than being inherent to Mercury itself. But both of
these issues make me wonder if there is a more resilient way of
determining which grades the compiler supports.
We could bake into the compiler the list of the grades that were configured
to be installed at the time the compiler executable was itself created.
However, that would still be vulnerable to parts of the install directory
being overwritten later.
That would certainly work. Another option would be to follow the same logic
as that used for C grades - look for a uniquely identifiable file belonging
to the std library for each non-C grade.
Commit 697b677 does that for the erlang grade. I'll fix the
Java and C# grades shortly.

Julien.
Keri Harris
2018-07-19 07:26:13 UTC
Post by Julien Fischer
Hi Keri,
Post by Keri Harris
Strictly speaking, these issues are more related to bundling Mercury
with package manager than being inherent to Mercury itself. But both of
these issues make me wonder if there is a more resilient way of
determining which grades the compiler supports.
We could bake into the compiler the list of the grades that were configured
to be installed at the time the compiler executable was itself created.
However, that would still be vulnerable to parts of the install directory
being overwritten later.
That would certainly work. Another option would be to follow the same
logic as that used for C grades - look for a uniquely identifiable
file belonging to the std library for each non-C grade.
Commit 697b677 does that for the erlang grade. I'll fix the
Java and C# grades shortly.
Thanks for the quick fix for this.


Keri
Julien Fischer
2018-07-23 00:47:01 UTC
Post by Julien Fischer
Hi Keri,
Post by Keri Harris
Strictly speaking, these issues are more related to bundling Mercury
with package manager than being inherent to Mercury itself. But both of
these issues make me wonder if there is a more resilient way of
determining which grades the compiler supports.
We could bake into the compiler the list of the grades that were configured
to be installed at the time the compiler executable was itself created.
However, that would still be vulnerable to parts of the install directory
being overwritten later.
That would certainly work. Another option would be to follow the same
logic as that used for C grades - look for a uniquely identifiable file
belonging to the std library for each non-C grade.
Commit 697b677 does that for the erlang grade. I'll fix the
Java and C# grades shortly.
Commit 05c5e5a fixes it for the Java and C# backends.

Julien.

Julien Fischer
2018-07-19 00:39:41 UTC
Hi Keri,
Post by Keri Harris
I've run into a couple of interesting issues in how Mercury determines which
grades it supports. For C grades, the compiler checks for the existence of
mercury/modules/$grade/mer_std.c,
mer_std.init.
Post by Keri Harris
for non-C grades just the mercury/modules/$grade directory.
Yes.
Post by Keri Harris
The erlang grade can have problems with this approach since init files can be
installed into the mercury/modules/erlang directory. In particular, a problem
can arise when a 3rd-party library installs an erlang init file, and then
Mercury is later reinstalled without support for the erlang grade; these
residual 3rd-party init files can confuse Mercury into believing that it
still does support the erlang grade. The following sequence of events leads
...

The erlang backend used to not use .init files; since it does now, we can
just update the compiler to check for the presence of mer_std.init, as per
the C grades.
Post by Keri Harris
The modules/java and modules/csharp directories seem immune to the above
problem as these grades don't appear to ever create init files. However, even
these grades can cause problems with package managers whenever multiple
packages share the same MERCURY_LIBDIR location. The modules/java and
modules/csharp directories will then be 'owned' by multiple packages, and
package managers will tend to delete empty directories belonging to a package
whenever that package is uninstalled. Thus it's possible for the mercury
compiler to lose track of which grades it supports. For example on Gentoo
For the C# and Java grades, we should look for something more definite;
the presence of mercury/lib/csharp/mer_std.dll and
mercury/lib/java/mer_std.jar for example.
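
A minimal sketch of that marker-file approach follows. The marker paths are the ones suggested in this message; the `MARKERS` table and the `installed_grades` function are hypothetical, and the checks that actually end up in the compiler may differ.

```python
import tempfile
from pathlib import Path

# One file per grade that only the standard library install creates.
# Paths are relative to the mercury/ library directory (an assumption).
MARKERS = {
    "erlang": "modules/erlang/mer_std.init",
    "csharp": "lib/csharp/mer_std.dll",
    "java": "lib/java/mer_std.jar",
}


def installed_grades(libdir: Path, grades: list) -> list:
    """Report a grade only if its standard library marker file exists."""
    return [g for g in grades if (libdir / MARKERS[g]).is_file()]


with tempfile.TemporaryDirectory() as tmp:
    libdir = Path(tmp)
    # A stray third-party init file left behind in the erlang directory.
    (libdir / "modules" / "erlang").mkdir(parents=True)
    (libdir / "modules" / "erlang" / "mercury_fixed.init").touch()
    # An empty csharp directory, and a real java standard library.
    (libdir / "lib" / "csharp").mkdir(parents=True)
    (libdir / "lib" / "java").mkdir(parents=True)
    (libdir / "lib" / "java" / "mer_std.jar").touch()
    found = installed_grades(libdir, ["erlang", "csharp", "java"])
    print(found)
```

With this check neither a leftover third-party file nor an empty directory is enough for a grade to be reported, and a deleted empty directory no longer hides a grade whose library is still present.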

Julien.