Discussion:
[m-dev.] Interleaved output in hlc.gc.par
Paul Bone
2015-09-17 06:36:42 UTC
Permalink
While I was testing the boehm_gc changes in hlc.gc.par tl_backjump_test
failed as follows:

***@oxygen> diff -u tests/hard_coded/tl_backjump_test.{exp,out}
--- tests/hard_coded/tl_backjump_test.exp 2014-11-12 12:16:20.420852421 +1100
+++ tests/hard_coded/tl_backjump_test.out 2015-09-17 14:58:47.463777533 +1000
@@ -20,9 +20,8 @@
(TID: #2) solution 2, 2, 2
(TID: #2) label C = 3, (14)
(TID: #2) Solutions:
-(TID: #2) {1, 1, 3},
-(TID: #2) {2, 1, 1},
-(TID: #2) {2, 2, 2}
+(TID: #2) (TID: #3) label C = 3, (14)
+(TID: #2) (TID: #3) {2{, 21, , 1},
(TID: #3) label A = 1, (0)
(TID: #3) label B = 1, (1)
(TID: #3) label C = 1, (2)
@@ -43,8 +42,6 @@
(TID: #3) label C = 1, (12)
(TID: #3) label C = 2, (13)
(TID: #3) solution 2, 2, 2
-(TID: #3) label C = 3, (14)
(TID: #3) Solutions:
-(TID: #3) {1, 1, 3},
-(TID: #3) {2, 1, 1},
+(TID: #3) {{11, , 11, , 33}},
(TID: #3) {2, 2, 2}

As this is a multithreaded test the test suite sorts output lines so that
interleaving does not affect the result of the test. However the output of the
threads is interleaved _within_ lines.

The test case uses io.format to print its results and I believe that the
compiler is optimising io.format into multiple calls to io.write_string (and
others) and these calls are being interleaved between the threads.

Should a user reasonably expect calls to io.format to be atomic with respect
to the IO state? I ask because this creates an inconsistency.

hello(!IO) :-
io.write_string("Hello ", !IO),
io.write_string("world!", !IO).

We generally don't expect hello/2 to be atomic with respect to IO. (THis is
the semantics of thread.spawn, not pararallel conjunctions) But by
expecting this of io.format we create a double standard. Is this okay?

If so we can update the io.format optimisation to lock and unlock the output
stream within the IO state to prevent interleaving like this.

Cheers.
--
Paul Bone
Peter Wang
2015-09-17 07:00:27 UTC
Permalink
Post by Paul Bone
Should a user reasonably expect calls to io.format to be atomic with respect
to the IO state? I ask because this creates an inconsistency.
I don't think so. It is the user's responsibility to serialise any
concurrent uses of the stream. It's not specific to io.format.
Two threads calling write_string may produce output with interleaved
characters, e.g. "Hweolrllod !"

Peter
Paul Bone
2015-09-17 07:01:59 UTC
Permalink
Post by Peter Wang
Post by Paul Bone
Should a user reasonably expect calls to io.format to be atomic with respect
to the IO state? I ask because this creates an inconsistency.
I don't think so. It is the user's responsibility to serialise any
concurrent uses of the stream. It's not specific to io.format.
Two threads calling write_string may produce output with interleaved
characters, e.g. "Hweolrllod !"
Oh,

1) How can users control this for cases when they need to?
2) Why are we writing character by character?!?!
--
Paul Bone
Zoltan Somogyi
2015-09-17 07:32:24 UTC
Permalink
Post by Paul Bone
Post by Peter Wang
Post by Paul Bone
Should a user reasonably expect calls to io.format to be atomic with respect
to the IO state? I ask because this creates an inconsistency.
I don't think so. It is the user's responsibility to serialise any
concurrent uses of the stream. It's not specific to io.format.
Two threads calling write_string may produce output with interleaved
characters, e.g. "Hweolrllod !"
Oh,
1) How can users control this for cases when they need to?
2) Why are we writing character by character?!?!
"We" are not writing anything. Due to this line in library/string.format.m:

#define ML_USE_SPRINTF MR_TRUE

calls to io.format are implemented using sprintf. All the threads use
the same buffer, so it is lucky even that you get all the characters you printed,
without a character printed by one thread overwriting a character printed
by another thread.

To control it, we (the implementors, not Mercury users) would need to surround
all the C code fragments we invoke that may affect stdio buffers with the *same*
lock. I believe we already surround such pieces of code with locks, but these locks
guard only against two threads executing the same *piece of code*, not two
different pieces of code modifying the same *piece of data*.

This would impose serialization of I/O even in cases where different threads
write to different files, which itself is undesirable. The only good solution
I can think of is to attach the lock to a Mercury-controlled handle for the
file pointer.

Zoltan.

Zoltan.
Paul Bone
2015-09-17 08:09:18 UTC
Permalink
Post by Zoltan Somogyi
Post by Paul Bone
Post by Peter Wang
Post by Paul Bone
Should a user reasonably expect calls to io.format to be atomic with respect
to the IO state? I ask because this creates an inconsistency.
I don't think so. It is the user's responsibility to serialise any
concurrent uses of the stream. It's not specific to io.format.
Two threads calling write_string may produce output with interleaved
characters, e.g. "Hweolrllod !"
Oh,
1) How can users control this for cases when they need to?
2) Why are we writing character by character?!?!
#define ML_USE_SPRINTF MR_TRUE
calls to io.format are implemented using sprintf. All the threads use
the same buffer, so it is lucky even that you get all the characters you printed,
without a character printed by one thread overwriting a character printed
by another thread.
To control it, we (the implementors, not Mercury users) would need to surround
all the C code fragments we invoke that may affect stdio buffers with the *same*
lock. I believe we already surround such pieces of code with locks, but these locks
guard only against two threads executing the same *piece of code*, not two
different pieces of code modifying the same *piece of data*.
I don't know about sprintf, but I believe that printf and fprintf are thread
safe, or maybe that's just glibc..

I just checked runtime/mercury_string.c and MR_make_string which is where
the call to vsnprintf or vsprintf is done (called from string.format.m).
The buffer is either allocated on the stack or the heap (depending on it's
size requirements) and never statically allocated. The procedure is thread
safe and reentrant.

So this formats the string using a thread-safe buffer, then write_string
writes it out, this eventually calls vfprintf to write the string.
Ultimately the thread-safety of this depends on libc, and I believe that
with glibc this is thread safe (but obviously shouldn't be relied on).

Therefore I believe that the interleaving in the test case's output is due
to the optimisations applied to calls to format to break them up into
separate write operations.
Post by Zoltan Somogyi
This would impose serialization of I/O even in cases where different threads
write to different files, which itself is undesirable. The only good solution
I can think of is to attach the lock to a Mercury-controlled handle for the
file pointer.
If necessary that would be best.
--
Paul Bone
Zoltan Somogyi
2015-09-17 09:49:21 UTC
Permalink
Post by Paul Bone
I don't know about sprintf, but I believe that printf and fprintf are thread
safe, or maybe that's just glibc..
That is not something you want to just "believe".
Post by Paul Bone
I just checked runtime/mercury_string.c and MR_make_string which is where
the call to vsnprintf or vsprintf is done (called from string.format.m).
The buffer is either allocated on the stack or the heap (depending on it's
size requirements) and never statically allocated. The procedure is thread
safe and reentrant.
So this formats the string using a thread-safe buffer, then write_string
writes it out, this eventually calls vfprintf to write the string.
That is NOT the buffer I am talking about. I am talking about the buffer
that stdio associates with every file descriptor that is either opened
by stdio itself using fopen or its relatives, or is one of stdin/out/err,
for which stdio creates a buffer at startup.
Post by Paul Bone
Ultimately the thread-safety of this depends on libc, and I believe that
with glibc this is thread safe (but obviously shouldn't be relied on).
A brief google search told me that this is VERY OS-dependent.

Zoltan.
Peter Wang
2015-09-17 07:55:47 UTC
Permalink
Post by Paul Bone
Post by Peter Wang
Post by Paul Bone
Should a user reasonably expect calls to io.format to be atomic with respect
to the IO state? I ask because this creates an inconsistency.
I don't think so. It is the user's responsibility to serialise any
concurrent uses of the stream. It's not specific to io.format.
Two threads calling write_string may produce output with interleaved
characters, e.g. "Hweolrllod !"
Oh,
1) How can users control this for cases when they need to?
Use one of the synchronisation primitives, e.g. thread.semaphore
Post by Paul Bone
2) Why are we writing character by character?!?!
We call vfprintf in C, and we can do because there happens to be no
difference between the internal string representation and the external
representation (for now). Apparently POSIX requires that by default
stream operations are atomic so, on some platforms, for some operations,
we would get that atomicity "for free".

It was a slightly bad example, but if we look at the C# implementation
it can write character by character to perform newline translation.

Peter
Paul Bone
2015-09-17 08:14:20 UTC
Permalink
Post by Peter Wang
Post by Paul Bone
Post by Peter Wang
Post by Paul Bone
Should a user reasonably expect calls to io.format to be atomic with respect
to the IO state? I ask because this creates an inconsistency.
I don't think so. It is the user's responsibility to serialise any
concurrent uses of the stream. It's not specific to io.format.
Two threads calling write_string may produce output with interleaved
characters, e.g. "Hweolrllod !"
Oh,
1) How can users control this for cases when they need to?
Use one of the synchronisation primitives, e.g. thread.semaphore
Post by Paul Bone
2) Why are we writing character by character?!?!
We call vfprintf in C, and we can do because there happens to be no
difference between the internal string representation and the external
representation (for now). Apparently POSIX requires that by default
stream operations are atomic so, on some platforms, for some operations,
we would get that atomicity "for free".
Oh cool, I didn't get this from reading the man pages. Although my search
of the pthreads man-page was rather quick.
Post by Peter Wang
It was a slightly bad example, but if we look at the C# implementation
it can write character by character to perform newline translation.
I don't mind character by character when you write into a buffer. For a
moment I was worried that the calls to the c library where character by
character (I know libc does buffering, but still).

Anyway, now that I've checked Mercury is not writing character by character.
So how can you get interleaved output such as "Hweolrllod !"?
--
Paul Bone
Julien Fischer
2015-09-17 13:53:04 UTC
Permalink
Post by Paul Bone
While I was testing the boehm_gc changes in hlc.gc.par tl_backjump_test
--- tests/hard_coded/tl_backjump_test.exp 2014-11-12 12:16:20.420852421 +1100
+++ tests/hard_coded/tl_backjump_test.out 2015-09-17 14:58:47.463777533 +1000
@@ -20,9 +20,8 @@
(TID: #2) solution 2, 2, 2
(TID: #2) label C = 3, (14)
-(TID: #2) {1, 1, 3},
-(TID: #2) {2, 1, 1},
-(TID: #2) {2, 2, 2}
+(TID: #2) (TID: #3) label C = 3, (14)
+(TID: #2) (TID: #3) {2{, 21, , 1},
(TID: #3) label A = 1, (0)
(TID: #3) label B = 1, (1)
(TID: #3) label C = 1, (2)
@@ -43,8 +42,6 @@
(TID: #3) label C = 1, (12)
(TID: #3) label C = 2, (13)
(TID: #3) solution 2, 2, 2
-(TID: #3) label C = 3, (14)
-(TID: #3) {1, 1, 3},
-(TID: #3) {2, 1, 1},
+(TID: #3) {{11, , 11, , 33}},
(TID: #3) {2, 2, 2}
As this is a multithreaded test the test suite sorts output lines so that
interleaving does not affect the result of the test. However the output of the
threads is interleaved _within_ lines.
The reasons for this have been discussed in the rest of this thread.
If you've ever tried at the the concurrency examples in the hlc.par.gc
grade you'll notice that they (sometimes) do the same thing.

I think the best thing to do here would be to re-write the test so
run_problem and its descendents don't write anything to I/O streams but
instead build up a list of strings to print inside io_mutvars that are
passed to each thread and then extract and print those lists of strings
*after* all the worker threads have finished running.

Julien.

Loading...