john skaller writes:
Similar problems, but compiler optimizations are more likely to cause
problems because they can do so much more. CPU and cache management will
not eliminate “unnecessary” code or execute code speculatively.
Oh, but they do (execute code speculatively). In fact all modern
Intel CPUs do this.
Yes, unfortunate choice of words there
The guarantees provided by Intel CPUs when they reorder your code are far
stronger than the guarantees of the C/C++ standards, though.
It’s unlikely a compiler will do it because compilers can really
only schedule a single thread of control. All they do is try
to help the CPU do it.
It is not only likely, it is absolutely guaranteed that modern compilers
will both eliminate unnecessary code and execute code speculatively.
That is, a compiler will calculate values just in case they may be
needed. And that means they will reorder the code in such a way that
code that should not be reachable will still be executed. So code that
is written like:
    if (a == 0)
        x = b;

can well be rewritten as:

    b_type tmp = b;
    if (a == 0)
        x = tmp;

if the compiler’s analysis shows that this is likely to typically be
faster. (And it doesn’t break the guarantees of the language, of course.)
“Modern” compilers are still quite stupid, at least in part
because they’re compiling a language which appears to have
been designed to defeat optimisation (namely, C).
C does have features that make certain classes of optimization harder
(pointers, in particular). Some of those problems can be mitigated (e.g.
by using “restrict” in the places where the compiler needs that
information).
[People doing high performance numerical work still use Fortran …]
True, though maybe as much from tradition as from actual advantages.
But yes, classic Fortran has some restrictions that are useful for these
optimizations.
A correct implementation of POSIX mutexes does work, since they are
specified to do all the right things (mutual exclusion and full memory
barriers).
I actually checked the specs and saw no mention of a memory barrier;
do you happen to have a link?
“synchronize thread execution and also synchronize memory with respect
to other threads.”
In practice, that just means that compilers have to recognize the
threading code and disable any optimizations around that code that would
break it. And I expect that’s exactly what they do.
Certainly not for languages like C. They don’t “recognise” anything.
C compilers are extremely dumb. They can barely optimise
basic primitives like memset … and when they do they break
all sorts of security code (clearing passwords out of memory …)
Yes, even not-very-modern C compilers recognize common patterns and
optimize them specifically. Because that is really effective.
For important constructs that are known to be broken unless the compiler
helps out, you can be sure the compiler authors will detect those cases
and protect them where necessary.
memset is also a library function and not a basic primitive. And as you
say, compilers do recognize them and optimize them. Or how about this
little surprise: What do you think a simple printf call with a constant
string compiles to (using current compilers)?
Turns out the final binary doesn’t call printf at all. It is instead
replaced by a call to puts.
I discovered this when my breakpoint on printf never triggered
Though, on second thoughts, I expect what actually happens with the
threading functions is that compiler-specific barriers have been added
to the code to ensure the tested compilers are unable to optimize the
code to the point of breaking it.
Of course, the situation is much better with C++11, which does provide
the necessary primitives (and memory model) to write multi-threaded code.
The situation is better with C++ because it provides much higher
level constructs and a stronger type system, as well as specifically
supporting threads. In addition the design is deliberate and modern
(although it still has to work in a framework which is poorly structured).
C++11 in particular. Older versions do not have language support for
threads, and so need to deal with the problems of libraries supporting
threads outside the language.
eirik> On 04/03/2015, at 2:44 AM, Eirik Byrkjeflot Anonsen wrote: