The C# Memory Model in Theory and Practice - .NET

17.4.3 Volatile fields
When a field-declaration includes a volatile modifier, the fields introduced by that declaration are
volatile fields. For non-volatile fields, optimization techniques that reorder instructions can lead to
unexpected and unpredictable results in multi-threaded programs that access fields without synchronization
such as that provided by the lock-statement (§15.12). These optimizations can be performed by the compiler,
by the runtime system, or by hardware. For volatile fields, such reordering optimizations are restricted:

• A read of a volatile field is called a volatile read. A volatile read has “acquire semantics”; that is, it is
guaranteed to occur prior to any references to memory that occur after it in the instruction sequence.

• A write of a volatile field is called a volatile write. A volatile write has “release semantics”; that is, it is
guaranteed to happen after any memory references prior to the write instruction in the instruction
sequence.
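A minimal sketch of what the two bullets above mean in practice. It is not taken from the specification; the `Mailbox` class and its member names are hypothetical. The volatile write in `Send` acts as a release and the volatile read in `TryReceive` acts as an acquire, so a reader that sees the flag set should also see the data written before it.

```csharp
using System;
using System.Threading;

// Hypothetical example class: publishes a value through a volatile flag.
public class Mailbox
{
    private int _data;             // ordinary (non-volatile) field
    private volatile bool _ready;  // volatile flag used for publication

    public void Send(int value)
    {
        _data = value;  // ordinary write
        _ready = true;  // volatile write (release): the write to _data may not move after it
    }

    public bool TryReceive(out int value)
    {
        if (_ready)     // volatile read (acquire): the read of _data may not move before it
        {
            value = _data;
            return true;
        }
        value = 0;
        return false;
    }
}
```

If `_ready` were not volatile, the two writes in `Send` (or the two reads in `TryReceive`) could in principle be reordered, and a reader could see the flag set while `_data` is still stale.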

In C#, VolatileRead() and VolatileWrite() are implemented like this:



        [MethodImplAttribute(MethodImplOptions.NoInlining)] // disable optimizations
        public static int VolatileRead(ref int address)
        { 
            int ret = address;
            MemoryBarrier(); // Call MemoryBarrier to ensure the proper semantic in a portable way. 
            return ret; 
        }

        [MethodImplAttribute(MethodImplOptions.NoInlining)] // disable optimizations
        public static void VolatileWrite(ref int address, int value)
        {
            MemoryBarrier(); // Call MemoryBarrier to ensure the proper semantic in a portable way. 
            address = value;
        } 
 
        [MethodImplAttribute(MethodImplOptions.NoInlining)] // disable optimizations
        public static void VolatileWrite(ref long address, long value) 
        {
            MemoryBarrier(); // Call MemoryBarrier to ensure the proper semantic in a portable way.
            address = value;
        }

In other words, either the specification lies or Microsoft decided not to follow it, from which one can conclude that an article written from the specification is useless.

Hello, drol, you wrote:

[]

Generalities. Show me the place in the spec that says calling code can obtain a reference to an object that has not been fully
constructed, without hacks such as passing this out.

That is exactly what the author is claiming. Reorderings and threads have nothing to do with it at all.

Here is the example once more, for clarity:

To further emphasize the point, consider this class:
class BoxedInt2
{
  public readonly int _value = 42;
  void PrintValue()
  {
    Console.WriteLine(_value);
  }
}

Now, it’s possible—at least in theory—that PrintValue will print “0” due to a memory-model issue. Here’s a usage example of BoxedInt that allows it:
class Tester
{
  BoxedInt2 _box = null;
  public void Set() {
    _box = new BoxedInt2();//*1
  }
  public void Print() {
    var b = _box;
    if (b != null) b.PrintValue();
  }
}

Because the BoxedInt instance was incorrectly published (through a non-volatile field, _box), the thread that calls Print may observe a partially constructed object! Again, making the _box field volatile would fix the issue.

At *1 you cannot obtain a reference to the object until the constructor has finished, and threads have absolutely nothing to do with it.
So the outcome must be either _box._value == 42 or _box == null.

Everything else is heresy.

Hello, Философ, you wrote:

Ф>i.e. this concerns the compiler itself, but in multithreading there is one more actor: the processor,

It concerns the entire execution system. Including the processor, of course.

Ф>There is nothing in the disassembly that would guarantee execution order in any way.

Of course there is. It is called the Intel 64\IA-32 architecture. Memory operations on it correspond almost completely to the definition of volatile fields in the C#\ECMA specification, and only very rarely need any additional scaffolding.

Hello, Константин Л., you wrote:

КЛ>That is exactly what the author is claiming. Reorderings and threads have nothing to do with it at all.

You simply don't know how to read.

The code under discussion belongs to the article's Thread Communication Patterns section, and its very first paragraph states what the section is about:

The purpose of a memory model is to enable thread communication. When one thread writes values to memory and another thread reads from memory, the memory model dictates what values the reading thread might see.

These aspects are then refined for specific patterns, in particular for Publication via Volatile Field:

has two methods, Init and Print; both may be called from multiple threads. If no memory operations are reordered, Print can only print “Not initialized” or “42,” but there are two possible cases when Print could print a “0”

The code under discussion is given as the limiting case of Publication via Volatile Field:

To further emphasize the point, consider this class:
class BoxedInt2
{
  public readonly int _value = 42;
...
}

КЛ>At *1 you cannot obtain a reference to the object until the constructor has finished, and threads have absolutely nothing to do with it.

The reference to the not-yet-constructed object is obtained not by the thread that called the constructor, but by some other thread that calls Print() while Set() is still running in the first one:

the thread that calls Print may observe a partially constructed object
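A minimal sketch of the fix the article proposes (making `_box` volatile). The class names follow the quoted example; the `Read` method is a hypothetical stand-in for `Print` that returns the value instead of writing it to the console, so the result can be checked.

```csharp
using System;

public class BoxedInt2
{
    public readonly int _value = 42;
}

public class Tester
{
    // volatile: the write in Set is a release, so the constructor's write to
    // _value may not be reordered after the publication of the reference;
    // the read in Read is an acquire.
    private volatile BoxedInt2 _box = null;

    public void Set()
    {
        _box = new BoxedInt2();
    }

    // Hypothetical variant of Print: returns null if nothing has been published yet.
    public int? Read()
    {
        var b = _box;
        return b != null ? b._value : (int?)null;
    }
}
```

With the volatile modifier, a thread calling `Read` concurrently with `Set` should see either null or 42, which is the guarantee the article says the non-volatile version lacks.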

Hello, drol, you wrote:

D>Hello, Философ, you wrote:

Ф>>i.e. this concerns the compiler itself, but in multithreading there is one more actor: the processor,

D>It concerns the entire execution system. Including the processor, of course.

Ф>>There is nothing in the disassembly that would guarantee execution order in any way.

D>Of course there is. It is called the Intel 64\IA-32 architecture. Memory operations on it correspond almost completely to the definition of volatile fields in the C#\ECMA specification, and only very rarely need any additional scaffolding.

Oh, come on....

Intel® 64 and IA-32 Architectures Software Developer’s Manual

http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html

8.2.1 Memory Ordering in the Intel® Pentium® and Intel486™ Processors

The Pentium and Intel486 processors follow the processor-ordered memory model; however, they operate as
strongly-ordered processors under most circumstances. Reads and writes always appear in programmed order at
the system bus—except for the following situation where processor ordering is exhibited. Read misses are
permitted to go ahead of buffered writes on the system bus when all the buffered writes are cache hits and, therefore,
are not directed to the same address being accessed by the read miss.
In the case of I/O operations, both reads and writes always appear in programmed order.
Software intended to operate correctly in processor-ordered processors (such as the Pentium 4, Intel Xeon, and P6
family processors) should not depend on the relatively strong ordering of the Pentium or Intel486 processors.
Instead, it should ensure that accesses to shared variables that are intended to control concurrent execution
among processors are explicitly required to obey program ordering through the use of appropriate locking or serializing
operations (see Section 8.2.5, “Strengthening or Weakening the Memory-Ordering Model”).

8.2.2 Memory Ordering in P6 and More Recent Processor Families

The Intel Core 2 Duo, Intel Atom, Intel Core Duo, Pentium 4, and P6 family processors also use a processor-ordered
memory-ordering model that can be further defined as “write ordered with store-buffer forwarding.” This model
can be characterized as follows.
In a single-processor system for memory regions defined as write-back cacheable, the memory-ordering model
respects the following principles (Note the memory-ordering principles for single-processor and multiple-processor
systems are written from the perspective of software executing on the processor, where the term
“processor” refers to a logical processor. For example, a physical processor supporting multiple cores and/or
HyperThreading Technology is treated as a multi-processor systems.):
• Reads are not reordered with other reads.
• Writes are not reordered with older reads.
• Writes to memory are not reordered with other writes, with the following exceptions:
— writes executed with the CLFLUSH instruction;
— streaming stores (writes) executed with the non-temporal move instructions (MOVNTI, MOVNTQ,
MOVNTDQ, MOVNTPS, and MOVNTPD); and
— string operations (see Section 8.2.4.1).
• Reads may be reordered with older writes to different locations but not with older writes to the same location.
• Reads or writes cannot be reordered with I/O instructions, locked instructions, or serializing instructions.
• Reads cannot pass earlier LFENCE and MFENCE instructions.
• Writes cannot pass earlier LFENCE, SFENCE, and MFENCE instructions.
• LFENCE instructions cannot pass earlier reads.
• SFENCE instructions cannot pass earlier writes.
• MFENCE instructions cannot pass earlier reads or writes.
In a multiple-processor system, the following ordering principles apply:
• Individual processors use the same ordering principles as in a single-processor system.
• Writes by a single processor are observed in the same order by all processors.
• Writes from an individual processor are NOT ordered with respect to the writes from other processors.
• Memory ordering obeys causality (memory ordering respects transitive visibility).
• Any two stores are seen in a consistent order by processors other than those performing the stores
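The one reordering the list above allows on ordinary write-back memory ("Reads may be reordered with older writes to different locations") is the classic store-buffer case. A rough C# sketch, not from the manual, with hypothetical names (`StoreLoadLitmus`, `RunOnce`): each thread writes its own variable and then reads the other's. Without a full fence both threads may read 0; with `Thread.MemoryBarrier()` between the write and the read, that outcome should not occur.

```csharp
using System;
using System.Threading;

public static class StoreLoadLitmus
{
    private static int _x, _y, _r1, _r2;

    // Runs one round. Returns true if both threads read 0,
    // i.e. each read was effectively performed before the other thread's write.
    public static bool RunOnce(bool useFence)
    {
        _x = 0; _y = 0; _r1 = -1; _r2 = -1;
        using (var start = new Barrier(2))
        {
            var t1 = new Thread(() =>
            {
                start.SignalAndWait();                  // line both threads up
                Volatile.Write(ref _x, 1);              // store to x
                if (useFence) Thread.MemoryBarrier();   // full fence: the store may not pass the load
                _r1 = Volatile.Read(ref _y);            // load from y
            });
            var t2 = new Thread(() =>
            {
                start.SignalAndWait();
                Volatile.Write(ref _y, 1);              // store to y
                if (useFence) Thread.MemoryBarrier();
                _r2 = Volatile.Read(ref _x);            // load from x
            });
            t1.Start(); t2.Start();
            t1.Join(); t2.Join();                       // Join makes _r1 and _r2 visible here
        }
        return _r1 == 0 && _r2 == 0;
    }
}
```

Note that acquire/release alone does not forbid this reordering, which is one reason a full barrier is a stronger tool than a volatile access. Whether the unfenced variant actually shows the reordering on a given run depends on timing and hardware.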

11.4.4.3 Memory Ordering Instructions

SSE2 extensions introduce two new fence instructions (LFENCE and MFENCE) as companions to the SFENCE
instruction introduced with SSE extensions.
The LFENCE instruction establishes a memory fence for loads. It guarantees ordering between two loads and
prevents speculative loads from passing the load fence (that is, no speculative loads are allowed until all loads
specified before the load fence have been carried out).
The MFENCE instruction combines the functions of LFENCE and SFENCE by establishing a memory fence for both
loads and stores. It guarantees that all loads and stores specified before the fence are globally observable prior to
any loads or stores being carried out after the fence.

11.6.13 Cacheability Hint Instructions
....
The degree to which a consumer of data knows that the data is weakly ordered can vary for these cases. As a
result, the SFENCE or MFENCE instruction should be used to ensure ordering between routines that produce
weakly-ordered data and routines that consume the data. SFENCE and MFENCE provide a performance-efficient
way to ensure ordering by guaranteeing that every store instruction that precedes SFENCE/MFENCE in program
order is globally visible before a store instruction that follows the fence.

12.10.3 Streaming Load Hint Instruction

Streaming loads may be weakly ordered and may appear to software to execute out of order with respect to
other memory operations. Software must explicitly use fences (e.g. MFENCE) if it needs to preserve order
among streaming loads or between streaming loads and other memory operations.

MFENCE—Memory Fence

Processors are free to fetch and cache data speculatively from regions of system memory that use the WB, WC, and
WT memory types. This speculative fetching can occur at any time and is not tied to instruction execution. Thus, it
is not ordered with respect to executions of the MFENCE instruction; data can be brought into the caches speculatively
just before, during, or after the execution of an MFENCE instruction.
This instruction’s operation is the same in non-64-bit modes and 64-bit mode.

SFENCE—Store Fence

Description
Performs a serializing operation on all store-to-memory instructions that were issued prior the SFENCE instruction.
This serializing operation guarantees that every store instruction that precedes the SFENCE instruction in program
order becomes globally visible before any store instruction that follows the SFENCE instruction. The SFENCE
instruction is ordered with respect to store instructions, other SFENCE instructions, any LFENCE and MFENCE
instructions, and any serializing instructions (such as the CPUID instruction). It is not ordered with respect to load
instructions.

Hello, drol, you wrote:

D>Hello, Константин Л., you wrote:

КЛ>>That is exactly what the author is claiming. Reorderings and threads have nothing to do with it at all.

D>You simply don't know how to read.

[rudeness skipped]

D>The reference to the not-yet-constructed object is obtained not by the thread that called the constructor, but by some other thread that calls Print() while Set() is still running in the first one:

the thread that calls Print may observe a partially constructed object

Once again, I am asking for a reference to the place in the spec that says such a situation is possible.
Once again: under no standard circumstances can you obtain a reference to an object that has not been fully created.
new X() is synchronous and atomic from the caller's point of view, and that must not depend on any memory model.

Hello, Философ, you wrote:

Ф>Oh, come on....

The fragments you highlighted concern instructions such as MOVNT* and their "companions". So where are they in your assembly listing???

Hello, Константин Л., you wrote:

D>>The reference to the not-yet-constructed object is obtained not by the thread that called the constructor, but by some other thread that calls Print() while Set() is still running in the first one:

the thread that calls Print may observe a partially constructed object

КЛ>I am asking for a reference to the place in the spec that says such a situation is possible.

I can't help you there. The specification does not enumerate the possible ways the innards of an execution system might be built. It only defines the rules\requirements that any implementation must satisfy. And it says nothing about constructors. What it does say is the following:

Conforming implementations of the CLI are free to execute programs using any technology that guarantees, within a single thread of execution, that side-effects and exceptions generated by a thread are visible in the order specified by the CIL. For this purpose only volatile operations (including volatile reads) constitute visible side-effects.

...

An optimizing compiler from CIL to native code is permitted to reorder code, provided that it guarantees both the single-thread semantics described in §12.6 and the cross-thread semantics of volatile operations.

КЛ>Once again: under no standard circumstances can you obtain a reference to an object that has not been fully created.

That is not true.

КЛ>new X() is synchronous and atomic from the caller's point of view, and that must not depend on any memory model.

Also nonsense. "Synchronous" and "atomic" are concurrency concepts, and combining them with "from the caller's point of view" makes no sense in that respect.

Hello, drol, you wrote:

[]

The C# language spec is obliged to deal with this. The other specs don't concern me for now.

Hello, Константин Л., you wrote:

КЛ>The C# language spec is obliged to deal with this.

Well, if that is what you think, why don't I see the relevant quotations from that specification in your post?

Hello, Nikolay_Ch, you wrote:

And now a Russian version has come out (though I don't know when, so maybe this is old news): The C# Memory Model in Theory and Practice.

Hello, SergeyT., you wrote:

ST>And now a Russian version has come out (though I don't know when, so maybe this is old news): The C# Memory Model in Theory and Practice.
The translation is, as always, awful. The editor of the Russian edition is not earning their pay: letting through blunders such as paragraphs repeated twice is a disgrace.

"Although the abstract C# memory model is what you to keep in mind when writing new code" is a disgrace.
"how and why these transformations happen in practice, when we is going to examine in detail" is a disgrace.
In short, don't read Soviet newspapers. Even when the author is Russian, the English original is far more useful.

From:	Философ	http://vk.com/id10256428
Date:	10.12.12 07:02
Rating:	-1

	From:	Константин Л.
	Date:	10.12.12 13:12
	Rating:	-1

	From:	drol
	Date:	10.12.12 14:11
	Rating:

	From:	drol
	Date:	10.12.12 14:50
	Rating:

From:	Философ	http://vk.com/id10256428
Date:	10.12.12 14:57
Rating:	-1

From:	SergeyT.	http://sergeyteplyakov.blogspot.com/
Date:	22.01.13 02:49
Rating:

From:	Sinclair	https://github.com/evilguest/
Date:	24.01.13 15:17
Rating:	+4