Thread: RE: Assuming that TAS() will succeed the first time is verboten

RE: Assuming that TAS() will succeed the first time is verboten

From

"Mikheev, Vadim"

Date:

29 December 2000, 18:20:11

> > Actually, some slocks are held
> > longer than others - probably we should use different delays...
> 
> I don't understand the last remark.  Are you proposing to mix some 
> random numbers into the delays?

No, use different s_spincycle-s for different slocks.

Vadim
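
[Illustration] One way to read the suggestion above is that each spinlock (or class of spinlocks) carries its own table of retry delays instead of sharing a single one. The sketch below is only an assumption about what that could look like: the type SpinSchedule, the function spin_delay_us(), and every delay value are hypothetical and are not taken from the PostgreSQL sources.

```c
#include <stddef.h>

/* Hypothetical: a per-lock backoff schedule, in microseconds. */
typedef struct {
    const char *name;       /* label for the lock class */
    const int  *delays_us;  /* delay to apply after the i-th failed attempt */
    size_t      ndelays;    /* number of entries in delays_us */
} SpinSchedule;

/* Made-up values: a lock held briefly is retried with no delay at first. */
static const int short_hold_delays[] = {0, 0, 0, 0, 1000, 1000, 10000};

/* Made-up values: a lock held for longer backs off almost immediately. */
static const int long_hold_delays[] = {0, 10000, 10000, 50000, 100000};

const SpinSchedule short_hold = {
    "short-hold", short_hold_delays,
    sizeof short_hold_delays / sizeof short_hold_delays[0]
};

const SpinSchedule long_hold = {
    "long-hold", long_hold_delays,
    sizeof long_hold_delays / sizeof long_hold_delays[0]
};

/*
 * Return the delay for a given failed attempt (0-based).  Attempts past
 * the end of the table reuse the last entry.
 */
int spin_delay_us(const SpinSchedule *s, size_t attempt)
{
    if (s->ndelays == 0)
        return 0;
    if (attempt >= s->ndelays)
        attempt = s->ndelays - 1;
    return s->delays_us[attempt];
}
```

A caller would pick the schedule that matches how long the lock is normally held and sleep for spin_delay_us(schedule, attempt) between failed TAS() calls.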

Re: Assuming that TAS() will succeed the first time is verboten

From

Tom Lane

Date:

08 January 2001, 17:00:32

One last followup on that bizarreness about shutdown's checkpoint
failing on Alpha platforms ---

After changing the checkpoint code to loop, rather than assuming TAS()
must succeed the first time, I noticed that it always looped exactly
once.  This didn't make sense to me at the time, but after querying some
Alpha experts at DEC^H^H^HCompaq, it does now.  If a new process's first
write to a shared memory page is a stq_c, that stq_c is guaranteed to
fail (at least on Tru64 Unix), because it will page fault.  The shared
memory page is inherited read-only and is converted to read-write on
first fault.  This doesn't seem really necessary, but I suppose it's
done to share code with the copy-on-write case for non-shared pages
that are inherited via fork().

It makes sense that the checkpoint process's first write to shared
memory would be stq_c, because after all it shouldn't be scribbling
on shared memory until it's got the spinlock, n'est-ce pas?

So a failure the first time through the TAS loop is entirely expected
for Alpha.  I wouldn't be surprised to see similar behavior on other
architectures, now that I see the first-write-from-a-process connection.

Bottom line is the same: always call TAS() in a retry loop.
        regards, tom lane
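
[Illustration] A minimal sketch of the "always retry" rule, written with C11 atomics rather than the real PostgreSQL TAS() macro. The names slock_t, tas(), acquire() and release() are stand-ins chosen for this example. atomic_compare_exchange_weak is used because the C standard allows it to fail spuriously, which is how compilers typically map it onto load-locked/store-conditional hardware; a caller therefore cannot treat a single failure as "lock is held".

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a spinlock word: 0 = free, 1 = held. */
typedef atomic_int slock_t;

/*
 * Test-and-set attempt.  Returns 0 if the lock was acquired, nonzero
 * otherwise.  A nonzero result may mean the lock is held, or merely
 * that the weak compare-exchange failed spuriously.
 */
int tas(slock_t *lock)
{
    int expected = 0;
    return !atomic_compare_exchange_weak(lock, &expected, 1);
}

/*
 * Acquire the lock by calling tas() in a loop.  Returns the number of
 * attempts used, or -1 if the lock was still not acquired after
 * max_attempts tries.
 */
int acquire(slock_t *lock, int max_attempts)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (tas(lock) == 0)
            return attempt;
    }
    return -1;
}

/* Release the lock by storing 0 back into the lock word. */
void release(slock_t *lock)
{
    atomic_store(lock, 0);
}
```

The bounded loop is only there to keep the sketch easy to test; real spinlock code would normally add a delay between attempts and decide separately when to report a stuck lock.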

Re: Assuming that TAS() will succeed the first time is verboten

From

Bruce Momjian

Date:

09 January 2001, 01:15:42

> One last followup on that bizarreness about shutdown's checkpoint
> failing on Alpha platforms ---
> 
> After changing the checkpoint code to loop, rather than assuming TAS()
> must succeed the first time, I noticed that it always looped exactly
> once.  This didn't make sense to me at the time, but after querying some
> Alpha experts at DEC^H^H^HCompaq, it does now.  If a new process's first
> write to a shared memory page is a stq_c, that stq_c is guaranteed to
> fail (at least on Tru64 Unix), because it will page fault.  The shared
> memory page is inherited read-only and is converted to read-write on
> first fault.  This doesn't seem really necessary, but I suppose it's
> done to share code with the copy-on-write case for non-shared pages
> that are inherited via fork().

This seems quite bizarre.  Why would the process fail on the write, and
not just pause and wait for the fault to bring in the page?  Doesn't the
CPU halt the instruction to fetch in the page and restart the
instruction?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Assuming that TAS() will succeed the first time is verboten

From

ncm@zembu.com (Nathan Myers)

Date:

09 January 2001, 01:41:52

On Mon, Jan 08, 2001 at 10:15:30PM -0500, Bruce Momjian wrote:
> > One last followup on that bizarreness about shutdown's checkpoint
> > failing on Alpha platforms ---
> > 
> > After changing the checkpoint code to loop, rather than assuming TAS()
> > must succeed the first time, I noticed that it always looped exactly
> > once.  This didn't make sense to me at the time, but after querying some
> > Alpha experts at DEC^H^H^HCompaq, it does now.  If a new process's first
> > write to a shared memory page is a stq_c, that stq_c is guaranteed to
> > fail (at least on Tru64 Unix), because it will page fault.  The shared
> > memory page is inherited read-only and is converted to read-write on
> > first fault.  This doesn't seem really necessary, but I suppose it's
> > done to share code with the copy-on-write case for non-shared pages
> > that are inherited via fork().
> 
> This seems quite bizarre.  Why would the process fail on the write, and
> not just pause and wait for the fault to bring in the page?  Doesn't the
> CPU halt the instruction to fetch in the page and restart the
> instruction?

This is normal, although non-intuitive.  (Good detective work, Tom.)  
The definition of load-locked/store-conditional says that if there's 
been an interrupt or trap (e.g. page fault) since the load-locked 
instruction executed, the store-conditional instruction fails.  That 
way you don't overwrite something that might have been written by 
another process that ran during the interval before you got the CPU
again.

Thus, the instruction does get restarted, but the lock has been 
(correctly) cleared, resulting in the need for failure/retry.  It's 
not a performance issue, because it only happens once per process.
Think of it as part of the cost of forking.

Nathan Myers
ncm@zembu.com
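
[Illustration] The behaviour described above can be mimicked with a toy, single-threaded model. This is not Alpha code: SimCpu, load_locked(), store_conditional() and tas_attempts() are hypothetical names, and the model assumes only what the thread states, namely that a trap between the load-locked and the store-conditional clears the lock flag, and that a new process's first write to a shared page takes a fault.

```c
#include <stdbool.h>

/* Toy model of one process's view of a spinlock word. */
typedef struct {
    bool lock_flag;      /* set by load-locked, cleared by any trap */
    bool page_writable;  /* false until the first write fault is handled */
    long mem;            /* the spinlock word: 0 = free, 1 = held */
} SimCpu;

/* ldq_l analogue: read the word and set the lock flag. */
long load_locked(SimCpu *c)
{
    c->lock_flag = true;
    return c->mem;
}

/* stq_c analogue: returns true only if the lock flag is still set. */
bool store_conditional(SimCpu *c, long value)
{
    if (!c->page_writable) {
        /* Write fault: the page becomes writable, but the trap
         * clears the lock flag before the store is restarted. */
        c->page_writable = true;
        c->lock_flag = false;
    }
    if (!c->lock_flag)
        return false;
    c->mem = value;
    c->lock_flag = false;
    return true;
}

/*
 * TAS in a retry loop.  Returns the number of attempts needed to take
 * the lock, or -1 if it was not taken within max_attempts.
 */
int tas_attempts(SimCpu *c, int max_attempts)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (load_locked(c) != 0)
            continue;            /* lock is held by someone else */
        if (store_conditional(c, 1))
            return attempt;
    }
    return -1;
}
```

In this model a process whose page is not yet writable needs two attempts (one failure, then success), while a process that has already written to the page succeeds on the first attempt, which matches the "always looped exactly once" observation.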

Re: Assuming that TAS() will succeed the first time is verboten

From

Tom Lane

Date:

09 January 2001, 02:03:53

Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> After changing the checkpoint code to loop, rather than assuming TAS()
>> must succeed the first time, I noticed that it always looped exactly
>> once.  This didn't make sense to me at the time, but after querying some
>> Alpha experts at DEC^H^H^HCompaq, it does now.  If a new process's first
>> write to a shared memory page is a stq_c, that stq_c is guaranteed to
>> fail (at least on Tru64 Unix), because it will page fault.  The shared
>> memory page is inherited read-only and is converted to read-write on
>> first fault.  This doesn't seem really necessary, but I suppose it's
>> done to share code with the copy-on-write case for non-shared pages
>> that are inherited via fork().

> This seems quite bizarre.  Why would the process fail on the write, and
> not just pause and wait for the fault to bring in the page?

An ordinary write would be re-executed and would succeed after the
page fault.  stq_c is different, because it's only supposed to succeed
if the processor has managed to hold an access lock on the target
address continuously since the ldq_l.  It would be very bad form to try
to hold the lock during a page fault.  (stq_c will also fail if the
processor is interrupted between ldq_l and stq_c, so occasional failures
are to be expected.  What was surprising me was the consistency of the
failure pattern.)

See the Alpha Architecture Manual if you really want to discuss this.
        regards, tom lane

Re: Assuming that TAS() will succeed the first time is verboten

From

Bruce Momjian

Date:

09 January 2001, 02:17:17

Oh, thanks.  That makes sense.

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> After changing the checkpoint code to loop, rather than assuming TAS()
> >> must succeed the first time, I noticed that it always looped exactly
> >> once.  This didn't make sense to me at the time, but after querying some
> >> Alpha experts at DEC^H^H^HCompaq, it does now.  If a new process's first
> >> write to a shared memory page is a stq_c, that stq_c is guaranteed to
> >> fail (at least on Tru64 Unix), because it will page fault.  The shared
> >> memory page is inherited read-only and is converted to read-write on
> >> first fault.  This doesn't seem really necessary, but I suppose it's
> >> done to share code with the copy-on-write case for non-shared pages
> >> that are inherited via fork().
> 
> > This seems quite bizarre.  Why would the process fail on the write, and
> > not just pause and wait for the fault to bring in the page?
> 
> An ordinary write would be re-executed and would succeed after the
> page fault.  stq_c is different, because it's only supposed to succeed
> if the processor has managed to hold an access lock on the target
> address continuously since the ldq_l.  It would be very bad form to try
> to hold the lock during a page fault.  (stq_c will also fail if the
> processor is interrupted between ldq_l and stq_c, so occasional failures
> are to be expected.  What was surprising me was the consistency of the
> failure pattern.)
> 
> See the Alpha Architecture Manual if you really want to discuss this.
> 
>             regards, tom lane
> 

