Thread: Small SSI issues

Small SSI issues

From
Heikki Linnakangas
Date:
It does wonders to take a couple of months' break from looking at a 
piece of code, and then review it in detail again. It's like a whole new 
pair of eyes :-).

Here's a bunch of small issues that I spotted:

* The oldserxid code is broken for non-default BLCKSZ.

   o The warning will come either too early or too late

   o There is no safeguard against actually wrapping around the
     SLRU, just the warning

   o I'm suspicious of the OldSerXidPagePrecedesLogically() logic
     with 32k BLCKSZ. In that case, OLDSERXID_MAX_PAGE is larger
     than necessary to cover the whole range of 2^32 transactions,
     so at high XIDs, say 2^32-1, doesn't it incorrectly think
     that low XIDs, say 10, are in the past, not the future?

  We've discussed these SLRU issues already, but still..

> /*
>  * Keep a pointer to the currently-running serializable transaction (if any)
>  * for quick reference.
>  * TODO SSI: Remove volatile qualifier and the then-unnecessary casts?
>  */
> static volatile SERIALIZABLEXACT *MySerializableXact = InvalidSerializableXact;

* So, should we remove it? In most places where contents of
  MySerializableXact are modified, we're holding
  SerializableXactHashLock in exclusive mode. However, there are
  two exceptions:

   o GetSafeSnapshot() modifies MySerializableXact->flags without
     any lock. It also inspects
     MySerializableXact->possibleUnsafeConflicts without a lock.
     What if someone sets some other flag in the flags bitmap just
     when GetSafeSnapshot() is about to set
     SXACT_FLAG_DEFERRABLE_WAITING? One of the flags might be lost
     :-(.

   o CheckForSerializableConflictIn() sets SXACT_FLAG_DID_WRITE
     without holding a lock. The same danger is here if someone
     else tries to set some other flag concurrently.

  I think we should simply acquire SerializableXactHashLock in
  GetSafeSnapshot(). It shouldn't be so much of a hotspot that it
  would make any difference in performance.
  CheckForSerializableConflictIn() might be called a lot, however,
  so maybe we need to check if the flag is set first, and only
  acquire the lock and set it if it's not set already.
 

* Is the SXACT_FLAG_ROLLED_BACK flag necessary? It's only set in
  ReleasePredicateLocks() for a fleeting moment while the function
  releases all conflicts and locks held by the transaction, and
  finally the sxact struct itself containing the flag. Also, isn't a
  transaction that's already been marked for death the same as one
  that has already rolled back, for the purposes of detecting
  conflicts?
 

* The HavePartialClearThrough/CanPartialClearThrough mechanism needs
  a better explanation. The only explanation currently is this:
 

>     if (--(PredXact->WritableSxactCount) == 0)
>     {
>         /*
>          * Release predicate locks and rw-conflicts in for all committed
>          * transactions.  There are no longer any transactions which might
>          * conflict with the locks and no chance for new transactions to
>          * overlap.  Similarly, existing conflicts in can't cause pivots,
>          * and any conflicts in which could have completed a dangerous
>          * structure would already have caused a rollback, so any
>          * remaining ones must be benign.
>          */
>         PredXact->CanPartialClearThrough = PredXact->LastSxactCommitSeqNo;
>     }
  If I understand that correctly, any predicate lock belonging to any
  already committed transaction can be safely zapped away at that
  instant. We don't do it immediately, because it might be expensive.
  Instead, we set CanPartialClearThrough, and do it lazily in
  ClearOldPredicateLocks(). But what is the purpose of
  HavePartialClearThrough?

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Small SSI issues

From
"Kevin Grittner"
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> Here's a bunch of small issues that I spotted:
Thanks for going over it again.  It is encouraging that you didn't
spot any *big* issues.
> * The oldserxid code is broken for non-default BLCKSZ.
>    o The warning will come either too early or too late
Good point.  That part is easily fixed -- if we want to keep the
warning, in light of the next few points.
>    o There is no safeguard against actually wrapping around the
>      SLRU, just the warning
Any thoughts on what we should do instead?  If someone holds open a
transaction long enough to burn through a billion transaction IDs
(or possibly less if someone uses a smaller BLCKSZ), should we
generate a FATAL error?  Of course, one other option would be to
allow more SLRU segment files, as you raised on another thread. 
Then this issue goes away because they would hit other, existing,
protections against transaction wraparound first.
>    o I'm suspicious of the OldSerXidPagePrecedesLogically() logic
>      with 32k BLCKSZ. In that case, OLDSERXID_MAX_PAGE is a larger
>      than necessary to cover the whole range of 2^32 transactions,
>      so at high XIDs, say 2^32-1, doesn't it incorrectly think
>      that low XIDs, say 10, are in the past, not the future?
I will look that over to see; but if this is broken, then one of the
other SLRU users is probably broken, because I think I stole this
code pretty directly from another spot.
>> /*
>>  * Keep a pointer to the currently-running serializable
>>  * transaction (if any) for quick reference.
>>  * TODO SSI: Remove volatile qualifier and the then-unnecessary
>>  * casts?
>>  */
>> static volatile SERIALIZABLEXACT *MySerializableXact = 
>>                     InvalidSerializableXact;
> 
> * So, should we remove it? In most places where contents of
>    MySerializableXact are modified, we're holding
>    SerializableXactHashLock in exclusive mode. However, there's
>    two exceptions:
> 
>    o GetSafeSnapshot() modifies MySerializableXact->flags without
>      any lock. It also inspects
>      MySerializableXact->possibleUnsafeConflicts without a lock.
>      What if somone sets some other flag in the flags bitmap just
>      when GetSafeSnapshot() is about to set
>      SXACT_FLAG_DEFERRABLE_WAITING? One of the flags might be lost
>      :-(.
> 
>    o CheckForSerializableConflictIn() sets SXACT_FLAG_DID_WRITE
>      without holding a lock. The same danger is here if someone
>      else tries to set some other flag concurrently.
> 
>    I think we should simply acquire SerializableXactHashLock in
>    GetSafeSnapshot(). It shouldn't be so much of a hotspot that it
>    would make any difference in performance.
>    CheckForSerializableConflictIn() might be called a lot,
>    however, so maybe we need to check if the flag is set first,
>    and only acquire the lock and set it if it's not set already.
OK.
Do checks such as that argue for keeping the volatile flag, or do
you think we can drop it if we make those changes?  (That would also
allow dropping a number of casts which exist just to avoid
warnings.)
> * Is the SXACT_FLAG_ROLLED_BACK flag necessary? It's only set in
>    ReleasePredicateLocks() for a fleeting moment while the
>    function releases all conflicts and locks held by the
>    transaction, and finally the sxact struct itself containing the
>    flag.
I think that one can go away.  It had more of a point many months
ago, before we properly sorted out what belongs in
PreCommit_CheckForSerializationFailure() and what belongs in
ReleasePredicateLocks().  At the point where we reached clarity on
that and moved things around, this flag probably became obsolete.
>    Also, isn't a transaction that's already been marked for death
>    the same as one that has already rolled back, for the purposes
>    of detecting conflicts?
Yes.
We should probably ignore any marked-for-death transaction during
conflict detection and serialization failure detection.  As a start,
anywhere there is now a check for rollback and not for this, we
should change it to this.  There may be some places this can be
checked which haven't yet been identified and touched.
> * The HavePartialClearThrough/CanPartialClearThrough mechanism
>   needs a better explanation. The only explanation currently is
>   this:
> 
>>  if (--(PredXact->WritableSxactCount) == 0)
>>  {
>>      /*
>>       * Release predicate locks and rw-conflicts in for
>>       * all committed transactions.  There are no longer any
>>       * transactions which might conflict with the locks and no
>>       * chance for new transactions to overlap.  Similarly,
>>       * existing conflicts in can't cause pivots, and any
>>       * conflicts in which could have completed a dangerous
>>       * structure would already have caused a rollback, so any
>>       * remaining ones must be benign.
>>       */
>>      PredXact->CanPartialClearThrough =
>>          PredXact->LastSxactCommitSeqNo;
>>  }
> 
>    If I understand that correctly, any predicate lock belonging to
>    any already committed transaction can be safely zapped away at
>    that instant.
Correct.
>    We don't do it immediately, because it might be expensive.
I think it has more to do with getting the LW locks right.  We make
the call to ClearOldPredicateLocks() farther down in the function,
so this is effectively setting things up for that call.
>    But what is the purpose of HavePartialClearThrough?
To avoid doing unnecessary work for completed transactions on which
we still need to keep some information but for which we were
previously able to clear predicate locks.  This relates to the
"mitigation" work discussed in these and other posts from around
that time:
http://archives.postgresql.org/pgsql-hackers/2010-10/msg01754.php
http://archives.postgresql.org/pgsql-hackers/2010-12/msg02119.php
I'm happy to work on modifications for any of this or to stay out of
your way if you want to.  Should I put together a patch for those
items where we seem to agree and have a clear way forward?
-Kevin


Re: Small SSI issues

From
Heikki Linnakangas
Date:
On 10.06.2011 18:05, Kevin Grittner wrote:
> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  wrote:
>>     o There is no safeguard against actually wrapping around the
>>       SLRU, just the warning
>
> Any thoughts on what we should do instead?  If someone holds open a
> transaction long enough to burn through a billion transaction IDs
> (or possibly less if someone uses a smaller BLCKSZ), should we
> generate a FATAL error?

FATAL is a bit harsh, ERROR seems more appropriate.

>>     o I'm suspicious of the OldSerXidPagePrecedesLogically() logic
>>       with 32k BLCKSZ. In that case, OLDSERXID_MAX_PAGE is a larger
>>       than necessary to cover the whole range of 2^32 transactions,
>>       so at high XIDs, say 2^32-1, doesn't it incorrectly think
>>       that low XIDs, say 10, are in the past, not the future?
>
> I will look that over to see; but if this is broken, then one of the
> other SLRU users is probably broken, because I think I stole this
> code pretty directly from another spot.

It was copied from async.c, which doesn't have this problem because it's 
not mapping XIDs to the SLRU. There, the max queue size is determined by 
the *_MAX_PAGE, and you point to a particular location in the SLRU 
simply with page+offset.

>>> /*
>>>   * Keep a pointer to the currently-running serializable
>>>   * transaction (if any) for quick reference.
>>>   * TODO SSI: Remove volatile qualifier and the then-unnecessary
>>>   * casts?
>>>   */
>>> static volatile SERIALIZABLEXACT *MySerializableXact =
>>>                      InvalidSerializableXact;
>>
>> * So, should we remove it? In most places where contents of
>>     MySerializableXact are modified, we're holding
>>     SerializableXactHashLock in exclusive mode. However, there's
>>     two exceptions:
>>
>>     o GetSafeSnapshot() modifies MySerializableXact->flags without
>>       any lock. It also inspects
>>       MySerializableXact->possibleUnsafeConflicts without a lock.
>>       What if somone sets some other flag in the flags bitmap just
>>       when GetSafeSnapshot() is about to set
>>       SXACT_FLAG_DEFERRABLE_WAITING? One of the flags might be lost
>>       :-(.
>>
>>     o CheckForSerializableConflictIn() sets SXACT_FLAG_DID_WRITE
>>       without holding a lock. The same danger is here if someone
>>       else tries to set some other flag concurrently.
>>
>>     I think we should simply acquire SerializableXactHashLock in
>>     GetSafeSnapshot(). It shouldn't be so much of a hotspot that it
>>     would make any difference in performance.
>>     CheckForSerializableConflictIn() might be called a lot,
>>     however, so maybe we need to check if the flag is set first,
>>     and only acquire the lock and set it if it's not set already.
>
> OK.
>
> Do checks such as that argue for keeping the volatile flag, or do
> you think we can drop it if we make those changes?  (That would also
> allow dropping a number of casts which exist just to avoid
> warnings.)

I believe we can drop it, I'll double-check.

> I'm happy to work on modifications for any of this or to stay out of
> your way if you want to.  Should I put together a patch for those
> items where we seem to agree and have a clear way forward?

I'll fix the MySerializableXact locking issue, and come back to the 
other issues on Monday. If you have the time and energy to write a patch 
by then, feel free, but I can look into them otherwise.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Small SSI issues

From
Dan Ports
Date:
On Fri, Jun 10, 2011 at 09:43:58PM +0300, Heikki Linnakangas wrote:
> > Do checks such as that argue for keeping the volatile flag, or do
> > you think we can drop it if we make those changes?  (That would also
> > allow dropping a number of casts which exist just to avoid
> > warnings.)
> 
> I believe we can drop it, I'll double-check.

Yes, dropping it seems like the thing to do. It's been on my list for a
while. We are not really getting anything out of declaring it volatile
since we cast the volatile qualifier away most of the time.

Dan

-- 
Dan R. K. Ports              MIT CSAIL                http://drkp.net/


Re: Small SSI issues

From
"Kevin Grittner"
Date:
> Dan Ports  wrote:
> On Fri, Jun 10, 2011 at 09:43:58PM +0300, Heikki Linnakangas wrote:
>>> Do checks such as that argue for keeping the volatile flag, or do
>>> you think we can drop it if we make those changes? (That would
>>> also allow dropping a number of casts which exist just to avoid
>>> warnings.)
>>
>> I believe we can drop it, I'll double-check.
> 
> Yes, dropping it seems like the thing to do. It's been on my list
> for a while. We are not really getting anything out of declaring it
> volatile since we cast the volatile qualifier away most of the
> time.
I'm not concerned about references covered by
SerializableXactHashLock.  I am more concerned about whether the
(MySerializableXact == InvalidSerializableXact) checks, and any other
tests not covered by that lock, are OK without the volatile qualifier
(and OK with it).  Since my knowledge of weak memory ordering
behavior is, well, weak, I didn't want to try to make that call.
-Kevin


Re: Small SSI issues

From
Dan Ports
Date:
On Sat, Jun 11, 2011 at 01:38:31PM -0500, Kevin Grittner wrote:
> I'm not concerned about references covered by
> SerializableXactHashLock.  I am more concerned about some of the
> tests for whether the (MySerializableXact == InvalidSerializableXact)
> checks and any other tests not covered by that lock are OK without it
> (and OK with it).  Since my knowledge of weak memory ordering
> behavior is, well, weak I didn't want to try to make that call.

Oh, those checks are definitely not an issue -- MySerializableXact
itself (the pointer, not the thing it's pointing to) is in
backend-local memory, so it won't change under us.

The volatile qualifier (as written) doesn't help with that anyway, it
attaches to the data being pointed to, not the pointer itself.

Dan

-- 
Dan R. K. Ports              MIT CSAIL                http://drkp.net/


Re: Small SSI issues

From
"Kevin Grittner"
Date:
> Heikki Linnakangas  wrote:
> On 10.06.2011 18:05, Kevin Grittner wrote:
>> Heikki Linnakangas wrote:
>>> o There is no safeguard against actually wrapping around the
>>> SLRU, just the warning
>>
>> Any thoughts on what we should do instead? If someone holds open a
>> transaction long enough to burn through a billion transaction IDs
>> (or possibly less if someone uses a smaller BLCKSZ), should we
>> generate a FATAL error?
>
> FATAL is a bit harsh, ERROR seems more appropriate.

If we don't cancel the long-running transaction, don't we continue to
have a problem?

>> Do checks such as that argue for keeping the volatile flag, or do
>> you think we can drop it if we make those changes? (That would
>> also allow dropping a number of casts which exist just to avoid
>> warnings.)
>
> I believe we can drop it, I'll double-check.

I see you committed a patch for this, but there were some casts which
become unnecessary with that change that you missed.  Patch attached
to clean up the ones I could spot.

-Kevin



Attachment

Re: Small SSI issues

From
Heikki Linnakangas
Date:
On 12.06.2011 17:59, Kevin Grittner wrote:
>> Heikki Linnakangas  wrote:
>> On 10.06.2011 18:05, Kevin Grittner wrote:
>>> Heikki Linnakangas wrote:
>>>> o There is no safeguard against actually wrapping around the
>>>> SLRU, just the warning
>>>
>>> Any thoughts on what we should do instead? If someone holds open a
>>> transaction long enough to burn through a billion transaction IDs
>>> (or possibly less if someone uses a smaller BLCKSZ), should we
>>> generate a FATAL error?
>>
>> FATAL is a bit harsh, ERROR seems more appropriate.
>
> If we don't cancel the long-running transaction, don't we continue to
> have a problem?

Yes, but ERROR is enough to kill the transaction. Unless it's in a 
subtransaction, I guess. But anyway, there's no guarantee that the 
long-running transaction will hit the problem first, or at all. You're 
much more likely to kill an innocent new transaction that tries to 
acquire its first locks, than an old long-running transaction that's 
been running for a while already, probably idle doing nothing, or doing 
a long seqscan on a large table with the whole table locked already.

> I see you committed a patch for this, but there were some casts which
> become unnecessary with that change that you missed.  Patch attached
> to clean up the ones I could spot.

Ah, thanks, applied. I didn't realize those were also only needed 
because of the volatile qualifier.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Small SSI issues

From
Heikki Linnakangas
Date:
On 10.06.2011 18:05, Kevin Grittner wrote:
> Heikki Linnakangas<heikki.linnakangas@enterprisedb.com>  wrote:
>> * Is the SXACT_FLAG_ROLLED_BACK flag necessary? It's only set in
>>     ReleasePredicateLocks() for a fleeting moment while the
>>     function releases all conflicts and locks held by the
>>     transaction, and finally the sxact struct itself containing the
>>     flag.
>
> I think that one can go away.  It had more of a point many months
> ago, before we properly sorted out what belongs in
> PreCommit_CheckForSerializationFailure() and what belongs in
> ReleasePredicateLocks().  At the point where we reached clarity on
> that and moved things around, this flag probably became obsolete.
>
>>     Also, isn't a transaction that's already been marked for death
>>     the same as one that has already rolled back, for the purposes
>>     of detecting conflicts?
>
> Yes.
>
> We should probably ignore any marked-for-death transaction during
> conflict detection and serialization failure detection.  As a start,
> anywhere there is now a check for rollback and not for this, we
> should change it to this.

Ok, I removed the SXACT_FLAG_ROLLED_BACK flag. I also renamed the 
marked-for-death flag into SXACT_FLAG_DOOMED; that's a lot shorter.

> There may be some places this can be
> checked which haven't yet been identified and touched.

Yeah - in 9.2.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Small SSI issues

From
"Kevin Grittner"
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

>> There may be some places this can be checked which haven't yet
>> been identified and touched.
>
> Yeah - in 9.2.

No argument here.  I'm all for stabilizing and getting the thing out
-- I think we've established that performance is good for many
workloads as it stands, and that there are workloads where it will
never be useful.  Chipping away at the gray area, to make it perform
well in a few workloads where it currently doesn't (and, of course,
*even better* in workloads where it's currently better than the
alternatives), seems like future release work to me.

There is one issue you raised in this post:

http://archives.postgresql.org/message-id/4DEF3194.6030305@enterprisedb.com

Robert questioned whether it should be 9.1 material here:

http://archives.postgresql.org/message-id/BANLkTint2i2fHDTdr=Xq3K=YrxegovGmTw@mail.gmail.com

I posted a patch here:

http://archives.postgresql.org/message-id/4DEFB169020000250003E3BD@gw.wicourts.gov

Should I put that patch into a 9.2 CF?

There is an unnecessary include of predicate.h in nbtree.c we should
delete.  That seems safe enough.

You questioned whether OldSerXidPagePrecedesLogically was buggy.  I
will look at that by this weekend at the latest.  If it is buggy we
obviously should fix that.  Are there any other changes you think we
should make to handle the odd corner cases in the SLRU for SSI?  It
did occur to me that we should be safe from actual overwriting of an
old entry by the normal transaction wrap-around protections -- the
worst that should happen with the current code (I think) is that in
extreme cases we may get LOG level messages or accumulate a
surprising number of SLRU segment files.  That's because SLRU will
start to nag about things at one billion transactions, but we need
to get all the way to two billion transactions used up while a
single serializable transaction remains active before we could
overwrite something.

It seems like it might be a good idea to apply pgindent formatting to
the latest SSI changes, to minimize conflict on back-patching any
bug fixes.  I've attached a patch to do this formatting -- entirely
whitespace changes from a pgindent run against selected files.

Unless I'm missing something, the only remaining changes needed are
for documentation (as mentioned in previous posts).  I will work on
those after I look at OldSerXidPagePrecedesLogically.

-Kevin


Attachment

Re: Small SSI issues

From
"Kevin Grittner"
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:
> Unless I'm missing something, the only remaining changes needed
> are for documentation (as mentioned in previous posts).
I just found notes that we also need regression tests for the
SSI/DDL combination and a comment in lazy_truncate_heap.
At any rate, not anything which is part of executable code....
-Kevin


Re: Small SSI issues

From
Heikki Linnakangas
Date:
On 15.06.2011 19:10, Kevin Grittner wrote:
> There is an unnecessary include of predicate.h in nbtree.c we should
> delete.  That seems safe enough.
>...
> It seems like it might be a good idea to apply pgindent formatting to
> the latest SSI changes, to minimize conflict on back-patching any
> bug fixes.  I've attached a patch to do this formatting -- entirely
> whitespace changes from a pgindent run against selected files.

Ok, committed the pgindent patch and removed the unnecessary include.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Small SSI issues

From
Heikki Linnakangas
Date:
On 15.06.2011 19:10, Kevin Grittner wrote:
> There is one issue you raised in this post:
>
> http://archives.postgresql.org/message-id/4DEF3194.6030305@enterprisedb.com
>
> Robert questioned whether it should be 9.1 material here:
>
> http://archives.postgresql.org/message-id/BANLkTint2i2fHDTdr=Xq3K=YrxegovGmTw@mail.gmail.com
>
> I posted a patch here:
>
> http://archives.postgresql.org/message-id/4DEFB169020000250003E3BD@gw.wicourts.gov
>
> Should I put that patch into a 9.2 CF?

Yeah. I've added it to the September commitfest.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Small SSI issues

From
"Kevin Grittner"
Date:
Heikki Linnakangas wrote:

> * The oldserxid code is broken for non-default BLCKSZ.
> o The warning will come either too early or too late
> o There is no safeguard against actually wrapping around the
> SLRU, just the warning
> o I'm suspicious of the OldSerXidPagePrecedesLogically() logic
> with 32k BLCKSZ. In that case, OLDSERXID_MAX_PAGE is a larger
> than necessary to cover the whole range of 2^32 transactions,
> so at high XIDs, say 2^32-1, doesn't it incorrectly think that
> low XIDs, say 10, are in the past, not the future?

It took me a while to see these problems because somehow I had
forgotten that SLRU_PAGES_PER_SEGMENT was a literal of 32 rather than
being based on BLCKSZ. After I rediscovered that, your concern was
clear enough.

I think the attached patch addresses the problem with the
OldSerXidPagePrecedesLogically() function, which strikes me as the
most serious issue.

Based on the calculation from the attached patch, it would be easy to
adjust the warning threshold, but I'm wondering if we should just rip
it out instead. As I mentioned in a recent post based on reviewing
your concerns, with an 8KB BLCKSZ the SLRU system will start thinking
we're into wraparound at one billion transactions, and refuse to
truncate segment files until we get down to less than that, but we
won't actually overwrite anything and cause SSI misbehaviors until we
eat through two billion (2^31 really) transactions while holding open
a single read-write transaction. At that point I think PostgreSQL
has other defenses which come into play. With the attached patch I
don't think we can have any such problems with a *larger* BLCKSZ, so
the only point of the warning would be for a BLCKSZ of 4KB or less.
Is it worth carrying the warning code (with an appropriate adjustment
to the thresholds) or should we drop it?

-Kevin

Attachment

Re: Small SSI issues

From
Robert Haas
Date:
On Sun, Jun 19, 2011 at 7:17 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Heikki Linnakangas wrote:
>
>> * The oldserxid code is broken for non-default BLCKSZ.
>> o The warning will come either too early or too late
>> o There is no safeguard against actually wrapping around the
>> SLRU, just the warning
>> o I'm suspicious of the OldSerXidPagePrecedesLogically() logic
>> with 32k BLCKSZ. In that case, OLDSERXID_MAX_PAGE is a larger
>> than necessary to cover the whole range of 2^32 transactions,
>> so at high XIDs, say 2^32-1, doesn't it incorrectly think that
>> low XIDs, say 10, are in the past, not the future?
>
> It took me a while to see these problems because somehow I had
> forgotten that SLRU_PAGES_PER_SEGMENT was a literal of 32 rather than
> being based on BLCKSZ. After I rediscovered that, your concern was
> clear enough.
>
> I think the attached patch addresses the problem with the
> OldSerXidPagePrecedesLogically() function, which strikes me as the
> most serious issue.
>
> Based on the calculation from the attached patch, it would be easy to
> adjust the warning threshold, but I'm wondering if we should just rip
> it out instead. As I mentioned in a recent post based on reviewing
> your concerns, with an 8KB BLCKSZ the SLRU system will start thinking
> we're into wraparound at one billion transactions, and refuse to
> truncate segment files until we get down to less than that, but we
> won't actually overwrite anything and cause SSI misbehaviors until we
> eat through two billion (2^31 really) transactions while holding open
> a single read-write transaction. At that point I think PostgreSQL
> has other defenses which come into play. With the attached patch I
> don't think we can have any such problems with a *larger* BLCKSZ, so
> the only point of the warning would be for a BLCKSZ of 4KB or less.
> Is it worth carrying the warning code (with an appropriate adjustment
> to the thresholds) or should we drop it?

Is this still an open item?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Small SSI issues

From
"Kevin Grittner"
Date:
Robert Haas <robertmhaas@gmail.com> wrote: 
> On Sun, Jun 19, 2011 at 7:17 PM, Kevin Grittner
> <Kevin.Grittner@wicourts.gov> wrote:
>> Heikki Linnakangas wrote:
>>
>>> * The oldserxid code is broken for non-default BLCKSZ.
>>> o The warning will come either too early or too late
>>> o There is no safeguard against actually wrapping around the
>>> SLRU, just the warning
>>> o I'm suspicious of the OldSerXidPagePrecedesLogically() logic
>>> with 32k BLCKSZ. In that case, OLDSERXID_MAX_PAGE is a larger
>>> than necessary to cover the whole range of 2^32 transactions,
>>> so at high XIDs, say 2^32-1, doesn't it incorrectly think that
>>> low XIDs, say 10, are in the past, not the future?
>>
>> It took me a while to see these problems because somehow I had
>> forgotten that SLRU_PAGES_PER_SEGMENT was a literal of 32 rather
>> than being based on BLCKSZ. After I rediscovered that, your
>> concern was clear enough.
>>
>> I think the attached patch addresses the problem with the
>> OldSerXidPagePrecedesLogically() function, which strikes me as
>> the most serious issue.
>>
>> Based on the calculation from the attached patch, it would be
>> easy to adjust the warning threshold, but I'm wondering if we
>> should just rip it out instead. As I mentioned in a recent post
>> based on reviewing your concerns, with an 8KB BLCKSZ the SLRU
>> system will start thinking we're into wraparound at one billion
>> transactions, and refuse to truncate segment files until we get
>> down to less than that, but we won't actually overwrite anything
>> and cause SSI misbehaviors until we eat through two billion (2^31
>> really) transactions while holding open a single read-write
>> transaction. At that point I think PostgreSQL has other defenses
>> which come into play. With the attached patch I don't think we
>> can have any such problems with a *larger* BLCKSZ, so the only
>> point of the warning would be for a BLCKSZ of 4KB or less. Is it
>> worth carrying the warning code (with an appropriate adjustment
>> to the thresholds) or should we drop it?
> 
> Is this still an open item?
Yes, although I'm not clear on whether people feel it is one which
needs to be fixed for 9.1 or left for 9.2.
On a build with a BLCKSZ less than 8KB we would not get a warning
before problems occurred, and we would have a more serious problem
involving potentially incorrect behavior.  Tom questioned whether
anyone actually builds with BLCKSZ less than 8KB, and I'm not
altogether sure that SLRUs dealing with transaction IDs would behave
correctly either.
On block sizes larger than 8KB it will issue the warning if you burn
through one billion transactions while holding one serializable
read-write transaction open, even though there won't be a problem.
With the larger BLCKSZ values it may also generate LOG level messages
about SLRU wraparound when that's not really a problem.
The patch posted with the quoted message will prevent the
misbehavior on smaller block sizes and the bogus log messages on
larger block sizes.  We'd need to change a couple more lines to get
the warning to trigger at the appropriate time for different block
sizes -- or we could just rip out the warning code.  (I didn't post
a patch for that because there wasn't a clear consensus about
whether to fix it, rip it out, or leave it alone for 9.1.)
-Kevin


Re: Small SSI issues

From
Robert Haas
Date:
On Tue, Jul 5, 2011 at 10:51 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
>> Is this still an open item?
>
> Yes, although I'm not clear on whether people feel it is one which
> needs to be fixed for 9.1 or left for 9.2.
>
> On a build with a BLCKSZ less than 8KB we would not get a warning
> before problems occurred, and we would have more serious problem
> involving potentially incorrect behavior.  Tom questioned whether
> anyone actually builds with BLCKSZ less than 8KB, and I'm not
> altogether sure that SLRUs dealing with transaction IDs would behave
> correctly either.
>
> On block sizes larger than 8KB it will issue the warning if you burn
> through one billion transactions while holding one serializable
> read-write transaction open, even though there won't be a problem.
> With the larger BLCKSZ values it may also generate LOG level messages
> about SLRU wraparound when that's not really a problem.

Well, as long as we can verify that OLDSERXID_MAX_PAGE has the same
value for BLCKSZ=8K before and after this patch, I don't see any real
downside to applying it.  If, hypothetically, it's buggy, it's only
going to break things for non-default block sizes which are, by your
description, not correct right now anyway.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Small SSI issues

From
"Kevin Grittner"
Date:
Robert Haas <robertmhaas@gmail.com> wrote:
> Well, as long as we can verify that OLDSERXID_MAX_PAGE has the
> same value for BLCKSZ=8K before and after this patch, I don't see
> any real downside to applying it.  If, hypothetically, it's buggy,
> it's only going to break things for non-default block sizes which
> are, by your description, not correct right now anyway.
Outside of a code comment, the entire patch consists of replacing
the definition of the OLDSERXID_MAX_PAGE macro.  The old definition
is:

  (SLRU_PAGES_PER_SEGMENT * 0x10000 - 1)

SLRU_PAGES_PER_SEGMENT is defined to be 32.  So this is:

  (32 * 0x10000) - 1 = 2097151

The new definition is the min of the old one and a value based on
BLCKSZ:

  ((MaxTransactionId + 1) / OLDSERXID_ENTRIESPERPAGE) - 1

Where "entries per page" is BLCKSZ / sizeof(uint64).

For an 8K BLCKSZ this is:

  ((0xffffffff + 1) / 1024) - 1 = 4194303

So the macro is guaranteed to have the same value as it currently
does for BLCKSZ of 16KB or lower.
-Kevin


Re: Small SSI issues

From
Robert Haas
Date:
On Tue, Jul 5, 2011 at 12:03 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Robert Haas <robertmhaas@gmail.com> wrote:
>
>> Well, as long as we can verify that OLDSERXID_MAX_PAGE has the
>> same value for BLCKSZ=8K before and after this patch, I don't see
>> any real downside to applying it.  If, hypothetically, it's buggy,
>> it's only going to break things for non-default block sizes which
>> are, by your description, not correct right now anyway.
>
> Outside of a code comment, the entire patch consists of replacing
> the definition of the OLDSERXID_MAX_PAGE macro.  The old definition
> is:
>
>  (SLRU_PAGES_PER_SEGMENT * 0x10000 - 1)
>
> SLRU_PAGES_PER_SEGMENT is defined to be 32.  So this is:
>
>  (32 * 0x10000) - 1 = 2097151
>
> The new definition is the min of the old one and a value based on
> BLCKSZ:
>
>  ((MaxTransactionId + 1) / OLDSERXID_ENTRIESPERPAGE) - 1
>
> Where "entries per page" is BLCKSZ / sizeof(uint64).
>
> For an 8K BLCKSZ this is:
>
>  ((0xffffffff + 1) / 1024) - 1 = 4194303
>
> So the macro is guaranteed to have the same value as it currently
> does for BLCKSZ of 16KB or lower.

I went ahead and committed this.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company