Thread: HOT updates & REDIRECT line pointers

HOT updates & REDIRECT line pointers

From
Robert Haas
Date:
When the root tuple of a HOT chain is dead, but there's still at least
one non-dead member of the chain, we end up with a REDIRECT line
pointer, which points to a USED (LP_NORMAL) line pointer, which in turn
points to a live tuple.  This means we're using 2 line pointers for
only 1 tuple.  Since line pointers are fairly small, that's not a
catastrophe, but I wonder if it might be possible to do better.

Specifically, I'm wondering if we couldn't get away with rearranging
things so that the root line pointer (which has index entries) points
to the actual tuple, and the other line pointer (which can't have any
index entries) gets marked UNUSED.  In other words, we essentially
bequeath the live tuple that is in effect the current root of the
chain to the original line pointer, and then recycle the line pointer
that formerly referenced that tuple.
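
As a rough illustration of the rearrangement described above, here is a
toy Python model of a heap page.  It is only a sketch, not PostgreSQL
code: LinePointer, resolve, collapse_redirect and used_slots are
hypothetical names invented for this example, and real line pointers
carry offsets and lengths rather than the tuple itself.

```python
from dataclasses import dataclass
from typing import Dict, Optional

UNUSED, NORMAL, REDIRECT = "UNUSED", "NORMAL", "REDIRECT"


@dataclass
class LinePointer:
    state: str = UNUSED
    target: Optional[int] = None      # REDIRECT: offset of another line pointer
    tuple_data: Optional[str] = None  # NORMAL: the tuple it references (toy stand-in)


def resolve(page: Dict[int, LinePointer], offset: int) -> Optional[str]:
    """Follow at most one redirect and return the tuple, or None."""
    lp = page[offset]
    if lp.state == REDIRECT:
        lp = page[lp.target]
    return lp.tuple_data if lp.state == NORMAL else None


def collapse_redirect(page: Dict[int, LinePointer], root: int) -> None:
    """Proposed change: the root takes over the tuple, the old slot is freed."""
    lp = page[root]
    if lp.state != REDIRECT:
        return
    old = lp.target
    page[root] = LinePointer(NORMAL, tuple_data=page[old].tuple_data)
    page[old] = LinePointer()  # UNUSED, available for recycling


def used_slots(page: Dict[int, LinePointer]) -> int:
    return sum(1 for lp in page.values() if lp.state != UNUSED)


# Offsets are 1-based, as on a heap page.  Offset 1 is the root that index
# entries point to; offset 2 holds the live heap-only tuple.
page = {
    1: LinePointer(REDIRECT, target=2),
    2: LinePointer(NORMAL, tuple_data="live row version"),
}

before = used_slots(page)
collapse_redirect(page, root=1)
after = used_slots(page)
print(before, after, resolve(page, 1))
```

In this model an index lookup through offset 1 finds the same tuple
before and after the collapse; what changes is the tuple's own TID,
which moves from offset 2 to offset 1.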

Now, the question is, is this safe?  Could someone, for example,
release a pin on the page in the middle of walking a HOT chain and
then reacquire the pin to walk the rest of the chain?  If they did,
they might miss a visible tuple altogether.  I think we typically keep
the pin until we're done with the page in such situations, in which
case it might be safe.  But I also think we've typically viewed that
as a performance optimization rather than as something that's
necessary for correctness.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: HOT updates & REDIRECT line pointers

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Specifically, I'm wondering if we couldn't get away with rearranging
> things so that the root line pointer (which has index entries) points
> to the actual tuple, and the other line pointer (which can't have any
> index entries) gets marked UNUSED.

This would amount to changing the TID of the live row.  In the past we
have considered that that can only be allowed when holding an exclusive
lock on the table (a la old-style VACUUM FULL).  An example of the sort
of situation where it'd be dangerous is an UPDATE involving a join,
which might read a tuple-to-be-updated (including its TID), then release
pin on that page for a while as it goes about its business with the
join, and eventually expect to come back and find the tuple still at the
same TID.  I believe there are applications that similarly expect a TID
that they've fetched to remain valid for as long as they're holding some
type of lock on the table or row.
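
The hazard can be sketched with a toy page model in Python.  This is an
illustration under simplifying assumptions, not the executor's actual
code path: fetch_by_tid and the tuple-as-string representation are
hypothetical, and pins, locks and visibility checks are left out.

```python
# Toy model: a page maps line-pointer offsets to ("REDIRECT", target),
# ("NORMAL", tuple) or ("UNUSED", None).  Not PostgreSQL code.

def fetch_by_tid(page, offset):
    """Fetch exactly the tuple stored at this offset, with no chain following."""
    state, payload = page[offset]
    return payload if state == "NORMAL" else None


page = {1: ("REDIRECT", 2), 2: ("NORMAL", "row v2")}

# Step 1: the scan side of a joined UPDATE reads the row and remembers its
# TID.  The tuple lives at offset 2, so that is the TID it records.
remembered_tid = 2
first_read = fetch_by_tid(page, remembered_tid)

# Step 2: the pin is released.  Meanwhile the proposed rearrangement runs:
# root offset 1 takes over the tuple and offset 2 is marked UNUSED.
page[1] = ("NORMAL", "row v2")
page[2] = ("UNUSED", None)

# Step 3: the UPDATE returns with its remembered TID and finds nothing.
stale = fetch_by_tid(page, remembered_tid)

# Worse, once offset 2 is recycled the same TID names an unrelated row.
page[2] = ("NORMAL", "unrelated row")
wrong = fetch_by_tid(page, remembered_tid)

print(first_read, stale, wrong)
```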
        regards, tom lane


Re: HOT updates & REDIRECT line pointers

From
Tom Lane
Date:
I wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Specifically, I'm wondering if we couldn't get away with rearranging
>> things so that the root line pointer (which has index entries) points
>> to the actual tuple, and the other line pointer (which can't have any
>> index entries) gets marked UNUSED.

> This would amount to changing the TID of the live row.

Another issue, quite independent from race conditions against other
observers of the row, is what if the tuple is part of an update chain?
You have no way to find the predecessor row version and update its
t_ctid forward link.
        regards, tom lane


Re: HOT updates & REDIRECT line pointers

From
Robert Haas
Date:
On Wed, Mar 21, 2012 at 8:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> Specifically, I'm wondering if we couldn't get away with rearranging
>>> things so that the root line pointer (which has index entries) points
>>> to the actual tuple, and the other line pointer (which can't have any
>>> index entries) gets marked UNUSED.
>
>> This would amount to changing the TID of the live row.
>
> Another issue, quite independent from race conditions against other
> observers of the row, is what if the tuple is part of an update chain?
> You have no way to find the predecessor row version and update its
> t_ctid forward link.

I don't see why I need to.  The predecessor, if any, points to the
root of the HOT chain; and that's exactly the TID that I'm proposing
to keep around.  The heap-only tuple's TID gets canned, but nobody can
be pointing to that from outside the block, IIUC.
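
A small Python sketch of that argument, using a hypothetical two-block
toy heap (the follow helper and the dict layout are made up for
illustration).  The toy assumes the moved tuple's own t_ctid is
rewritten to its new offset, which is a detail of this model only.

```python
# heap: {block: {offset: (state, payload)}}; a TID is (block, offset).
# Not PostgreSQL code.

def follow(heap, tid):
    """Resolve a TID, following one redirect, and return the row data."""
    block, offset = tid
    state, payload = heap[block][offset]
    if state == "REDIRECT":
        state, payload = heap[block][payload]
    return payload["data"] if state == "NORMAL" else None


heap = {
    # Predecessor version on another page; its forward link names the
    # ROOT of the HOT chain on block 7, not the heap-only tuple.
    0: {1: ("NORMAL", {"data": "old version", "t_ctid": (7, 1)})},
    7: {
        1: ("REDIRECT", 2),
        2: ("NORMAL", {"data": "current version", "t_ctid": (7, 2)}),
    },
}

predecessor = heap[0][1][1]
before = follow(heap, predecessor["t_ctid"])

# Apply the proposed rearrangement on block 7.
live = heap[7][2][1]
live["t_ctid"] = (7, 1)          # toy assumption: self-link follows the move
heap[7][1] = ("NORMAL", live)
heap[7][2] = ("UNUSED", None)

after = follow(heap, predecessor["t_ctid"])
print(before, after)
```

The predecessor's forward link resolves to the same row both times,
because only the heap-only offset (7, 2) was given up.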

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: HOT updates & REDIRECT line pointers

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Mar 21, 2012 at 8:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Another issue, quite independent from race conditions against other
>> observers of the row, is what if the tuple is part of an update chain?
>> You have no way to find the predecessor row version and update its
>> t_ctid forward link.

> I don't see why I need to.  The predecessor, if any, points to the
> root of the HOT chain;

Oh, right.  So scratch that objection.  The other one is still fatal
though ...
        regards, tom lane


Re: HOT updates & REDIRECT line pointers

From
Robert Haas
Date:
On Wed, Mar 21, 2012 at 8:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Oh, right.  So scratch that objection.  The other one is still fatal
> though ...

So, could we just decide that we don't care about preserving that
property any more, and document it as an incompatibility in whatever
release we break it in?  It strikes me that it likely wouldn't be any
worse than, oh, say, flipping the default value of
standard_conforming_strings, and we could even have a backward
compatibility GUC if we were so inclined.  I realize that the
standard_conforming_strings change was dictated by a desire to conform
to the SQL standards, and this isn't, but it seems awfully painful to
me to insist that this is a property that we can never give up.

I can remember one other proposal to which you raised this same
objection: the idea of an on-line tuple mover to handle the situation
where a user wishes to do an on-line reorganization of a bloated table
by incrementally moving tuples to lower-numbered pages.  It's possible
that the idea with which I started this thread might turn out to be
horribly unsafe for one reason or another, but I think that there's
broad support for the tuple mover concept.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: HOT updates & REDIRECT line pointers

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Mar 21, 2012 at 8:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Oh, right.  So scratch that objection.  The other one is still fatal
>> though ...

> So, could we just decide that we don't care about preserving that
> property any more, and document it as an incompatibility in whatever
> release we break it in?

No, I don't think so.  Especially not for such a picayune benefit as
getting rid of one item pointer a bit sooner.

> It strikes me that it likely wouldn't be any
> worse than, oh, say, flipping the default value of
> standard_conforming_strings,

Really?  It's taking away functionality and not supplying any substitute
(or at least you did not propose any).  In fact, you didn't even suggest
exactly how you propose to not break joined UPDATE/DELETE.
        regards, tom lane


Re: HOT updates & REDIRECT line pointers

From
Robert Haas
Date:
On Wed, Mar 21, 2012 at 9:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It strikes me that it likely wouldn't be any
>> worse than, oh, say, flipping the default value of
>> standard_conforming_strings,
>
> Really?  It's taking away functionality and not supplying any substitute
> (or at least you did not propose any).  In fact, you didn't even suggest
> exactly how you propose to not break joined UPDATE/DELETE.

Oh, hmm, interesting.  I had been thinking that you were talking about
a case where *user code* was relying on the semantics of the TID,
which has always struck me as an implementation detail that users
probably shouldn't get too attached to.  But now I see that you're
talking about something much more basic - the fundamental
implementation of UPDATE and DELETE relies on the TID not changing
under them.  That pretty much kills this idea dead in the water.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: HOT updates & REDIRECT line pointers

From
Simon Riggs
Date:
On Thu, Mar 22, 2012 at 1:28 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Mar 21, 2012 at 9:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> It strikes me that it likely wouldn't be any
>>> worse than, oh, say, flipping the default value of
>>> standard_conforming_strings,
>>
>> Really?  It's taking away functionality and not supplying any substitute
>> (or at least you did not propose any).  In fact, you didn't even suggest
>> exactly how you propose to not break joined UPDATE/DELETE.
>
> Oh, hmm, interesting.  I had been thinking that you were talking about
> a case where *user code* was relying on the semantics of the TID,
> which has always struck me as an implementation detail that users
> probably shouldn't get too attached to.  But now I see that you're
> talking about something much more basic - the fundamental
> implementation of UPDATE and DELETE relies on the TID not changing
> under them.  That pretty much kills this idea dead in the water.

Surely it just stops you using that idea 100% of the time. I don't see
why you can't have this co-exist with the current mechanism. So it
doesn't kill it for the common case.

But would the idea deliver much value? Is line pointer bloat a
problem? (I have no idea if it is/is not)

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: HOT updates & REDIRECT line pointers

From
Merlin Moncure
Date:
On Wed, Mar 21, 2012 at 8:28 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Mar 21, 2012 at 9:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> It strikes me that it likely wouldn't be any
>>> worse than, oh, say, flipping the default value of
>>> standard_conforming_strings,
>>
>> Really?  It's taking away functionality and not supplying any substitute
>> (or at least you did not propose any).  In fact, you didn't even suggest
>> exactly how you propose to not break joined UPDATE/DELETE.
>
> Oh, hmm, interesting.  I had been thinking that you were talking about
> a case where *user code* was relying on the semantics of the TID,
> which has always struck me as an implementation detail that users
> probably shouldn't get too attached to.

small aside: tid usage is the best method for kludging a delete/limit:

    delete from del where ctid = any (array(select ctid from del limit 10));

(via http://postgres.cz/wiki/PostgreSQL_SQL_Tricks)

merlin


Re: HOT updates & REDIRECT line pointers

From
Robert Haas
Date:
On Thu, Mar 22, 2012 at 9:31 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Surely it just stops you using that idea 100% of the time. I don't see
> why you can't have this co-exist with the current mechanism. So it
> doesn't kill it for the common case.

I guess you could use it if you knew that there were no DELETE or
UPDATE statements in progress on that table, but it seems like
figuring that out would be more trouble than it's worth.

> But would the idea deliver much value? Is line pointer bloat a
> problem? (I have no idea if it is/is not)

Good question.  I think that all things being equal it would be worth
getting rid of, but I'm not sure it's worth paying a lot for.  Maybe
when I have time I'll try to gather some statistics.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: HOT updates & REDIRECT line pointers

From
Pavan Deolasee
Date:


On Thu, Mar 22, 2012 at 6:58 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Mar 21, 2012 at 9:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> It strikes me that it likely wouldn't be any
>>> worse than, oh, say, flipping the default value of
>>> standard_conforming_strings,
>>
>> Really?  It's taking away functionality and not supplying any substitute
>> (or at least you did not propose any).  In fact, you didn't even suggest
>> exactly how you propose to not break joined UPDATE/DELETE.
>
> Oh, hmm, interesting.  I had been thinking that you were talking about
> a case where *user code* was relying on the semantics of the TID,
> which has always struck me as an implementation detail that users
> probably shouldn't get too attached to.  But now I see that you're
> talking about something much more basic - the fundamental
> implementation of UPDATE and DELETE relies on the TID not changing
> under them.  That pretty much kills this idea dead in the water.


IIRC, in early versions of HOT I tried to swap the TIDs of newer versions
with the older version to handle this problem, but soon realized that
it might turn out to be too tricky and error-prone.  The UPDATE/DELETE
problem is one example; any other piece of code that works with TIDs
and caches them across buffer lock/unlock could face the same issue.
But it would be worthwhile to revisit this and see if there is some
easy way to reclaim those redirect line pointers.  As long as the HOT
chain does not become entirely dead, there will always be the overhead
of that additional line pointer.

Thanks,
Pavan

Re: HOT updates & REDIRECT line pointers

From
Bruce Momjian
Date:
On Wed, Mar 21, 2012 at 09:28:22PM -0400, Robert Haas wrote:
> On Wed, Mar 21, 2012 at 9:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> It strikes me that it likely wouldn't be any
> >> worse than, oh, say, flipping the default value of
> >> standard_conforming_strings,
> >
> > Really?  It's taking away functionality and not supplying any substitute
> > (or at least you did not propose any).  In fact, you didn't even suggest
> > exactly how you propose to not break joined UPDATE/DELETE.
> 
> Oh, hmm, interesting.  I had been thinking that you were talking about
> a case where *user code* was relying on the semantics of the TID,
> which has always struck me as an implementation detail that users
> probably shouldn't get too attached to.  But now I see that you're
> talking about something much more basic - the fundamental
> implementation of UPDATE and DELETE relies on the TID not changing
> under them.  That pretty much kills this idea dead in the water.

Should this information be added to src/backend/access/heap/README.HOT?

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +