Thread: Getting rid of setheapoverride (was Re: [COMMITTERS] heap.c)

Getting rid of setheapoverride (was Re: [COMMITTERS] heap.c)

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>>>> Tom, any chance we can change the name of setheapoverried to something
>>>> that makes sense?
>> 
>> Actually, I thought the plan was to eliminate it entirely in favor of
>> using CommandCounterIncrement when we need to make tuples visible.
>> There was a thread about that back in September, but I guess no one's
>> gotten around to actually doing it.

> I remember in the old days being totally confused about its purpose. 
> That was my motivation to change it.  Can I do something to help fix
> this?

Actually, according to my notes I had put off doing anything with this
because Hiroshi pointed out that CommandCounterIncrement had a shared-
cache-invalidation problem (it sent SI messages for changes that we
couldn't yet be sure would get committed).

Hiroshi's message last Monday stated that he'd fixed that problem,
so maybe now it's safe to start using CommandCounterIncrement more
heavily.  Hiroshi, what do you think --- do you trust
CommandCounterIncrement now?
        regards, tom lane


RE: Getting rid of setheapoverride (was Re: [COMMITTERS] heap.c)

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> 
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >>>> Tom, any chance we can change the name of setheapoverried to 
> something
> >>>> that makes sense?
> >> 
> >> Actually, I thought the plan was to eliminate it entirely in favor of
> >> using CommandCounterIncrement when we need to make tuples visible.
> >> There was a thread about that back in September, but I guess no one's
> >> gotten around to actually doing it.
> 
> > I remember in the old days being totally confused about its purpose. 
> > That was my motivation to change it.  Can I do something to help fix
> > this?
> 
> Actually, according to my notes I had put off doing anything with this
> because Hiroshi pointed out that CommandCounterIncrement had a shared-
> cache-invalidation problem (it sent SI messages for changes that we
> couldn't yet be sure would get committed).
> 
> Hiroshi's message last Monday stated that he'd fixed that problem,
> so maybe now it's safe to start using CommandCounterIncrement more
> heavily.  Hiroshi, what do you think --- do you trust
> CommandCounterIncrement now?
>

Oh,I was just looking at heapoverride stuff quite accidentally. 
Yes, this call is ugly and should be replaced by CommandCounterIncrement().

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp


"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> Oh,I was just looking at heapoverride stuff quite accidentally. 
> Yes, this call is ugly and should be replaced by CommandCounterIncrement().

OK, I'm running a build now with setheapoverride calls removed.
Will see what happens.

About half of the setheapoverride calls surrounded heap_update()
(formerly called heap_replace()) calls.  AFAICS there is no need
for these calls unless heap_update itself needs them --- but there
are many calls to heap_update that do not have setheapoverride.
Perhaps heap_replace once needed setheapoverride but no longer does?

I am going to try just removing these calls without adding a
CommandCounterIncrement to replace them.  If anyone knows that
this is a bad idea, let me know!
        regards, tom lane


Re: [HACKERS] RE: Getting rid of setheapoverride (was Re: [COMMITTERS] heap.c)

From
Bruce Momjian
Date:
> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> > Oh,I was just looking at heapoverride stuff quite accidentally. 
> > Yes, this call is ugly and should be replaced by CommandCounterIncrement().
> 
> OK, I'm running a build now with setheapoverride calls removed.
> Will see what happens.
> 
> About half of the setheapoverride calls surrounded heap_update()
> (formerly called heap_replace()) calls.  AFAICS there is no need
> for these calls unless heap_update itself needs them --- but there
> are many calls to heap_update that do not have setheapoverride.
> Perhaps heap_replace once needed setheapoverride but no longer does?
> 
> I am going to try just removing these calls without adding a
> CommandCounterIncrement to replace them.  If anyone knows that
> this is a bad idea, let me know!

Go for it. The setheapoverride name was so confusing, people just
probably left it in, not knowing what it did.

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


>>>> Oh,I was just looking at heapoverride stuff quite accidentally. 
>>>> Yes, this call is ugly and should be replaced by CommandCounterIncrement()
>> 
>> OK, I'm running a build now with setheapoverride calls removed.
>> Will see what happens.

Well, it seems to work, but...

>> About half of the setheapoverride calls surrounded heap_update()
>> (formerly called heap_replace()) calls.  AFAICS there is no need
>> for these calls unless heap_update itself needs them --- but there
>> are many calls to heap_update that do not have setheapoverride.

I figured out that the cases where setheapoverride (or, now,
CommandCounterIncrement) were needed were the cases where the
heap_update might be updating a tuple created earlier in the
same command.  pg_operator.c has some cases like that, but many of
the other uses of setheapoverride seem to be unnecessary.

However, I'm a little uncomfortable with committing this change,
because my first try at it worked OK at creating things but fell
right over on DROP TABLE.  I had replaced the setheapoverride(true)
in heap_drop_with_catalog() (backend/catalog/heap.c) with a
CommandCounterIncrement, and it failed with this backtrace:

(gdb) bt
#0  elog (lev=-1, fmt=0x6b160 "cannot find attribute %d of relation %s")   at elog.c:112
#1  0x1767d8 in build_tupdesc_ind (buildinfo={infotype = 1, i = {       info_id = 19040,       info_name = 0x4a60}},
relation=0x4006ef00,natts=1074198864) at relcache.c:527
 
#2  0x176554 in RelationBuildTupleDesc (buildinfo={infotype = 1, i = {       info_id = 19040,       info_name = 0x4a60
}},  relation=0x1, natts=1074198864) at relcache.c:437
 
#3  0x177230 in RelationBuildDesc (buildinfo={infotype = 1, i = {       info_id = 19040,       info_name = 0x4a60 }},
oldrelation=0x4006ef00)at relcache.c:808
 
#4  0x177b28 in RelationClearRelation (relation=0x4006ef00, rebuildIt=0 '\000')   at relcache.c:1279
#5  0x177bbc in RelationFlushRelation (relationPtr=0xffffffff,   onlyFlushReferenceCountZero=96 '`') at
relcache.c:1320
#6  0x177e10 in RelationIdInvalidateRelationCacheByRelationId (   relationId=19040) at relcache.c:1415
#7  0x175968 in CacheIdInvalidate (cacheId=4294967295, hashIndex=438624,   pointer=0x1) at inval.c:544
#8  0x175ae8 in InvalidationMessageCacheInvalidate (message=0x4007cce4)   at inval.c:657
#9  0x175490 in LocalInvalidInvalidate (invalid=0x4007cce4 "r",   function=0x4000c3ca <DINFINITY+9226>, freemember=1
'\001')at inval.c:173
 
#10 0x175ca4 in ImmediateLocalInvalidation (send=-1 '�') at inval.c:806
#11 0x9d0b0 in AtCommit_LocalCache () at xact.c:687
#12 0x9cf70 in CommandCounterIncrement () at xact.c:520
#13 0xa7a08 in heap_drop_with_catalog (relname=0x4006ef00 "����")   at heap.c:1528
#14 0xb16ac in RemoveRelation (name=0x40083328 "ff1") at creatinh.c:217
#15 0x13ce84 in ProcessUtility (parsetree=0x40083370, dest=Remote)   at utility.c:201
#16 0x13add0 in pg_exec_query_dest (query_string=0x40024270 "drop table ff1",   dest=Remote, aclOverride=1 '\001') at
postgres.c:721

Apparently, if I call CommandCounterIncrement while partway through
dropping a relation, it will try to rebuild the relcache entry for
the relation --- and of course fail.  I'm too tired to figure out
whether this is just a small coding error in the new cache invalidation
code or whether it's a serious limitation in the whole design.  Hiroshi,
what do you think?

I was able to get around this by simply removing CommandCounterIncrement
from heap_drop_with_catalog entirely --- dropping tables seems to work
fine that way ... but I don't trust it ...
        regards, tom lane


RE: [HACKERS] RE: Getting rid of setheapoverride (was Re: [COMMITTERS] heap.c)

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>
> >>>> Oh,I was just looking at heapoverride stuff quite accidentally.
> >>>> Yes, this call is ugly and should be replaced by
> CommandCounterIncrement()
> >>
> >> OK, I'm running a build now with setheapoverride calls removed.
> >> Will see what happens.
>
> Well, it seems to work, but...
>
> >> About half of the setheapoverride calls surrounded heap_update()
> >> (formerly called heap_replace()) calls.  AFAICS there is no need
> >> for these calls unless heap_update itself needs them --- but there
> >> are many calls to heap_update that do not have setheapoverride.
>
> I figured out that the cases where setheapoverride (or, now,
> CommandCounterIncrement) were needed were the cases where the
> heap_update might be updating a tuple created earlier in the
> same command.  pg_operator.c has some cases like that, but many of
> the other uses of setheapoverride seem to be unnecessary.
>
> However, I'm a little uncomfortable with committing this change,
> because my first try at it worked OK at creating things but fell
> right over on DROP TABLE.  I had replaced the setheapoverride(true)
> in heap_drop_with_catalog() (backend/catalog/heap.c) with a
> CommandCounterIncrement, and it failed with this backtrace:
>
> (gdb) bt
> #0  elog (lev=-1, fmt=0x6b160 "cannot find attribute %d of relation %s")
>     at elog.c:112
> #1  0x1767d8 in build_tupdesc_ind (buildinfo={infotype = 1, i = {
>         info_id = 19040,
>         info_name = 0x4a60}},
>     relation=0x4006ef00, natts=1074198864) at relcache.c:527
> #2  0x176554 in RelationBuildTupleDesc (buildinfo={infotype = 1, i = {
>         info_id = 19040,
>         info_name = 0x4a60 }},
>     relation=0x1, natts=1074198864) at relcache.c:437
> #3  0x177230 in RelationBuildDesc (buildinfo={infotype = 1, i = {
>         info_id = 19040,
>         info_name = 0x4a60 }},
>     oldrelation=0x4006ef00) at relcache.c:808
> #4  0x177b28 in RelationClearRelation (relation=0x4006ef00,
> rebuildIt=0 '\000')
>     at relcache.c:1279
> #5  0x177bbc in RelationFlushRelation (relationPtr=0xffffffff,
>     onlyFlushReferenceCountZero=96 '`') at relcache.c:1320
> #6  0x177e10 in RelationIdInvalidateRelationCacheByRelationId (
>     relationId=19040) at relcache.c:1415
> #7  0x175968 in CacheIdInvalidate (cacheId=4294967295, hashIndex=438624,
>     pointer=0x1) at inval.c:544
> #8  0x175ae8 in InvalidationMessageCacheInvalidate (message=0x4007cce4)
>     at inval.c:657
> #9  0x175490 in LocalInvalidInvalidate (invalid=0x4007cce4 "r",
>     function=0x4000c3ca <DINFINITY+9226>, freemember=1 '\001') at
> inval.c:173
> #10 0x175ca4 in ImmediateLocalInvalidation (send=-1 '\xA0') at inval.c:806
> #11 0x9d0b0 in AtCommit_LocalCache () at xact.c:687
> #12 0x9cf70 in CommandCounterIncrement () at xact.c:520
> #13 0xa7a08 in heap_drop_with_catalog (relname=0x4006ef00 "\xA0\xA0\xA0\xA0")
>     at heap.c:1528
> #14 0xb16ac in RemoveRelation (name=0x40083328 "ff1") at creatinh.c:217
> #15 0x13ce84 in ProcessUtility (parsetree=0x40083370, dest=Remote)
>     at utility.c:201
> #16 0x13add0 in pg_exec_query_dest (query_string=0x40024270 "drop
> table ff1",
>     dest=Remote, aclOverride=1 '\001') at postgres.c:721
>
> Apparently, if I call CommandCounterIncrement while partway through
> dropping a relation, it will try to rebuild the relcache entry for
> the relation --- and of course fail.  I'm too tired to figure out
> whether this is just a small coding error in the new cache invalidation
> code or whether it's a serious limitation in the whole design.  Hiroshi,
> what do you think?
>

Hmmm,CommandCounterIncrement() was about to rebuild relation decriptor
for the half dropped table and failed. It seems impossible to call Command-
CounterIncrement() in heap_drop_with_catalog.

I don't understand why DeleteTypeTuple() require setheapoverride().
Does anyone know ?
If no one knows,we had better remove setheapoverride().

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp



Re: [HACKERS] RE: Getting rid of setheapoverride (was Re: [COMMITTERS] heap.c)

From
Bruce Momjian
Date:
> >> About half of the setheapoverride calls surrounded heap_update()
> >> (formerly called heap_replace()) calls.  AFAICS there is no need
> >> for these calls unless heap_update itself needs them --- but there
> >> are many calls to heap_update that do not have setheapoverride.
> 
> I figured out that the cases where setheapoverride (or, now,
> CommandCounterIncrement) were needed were the cases where the
> heap_update might be updating a tuple created earlier in the
> same command.  pg_operator.c has some cases like that, but many of
> the other uses of setheapoverride seem to be unnecessary.

I thought about that this morning and suspected this may be the case,
though I thought tuples would be visible to the same transaction
automatically.  Hard to imagine why we would not want such visibility in
all cases.


> 
> However, I'm a little uncomfortable with committing this change,
> because my first try at it worked OK at creating things but fell
> right over on DROP TABLE.  I had replaced the setheapoverride(true)
> in heap_drop_with_catalog() (backend/catalog/heap.c) with a
> CommandCounterIncrement, and it failed with this backtrace:
> 
> (gdb) bt
> #0  elog (lev=-1, fmt=0x6b160 "cannot find attribute %d of relation %s")
>     at elog.c:112
> #1  0x1767d8 in build_tupdesc_ind (buildinfo={infotype = 1, i = {
>         info_id = 19040,
>         info_name = 0x4a60}},
>     relation=0x4006ef00, natts=1074198864) at relcache.c:527
> #2  0x176554 in RelationBuildTupleDesc (buildinfo={infotype = 1, i = {
>         info_id = 19040,
>         info_name = 0x4a60 }},
>     relation=0x1, natts=1074198864) at relcache.c:437
> #3  0x177230 in RelationBuildDesc (buildinfo={infotype = 1, i = {
>         info_id = 19040,
>         info_name = 0x4a60 }},
>     oldrelation=0x4006ef00) at relcache.c:808
> #4  0x177b28 in RelationClearRelation (relation=0x4006ef00, rebuildIt=0 '\000')
>     at relcache.c:1279
> #5  0x177bbc in RelationFlushRelation (relationPtr=0xffffffff,
>     onlyFlushReferenceCountZero=96 '`') at relcache.c:1320
> #6  0x177e10 in RelationIdInvalidateRelationCacheByRelationId (
>     relationId=19040) at relcache.c:1415
> #7  0x175968 in CacheIdInvalidate (cacheId=4294967295, hashIndex=438624,
>     pointer=0x1) at inval.c:544
> #8  0x175ae8 in InvalidationMessageCacheInvalidate (message=0x4007cce4)
>     at inval.c:657
> #9  0x175490 in LocalInvalidInvalidate (invalid=0x4007cce4 "r",
>     function=0x4000c3ca <DINFINITY+9226>, freemember=1 '\001') at inval.c:173
> #10 0x175ca4 in ImmediateLocalInvalidation (send=-1 '�') at inval.c:806
> #11 0x9d0b0 in AtCommit_LocalCache () at xact.c:687
> #12 0x9cf70 in CommandCounterIncrement () at xact.c:520
> #13 0xa7a08 in heap_drop_with_catalog (relname=0x4006ef00 "����")
>     at heap.c:1528
> #14 0xb16ac in RemoveRelation (name=0x40083328 "ff1") at creatinh.c:217
> #15 0x13ce84 in ProcessUtility (parsetree=0x40083370, dest=Remote)
>     at utility.c:201
> #16 0x13add0 in pg_exec_query_dest (query_string=0x40024270 "drop table ff1",
>     dest=Remote, aclOverride=1 '\001') at postgres.c:721
> 
> Apparently, if I call CommandCounterIncrement while partway through
> dropping a relation, it will try to rebuild the relcache entry for
> the relation --- and of course fail.  I'm too tired to figure out
> whether this is just a small coding error in the new cache invalidation
> code or whether it's a serious limitation in the whole design.  Hiroshi,
> what do you think?
> 
> I was able to get around this by simply removing CommandCounterIncrement
> from heap_drop_with_catalog entirely --- dropping tables seems to work
> fine that way ... but I don't trust it ...

With buggy code like that, it seems we could just try removing them all,
and adding them back in as part of beta testing as people find problems.

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
>> Apparently, if I call CommandCounterIncrement while partway through
>> dropping a relation, it will try to rebuild the relcache entry for
>> the relation --- and of course fail.  I'm too tired to figure out
>> whether this is just a small coding error in the new cache invalidation
>> code or whether it's a serious limitation in the whole design.  Hiroshi,
>> what do you think?

> Hmmm,CommandCounterIncrement() was about to rebuild relation decriptor
> for the half dropped table and failed. It seems impossible to call Command-
> CounterIncrement() in heap_drop_with_catalog.

> I don't understand why DeleteTypeTuple() require setheapoverride().

As far as I can tell, it doesn't --- drop table seems to work just fine
without it.

I have been thinking some more about this, and have come to the
conclusion that it is only safe to call CommandCounterIncrement
at points where you have a self-consistent catalog tuple state.
In particular it must be possible to build valid relcache entries
from whatever tuples you are making visible with the increment.

For example, it's OK for heap_create to call C.C.I. after creating the
relation's pg_class, pg_type, and pg_attribute entries; the relation
is not finished, since it lacks default and constraint info, but
relcache.c won't have a problem with that.  (Note that heap_create
explicitly forces a rebuild of the relcache entry after it's added
that extra stuff!)

It is *not* OK for heap_drop to call C.C.I. where it was doing it,
because it had already deleted the pg_attribute tuples, but was still
holding a refcount lock on the relcache entry for the target relation.
(If the refcount were zero, then relcache.c would have just dropped
the entry instead of trying to rebuild it...)

The heap_drop code was risky even in its original form of
setheapoverride, since had a relcache rebuild been caused during
DeleteTypeTuple(), it would have failed.  (This could happen,
in the current state of the code, if an SI Reset message arrives
and gets processed while DeleteTypeTuple is trying to open pg_type.)
Switching to CommandCounterIncrement just exposed the latent bug
by forcing the rebuild attempt to occur.

In short, I have convinced myself that this is all fine.  I will
finish ripping out setheapoverride and commit the changes tonight.
Should be able to simplify tqual.c a little bit now that we don't
need the override code.
        regards, tom lane


Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> I figured out that the cases where setheapoverride (or, now,
>> CommandCounterIncrement) were needed were the cases where the
>> heap_update might be updating a tuple created earlier in the
>> same command.  pg_operator.c has some cases like that, but many of
>> the other uses of setheapoverride seem to be unnecessary.

> I thought about that this morning and suspected this may be the case,
> though I thought tuples would be visible to the same transaction
> automatically.  Hard to imagine why we would not want such visibility in
> all cases.

Normally you *don't* want tuples created/updated in the current command
to be visible.  Consider an UPDATE proceeding by sequential scan.  As it
finds tuples it needs to update, the updated versions of those tuples
will get added to the end of the relation.  Eventually the UPDATE will
reach those tuples and be scanning its own output!  Thanks to the
visibility rule, it will ignore those new tuples as not-yet-visible.
Without that, something as simple as "UPDATE t SET f = f + 1" would be
an infinite loop.

CommandCounterIncrement() is like a statement boundary inside a
transaction: after you call it, you can see the effects of your
prior operation (but no one else can; it's not a commit).
        regards, tom lane


Re: [HACKERS] RE: Getting rid of setheapoverride (was Re: [COMMITTERS] heap.c)

From
Bruce Momjian
Date:
> As far as I can tell, it doesn't --- drop table seems to work just fine
> without it.
> 
> I have been thinking some more about this, and have come to the
> conclusion that it is only safe to call CommandCounterIncrement
> at points where you have a self-consistent catalog tuple state.
> In particular it must be possible to build valid relcache entries
> from whatever tuples you are making visible with the increment.

This is a good analysis.

> 
> For example, it's OK for heap_create to call C.C.I. after creating the
> relation's pg_class, pg_type, and pg_attribute entries; the relation
> is not finished, since it lacks default and constraint info, but
> relcache.c won't have a problem with that.  (Note that heap_create
> explicitly forces a rebuild of the relcache entry after it's added
> that extra stuff!)
> 
> It is *not* OK for heap_drop to call C.C.I. where it was doing it,
> because it had already deleted the pg_attribute tuples, but was still
> holding a refcount lock on the relcache entry for the target relation.
> (If the refcount were zero, then relcache.c would have just dropped
> the entry instead of trying to rebuild it...)
> 
> The heap_drop code was risky even in its original form of
> setheapoverride, since had a relcache rebuild been caused during
> DeleteTypeTuple(), it would have failed.  (This could happen,
> in the current state of the code, if an SI Reset message arrives
> and gets processed while DeleteTypeTuple is trying to open pg_type.)
> Switching to CommandCounterIncrement just exposed the latent bug
> by forcing the rebuild attempt to occur.

This is an excellent point.  We know we have some instability in
creating/droping tables in separate sessions at the same time.  This may
not fix that, but it is clearly an issue that an SI message could arrive
at that time.

> 
> In short, I have convinced myself that this is all fine.  I will
> finish ripping out setheapoverride and commit the changes tonight.
> Should be able to simplify tqual.c a little bit now that we don't
> need the override code.

I know I am responsible for at least one of those function calls.  I
remember asking about it in the past.  I have added my first two emails
about this below.  I may have added it whenever I did heap_update
because I never knew what it did, and the name was confusing to me.

---------------------------------------------------------------------------

From maillist Fri Aug 14 22:22:06 1998Date: Fri, 14 Aug 1998 22:22:06 -0400 (EDT)X-Mailer: ELM [version 2.4ME+ PL43
(25)]MIME-Version:1.0Content-Type: text/plain; charset=US-ASCIIContent-Transfer-Encoding: 7bitXFMstatus: 0000To:
(PostgreSQL-development)<hackers@postgreSQL.org>Subject: setheapoverrideContent-Length:   346Status: ROCan someone tell
mewhat setheapoverride() does?  I see it aroundheap_replace a lot.
 

---------------------------------------------------------------------------
Date: Sat, 18 Sep 1999 17:25:43 -0400 (EDT)X-Mailer: ELM [version 2.4ME+ PL56 (25)]MIME-Version: 1.0Content-Type:
text/plain;charset=US-ASCIIContent-Transfer-Encoding: 7bitXFMstatus: 0000To: Tom Lane <tgl@sss.pgh.pa.us>Subject: Re:
[HACKERS]setheapoverride() considered harmfulCc: pgsql-hackers@postgreSQL.orgContent-Length:   628Status: RO> I think
weneed to get rid of setheapoverride().I have always wondered what it did.  It is in my personal TODO with
aquestionmark. Never figured out its purpose.> since this way the tuples still look valid if we look at them again>
laterin the same command.> > Comments?  Anyone know a reason not to get rid of setheapoverride?Yes, please remove it.
 

--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026