Thread: pointless check in RelationBuildPartitionDesc
I noticed this strange hack in RelationBuildPartitionDesc: /* * It is possible that the pg_class tuple of a partition has not been * updated yet to set its relpartbound field. The only case where * this happens is when we open the parent relation to check using its * partition descriptor that a new partition's bound does not overlap * some existing partition. */ if (!((Form_pg_class) GETSTRUCT(tuple))->relispartition) { ReleaseSysCache(tuple); continue; } After looking, it seems that this is just self-inflicted pain: for some reason, we store the pg_inherits row for a partition, and immediately afterwards compute and store its partition bound, which requires the above hack. But if we do things in the opposite order, this is no longer needed. I propose to remove it, as in the attached patch. -- Álvaro Herrera Developer, https://www.PostgreSQL.org/
Attachment
On 2018/09/04 6:39, Alvaro Herrera wrote: > I noticed this strange hack in RelationBuildPartitionDesc: > > /* > * It is possible that the pg_class tuple of a partition has not been > * updated yet to set its relpartbound field. The only case where > * this happens is when we open the parent relation to check using its > * partition descriptor that a new partition's bound does not overlap > * some existing partition. > */ > if (!((Form_pg_class) GETSTRUCT(tuple))->relispartition) > { > ReleaseSysCache(tuple); > continue; > } > > After looking, it seems that this is just self-inflicted pain: for some > reason, we store the pg_inherits row for a partition, and immediately > afterwards compute and store its partition bound, which requires the > above hack. But if we do things in the opposite order, this is no > longer needed. I propose to remove it, as in the attached patch. +1. I remember having facepalmed at this before and had also written a patch but never got around to submitting it. Thanks, Amit
On Tue, Sep 04, 2018 at 09:47:07AM +0900, Amit Langote wrote: > On 2018/09/04 6:39, Alvaro Herrera wrote: >> After looking, it seems that this is just self-inflicted pain: for some >> reason, we store the pg_inherits row for a partition, and immediately >> afterwards compute and store its partition bound, which requires the >> above hack. But if we do things in the opposite order, this is no >> longer needed. I propose to remove it, as in the attached patch. > > +1. I remember having facepalmed at this before and had also written a > patch but never got around to submitting it. Ok, I see. It seems to me that this could be replaced by an elog(ERROR), as relispartition ought to be set anyway. This way any future callers would get things done in the correct order. -- Michael
Attachment
On 2018/09/04 10:19, Michael Paquier wrote: > On Tue, Sep 04, 2018 at 09:47:07AM +0900, Amit Langote wrote: >> On 2018/09/04 6:39, Alvaro Herrera wrote: >>> After looking, it seems that this is just self-inflicted pain: for some >>> reason, we store the pg_inherits row for a partition, and immediately >>> afterwards compute and store its partition bound, which requires the >>> above hack. But if we do things in the opposite order, this is no >>> longer needed. I propose to remove it, as in the attached patch. >> >> +1. I remember having facepalmed at this before and had also written a >> patch but never got around to submitting it. > > Ok, I see. It seems to me that this could be replaced by an > elog(ERROR), as relispartition ought to be set anyway. This way any > future callers would get things done in the correct order. Converting it to elog(ERROR, ...) might be a good idea. Thanks, Amit
On 2018-Sep-04, Amit Langote wrote: > On 2018/09/04 10:19, Michael Paquier wrote: > > On Tue, Sep 04, 2018 at 09:47:07AM +0900, Amit Langote wrote: > >> On 2018/09/04 6:39, Alvaro Herrera wrote: > >>> After looking, it seems that this is just self-inflicted pain: for some > >>> reason, we store the pg_inherits row for a partition, and immediately > >>> afterwards compute and store its partition bound, which requires the > >>> above hack. But if we do things in the opposite order, this is no > >>> longer needed. I propose to remove it, as in the attached patch. > >> > >> +1. I remember having facepalmed at this before and had also written a > >> patch but never got around to submitting it. > > > > Ok, I see. It seems to me that this could be replaced by an > > elog(ERROR), as relispartition ought to be set anyway. This way any > > future callers would get things done in the correct order. > > Converting it to elog(ERROR, ...) might be a good idea. I think it'd be pointless noise. If we really want to protect against that, I think we should promote the Assert for relpartbound's isnull flag into an if test. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018/09/04 13:08, Alvaro Herrera wrote: > On 2018-Sep-04, Amit Langote wrote: > >> On 2018/09/04 10:19, Michael Paquier wrote: >>> On Tue, Sep 04, 2018 at 09:47:07AM +0900, Amit Langote wrote: >>>> On 2018/09/04 6:39, Alvaro Herrera wrote: >>>>> After looking, it seems that this is just self-inflicted pain: for some >>>>> reason, we store the pg_inherits row for a partition, and immediately >>>>> afterwards compute and store its partition bound, which requires the >>>>> above hack. But if we do things in the opposite order, this is no >>>>> longer needed. I propose to remove it, as in the attached patch. >>>> >>>> +1. I remember having facepalmed at this before and had also written a >>>> patch but never got around to submitting it. >>> >>> Ok, I see. It seems to me that this could be replaced by an >>> elog(ERROR), as relispartition ought to be set anyway. This way any >>> future callers would get things done in the correct order. >> >> Converting it to elog(ERROR, ...) might be a good idea. > > I think it'd be pointless noise. If we really want to protect against > that, I think we should promote the Assert for relpartbound's isnull > flag into an if test. So that we can elog(ERROR, ...) if isnull is true? If so, I agree, because that's what should've been done here anyway (for a value read from the disk that is). I think we should check relispartition then too. Thanks, Amit
On 2018-Sep-04, Amit Langote wrote: > On 2018/09/04 13:08, Alvaro Herrera wrote: > > I think it'd be pointless noise. If we really want to protect against > > that, I think we should promote the Assert for relpartbound's isnull > > flag into an if test. > > So that we can elog(ERROR, ...) if isnull is true? If so, I agree, > because that's what should've been done here anyway (for a value read from > the disk that is). Yes. I wonder now what's the value of using an Assert for this, BTW: if those are enabled, then we'll crash saying "this column is null and shouldn't!". If they are disabled, we'll instead crash (possibly in production) trying to decode the field. What was achieved? With an elog(ERROR) the reaction is the expected one: the current transaction is aborted, but the server continues to run. > I think we should check relispartition then too. Well, we don't check every single catalog flag in every possible code path just to ensure it's sane. I don't see any reason to do differently here. We rely heavily on the catalog contents to be correct, and install defenses against user-induced corruption that would cause us to crash; but going much further than that would be excessive IMO. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Proposed patch. Checking isnull in a elog(ERROR) is important, because the column is not marked NOT NULL. This is not true for other columns where we simply do Assert(!isnull). -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On 2018/09/04 21:51, Alvaro Herrera wrote: > On 2018-Sep-04, Amit Langote wrote: > >> On 2018/09/04 13:08, Alvaro Herrera wrote: > >>> I think it'd be pointless noise. If we really want to protect against >>> that, I think we should promote the Assert for relpartbound's isnull >>> flag into an if test. >> >> So that we can elog(ERROR, ...) if isnull is true? If so, I agree, >> because that's what should've been done here anyway (for a value read from >> the disk that is). > > Yes. > > I wonder now what's the value of using an Assert for this, BTW: if those > are enabled, then we'll crash saying "this column is null and shouldn't!". > If they are disabled, we'll instead crash (possibly in production) > trying to decode the field. What was achieved? I can see how that the Assert was pointless all along. > With an elog(ERROR) the reaction is the expected one: the current > transaction is aborted, but the server continues to run. > >> I think we should check relispartition then too. > > Well, we don't check every single catalog flag in every possible code > path just to ensure it's sane. I don't see any reason to do differently > here. We rely heavily on the catalog contents to be correct, and > install defenses against user-induced corruption that would cause us to > crash; but going much further than that would be excessive IMO. Okay, sure anyway. Thanks, Amit
On 2018/09/05 1:50, Alvaro Herrera wrote: > Proposed patch. Checking isnull in a elog(ERROR) is important, because > the column is not marked NOT NULL. This is not true for other columns > where we simply do Assert(!isnull). Looks good. Thanks for taking care of other sites as well. @@ -14705,7 +14705,9 @@ ATExecDetachPartition(Relation rel, RangeVar *name) (void) SysCacheGetAttr(RELOID, tuple, Anum_pg_class_relpartbound, &isnull); - Assert(!isnull); + if (isnull) + elog(ERROR, "null relpartbound for relation %u", + RelationGetRelid(partRel)); In retrospect, I'm not sure why this piece of code is here at all; maybe just remove the SycCacheGetAttr and Assert? Regards, Amit
On 2018-Sep-05, Amit Langote wrote: > On 2018/09/05 1:50, Alvaro Herrera wrote: > > Proposed patch. Checking isnull in a elog(ERROR) is important, because > > the column is not marked NOT NULL. This is not true for other columns > > where we simply do Assert(!isnull). > > Looks good. Thanks for taking care of other sites as well. > > @@ -14705,7 +14705,9 @@ ATExecDetachPartition(Relation rel, RangeVar *name) > > (void) SysCacheGetAttr(RELOID, tuple, Anum_pg_class_relpartbound, > &isnull); > - Assert(!isnull); > + if (isnull) > + elog(ERROR, "null relpartbound for relation %u", > + RelationGetRelid(partRel)); > > In retrospect, I'm not sure why this piece of code is here at all; maybe > just remove the SycCacheGetAttr and Assert? Yeah, good idea, will do. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services