Thread: minor fix for acquire_inherited_sample_rows
Hi. acquire_inherited_sample_rows() currently uses equalTupleDescs() being false as the condition for going to tupconv.c to determine whether tuple conversion is needed. But equalTupleDescs() will always return false if it's passed TupleDesc's of two different tables, which is the most common case here. So I first thought we should just unconditionally go to tupconv.c, but there is still one case where we don't need to, which is the case where the child table is same as the parent table. However, it would be much cheaper to just check if the relation OIDs are different instead of calling equalTupleDescs, which the attached patch teaches it to do. Thanks, Amit
Attachment
On Mon, Apr 23, 2018 at 3:44 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > Hi. > > acquire_inherited_sample_rows() currently uses equalTupleDescs() being > false as the condition for going to tupconv.c to determine whether tuple > conversion is needed. But equalTupleDescs() will always return false if > it's passed TupleDesc's of two different tables, which is the most common > case here. So I first thought we should just unconditionally go to > tupconv.c, but there is still one case where we don't need to, which is > the case where the child table is same as the parent table. However, it > would be much cheaper to just check if the relation OIDs are different > instead of calling equalTupleDescs, which the attached patch teaches it to do. We want to check whether tuple conversion is needed. Theoretically (not necessarily practically), one could have tuple descs of two different tables stamped with the same tdtypeid. From that POV, equalTupleDescs seems to be a stronger check than what you have in the patch. BTW the patch you have posted also has a fix you proposed in some other thread. Probably you want to get rid of it. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Thanks for the review. On Mon, Apr 23, 2018 at 8:25 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote: > On Mon, Apr 23, 2018 at 3:44 PM, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> Hi. >> >> acquire_inherited_sample_rows() currently uses equalTupleDescs() being >> false as the condition for going to tupconv.c to determine whether tuple >> conversion is needed. But equalTupleDescs() will always return false if >> it's passed TupleDesc's of two different tables, which is the most common >> case here. So I first thought we should just unconditionally go to >> tupconv.c, but there is still one case where we don't need to, which is >> the case where the child table is same as the parent table. However, it >> would be much cheaper to just check if the relation OIDs are different >> instead of calling equalTupleDescs, which the attached patch teaches it to do. > > We want to check whether tuple conversion is needed. Theoretically > (not necessarily practically), one could have tuple descs of two > different tables stamped with the same tdtypeid. From that POV, > equalTupleDescs seems to be a stronger check than what you have in the > patch. If I'm reading right, that sounds like a +1 to the patch. :) > BTW the patch you have posted also has a fix you proposed in some > other thread. Probably you want to get rid of it. Oops, fixed that in the attached. Thanks, Amit
Attachment
Hello Amit Amit Langote wrote: > acquire_inherited_sample_rows() currently uses equalTupleDescs() being > false as the condition for going to tupconv.c to determine whether tuple > conversion is needed. But equalTupleDescs() will always return false if > it's passed TupleDesc's of two different tables, which is the most common > case here. So I first thought we should just unconditionally go to > tupconv.c, but there is still one case where we don't need to, which is > the case where the child table is same as the parent table. However, it > would be much cheaper to just check if the relation OIDs are different > instead of calling equalTupleDescs, which the attached patch teaches it to do. When (not if) we get around to updating equalTupleDescs to cope, we will need this call right where it is (and we'd have a hard time finding this potential callsite later, I think). If this were a hot spot I would be happy to change it, but it's not so I'd rather leave it alone. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Apr 23, 2018 at 6:45 PM, Amit Langote <amitlangote09@gmail.com> wrote: > Thanks for the review. > > On Mon, Apr 23, 2018 at 8:25 PM, Ashutosh Bapat > <ashutosh.bapat@enterprisedb.com> wrote: >> On Mon, Apr 23, 2018 at 3:44 PM, Amit Langote >> <Langote_Amit_f8@lab.ntt.co.jp> wrote: >>> Hi. >>> >>> acquire_inherited_sample_rows() currently uses equalTupleDescs() being >>> false as the condition for going to tupconv.c to determine whether tuple >>> conversion is needed. But equalTupleDescs() will always return false if >>> it's passed TupleDesc's of two different tables, which is the most common >>> case here. So I first thought we should just unconditionally go to >>> tupconv.c, but there is still one case where we don't need to, which is >>> the case where the child table is same as the parent table. However, it >>> would be much cheaper to just check if the relation OIDs are different >>> instead of calling equalTupleDescs, which the attached patch teaches it to do. >> >> We want to check whether tuple conversion is needed. Theoretically >> (not necessarily practically), one could have tuple descs of two >> different tables stamped with the same tdtypeid. From that POV, >> equalTupleDescs seems to be a stronger check than what you have in the >> patch. > > If I'm reading right, that sounds like a +1 to the patch. :) Not really! To make things clear, I am not in favour of this patch. > >> BTW the patch you have posted also has a fix you proposed in some >> other thread. Probably you want to get rid of it. > > Oops, fixed that in the attached. Thanks. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
On Mon, Apr 23, 2018 at 8:46 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > Hello Amit > > Amit Langote wrote: > >> acquire_inherited_sample_rows() currently uses equalTupleDescs() being >> false as the condition for going to tupconv.c to determine whether tuple >> conversion is needed. But equalTupleDescs() will always return false if >> it's passed TupleDesc's of two different tables, which is the most common >> case here. So I first thought we should just unconditionally go to >> tupconv.c, but there is still one case where we don't need to, which is >> the case where the child table is same as the parent table. However, it >> would be much cheaper to just check if the relation OIDs are different >> instead of calling equalTupleDescs, which the attached patch teaches it to do. > > When (not if) we get around to updating equalTupleDescs to cope, we will > need this call right where it is (and we'd have a hard time finding this > potential callsite later, I think). If this were a hot spot I would be > happy to change it, but it's not so I'd rather leave it alone. > +1 -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
On 2018/04/24 13:29, Ashutosh Bapat wrote: > On Mon, Apr 23, 2018 at 6:45 PM, Amit Langote <amitlangote09@gmail.com> wrote: >> On Mon, Apr 23, 2018 at 8:25 PM, Ashutosh Bapat wrote: >>> On Mon, Apr 23, 2018 at 3:44 PM, Amit Langote wrote: >>>> Hi. >>>> >>>> acquire_inherited_sample_rows() currently uses equalTupleDescs() being >>>> false as the condition for going to tupconv.c to determine whether tuple >>>> conversion is needed. But equalTupleDescs() will always return false if >>>> it's passed TupleDesc's of two different tables, which is the most common >>>> case here. So I first thought we should just unconditionally go to >>>> tupconv.c, but there is still one case where we don't need to, which is >>>> the case where the child table is same as the parent table. However, it >>>> would be much cheaper to just check if the relation OIDs are different >>>> instead of calling equalTupleDescs, which the attached patch teaches it to do. >>> >>> We want to check whether tuple conversion is needed. Theoretically >>> (not necessarily practically), one could have tuple descs of two >>> different tables stamped with the same tdtypeid. From that POV, >>> equalTupleDescs seems to be a stronger check than what you have in the >>> patch. >> >> If I'm reading right, that sounds like a +1 to the patch. :) > > Not really! To make things clear, I am not in favour of this patch. Oh okay. I read your last sentence as "equalTupleDescs() seems to be a stronger check than needed" (which is how I see it), but apparently that's not what you meant to say. Anyway, thanks for clarifying. Regards, Amit
Hi. On 2018/04/24 0:16, Alvaro Herrera wrote: > Hello Amit > > Amit Langote wrote: > >> acquire_inherited_sample_rows() currently uses equalTupleDescs() being >> false as the condition for going to tupconv.c to determine whether tuple >> conversion is needed. But equalTupleDescs() will always return false if >> it's passed TupleDesc's of two different tables, which is the most common >> case here. So I first thought we should just unconditionally go to >> tupconv.c, but there is still one case where we don't need to, which is >> the case where the child table is same as the parent table. However, it >> would be much cheaper to just check if the relation OIDs are different >> instead of calling equalTupleDescs, which the attached patch teaches it to do. > > When (not if) we get around to updating equalTupleDescs to cope, we will > need this call right where it is (and we'd have a hard time finding this > potential callsite later, I think). Given what equalTupleDescs was invented for (commit a152ebeec), reducing it down to what can be sensibly used for checking whether tuple conversion is needed between a parent and child will, I'm afraid, make it useless for its original purpose. Just looking at a few checks that are now in it, for example: for (i = 0; i < tupdesc1->natts; i++) { Form_pg_attribute attr1 = TupleDescAttr(tupdesc1, i); Form_pg_attribute attr2 = TupleDescAttr(tupdesc2, i); <...snip...> if (attr1->attislocal != attr2->attislocal) return false; if (attr1->attinhcount != attr2->attinhcount) return false; <...snip...> /* attacl, attoptions and attfdwoptions are not even present... */ } attislocal and attinhcount obviously can't be same for parent and a child. Also, the comment above seems to assume that this function is being called from the only place it was designed for (RelationClearRelation). Further down: if (tupdesc1->constr != NULL) { TupleConstr *constr1 = tupdesc1->constr; TupleConstr *constr2 = tupdesc2->constr; <...snip...> n = constr1->num_defval; if (n != (int) constr2->num_defval) return false; for (i = 0; i < n; i++) { AttrDefault *defval1 = constr1->defval + i; AttrDefault *defval2 = constr2->defval; <...snip...> if (strcmp(defval1->adbin, defval2->adbin) != 0) return false; Now this last one will always return false, because the location *appears* to be different in this usage. We'll have to believe that this test returns true in at least the cases for which equalTupleDescs() was designed for. Then there is code further down that checks other details being equal like constr->matching, constr->check, etc. For CHECK constraints, we again check ccbin, which might have different location values, their inheritance status, etc. To summarize, I think equalTupleDescs()'s implementation relies on lots of details to match between passed-in descriptors (down to basically everything) for them to be considered "logically" equal. Not to mention that "logically" here is for relcache.c's or RelationClearRelation()'s purposes. We're considering using this to determine if we need to convert tuples between parent and child/partition and for that we only need to check if attributes of the same name appear in the same physical position in both parent and child. Rest of the details are either guaranteed same (individual attributes's types, typemod, typcollate, etc.) or irrelevant to comparison (their atthasdef, default expressions, check constraints, etc.). I proposed to implement such a logic in tupconv.c itself [1], which I'll re-propose for PG 12. > If this were a hot spot I would be > happy to change it, but it's not so I'd rather leave it alone. Sure, but if we adopt some solution to optimize the check for whether we need to convert tuples for a hot spot, say, ExecInitPartitionInfo(), then I'd imagine we'd want to use the same solution everywhere. Thanks, Amit [1] https://www.postgresql.org/message-id/825031be-942c-8c24-6163-13c27f217a3d%40lab.ntt.co.jp
On Tue, Apr 24, 2018 at 6:19 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > Given what equalTupleDescs was invented for (commit a152ebeec), reducing > it down to what can be sensibly used for checking whether tuple conversion > is needed between a parent and child will, I'm afraid, make it useless for > its original purpose. Just looking at a few checks that are now in it, > for example: > > for (i = 0; i < tupdesc1->natts; i++) > { > Form_pg_attribute attr1 = TupleDescAttr(tupdesc1, i); > Form_pg_attribute attr2 = TupleDescAttr(tupdesc2, i); > > <...snip...> > > if (attr1->attislocal != attr2->attislocal) > return false; > if (attr1->attinhcount != attr2->attinhcount) > return false; > <...snip...> > /* attacl, attoptions and attfdwoptions are not even present... */ > } > > attislocal and attinhcount obviously can't be same for parent and a child. > Also, the comment above seems to assume that this function is being > called from the only place it was designed for (RelationClearRelation). +1. I think we're really abusing equalTupleDescs() for purposes for which it was not invented. Instead of changing it, let's invent a new function that tests for the thing partitioning cares about (same ordering of the same columns with the same type information) and call it logicallyEqualTupleDescs() or something like that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Apr 26, 2018 at 1:08 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Apr 24, 2018 at 6:19 AM, Amit Langote > <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> Given what equalTupleDescs was invented for (commit a152ebeec), reducing >> it down to what can be sensibly used for checking whether tuple conversion >> is needed between a parent and child will, I'm afraid, make it useless for >> its original purpose. Just looking at a few checks that are now in it, >> for example: >> >> for (i = 0; i < tupdesc1->natts; i++) >> { >> Form_pg_attribute attr1 = TupleDescAttr(tupdesc1, i); >> Form_pg_attribute attr2 = TupleDescAttr(tupdesc2, i); >> >> <...snip...> >> >> if (attr1->attislocal != attr2->attislocal) >> return false; >> if (attr1->attinhcount != attr2->attinhcount) >> return false; >> <...snip...> >> /* attacl, attoptions and attfdwoptions are not even present... */ >> } >> >> attislocal and attinhcount obviously can't be same for parent and a child. >> Also, the comment above seems to assume that this function is being >> called from the only place it was designed for (RelationClearRelation). > > +1. I think we're really abusing equalTupleDescs() for purposes for > which it was not invented. Instead of changing it, let's invent a new > function that tests for the thing partitioning cares about (same > ordering of the same columns with the same type information) and call > it logicallyEqualTupleDescs() or something like that. Why don't we just rely on the output of convert_tuples_by_name(), which it seems is always called right now? What's advantage of adding another tuple descriptor comparison? -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
On Thu, Apr 26, 2018 at 9:54 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote: > On Thu, Apr 26, 2018 at 1:08 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> +1. I think we're really abusing equalTupleDescs() for purposes for >> which it was not invented. Instead of changing it, let's invent a new >> function that tests for the thing partitioning cares about (same >> ordering of the same columns with the same type information) and call >> it logicallyEqualTupleDescs() or something like that. > > Why don't we just rely on the output of convert_tuples_by_name(), > which it seems is always called right now? What's advantage of adding > another tuple descriptor comparison? The patch I mentioned in my email above does more or less that (what you're saying we should do). In fact it even modifies convert_tuple_by_name and convert_tuple_by_name_map to remove some redundant computation. See that patch here if you're interested: https://www.postgresql.org/message-id/825031be-942c-8c24-6163-13c27f217a3d%40lab.ntt.co.jp Thanks, Amit
Amit Langote wrote: > On Thu, Apr 26, 2018 at 9:54 PM, Ashutosh Bapat > <ashutosh.bapat@enterprisedb.com> wrote: > > On Thu, Apr 26, 2018 at 1:08 AM, Robert Haas <robertmhaas@gmail.com> wrote: > >> +1. I think we're really abusing equalTupleDescs() for purposes for > >> which it was not invented. Instead of changing it, let's invent a new > >> function that tests for the thing partitioning cares about (same > >> ordering of the same columns with the same type information) and call > >> it logicallyEqualTupleDescs() or something like that. > > > > Why don't we just rely on the output of convert_tuples_by_name(), > > which it seems is always called right now? What's advantage of adding > > another tuple descriptor comparison? > > The patch I mentioned in my email above does more or less that (what > you're saying we should do). In fact it even modifies > convert_tuple_by_name and convert_tuple_by_name_map to remove some > redundant computation. See that patch here if you're interested: > > https://www.postgresql.org/message-id/825031be-942c-8c24-6163-13c27f217a3d%40lab.ntt.co.jp Yeah, I didn't like the additional nullness checks in that patch. I was thinking it might be saner to create a new struct with the AttrNumber * array, its length and a boolean flag indicating a no-op conversion. Then we can call map_variable_attnos unconditionally and it does nothing if the flag is set. This localizes the ugliness. I'm not sure if this idea is better than Robert's logicallyEqualTupleDescs(). Maybe we can use both in different places. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Apr 26, 2018 at 7:08 PM, Amit Langote <amitlangote09@gmail.com> wrote: > On Thu, Apr 26, 2018 at 9:54 PM, Ashutosh Bapat > <ashutosh.bapat@enterprisedb.com> wrote: >> On Thu, Apr 26, 2018 at 1:08 AM, Robert Haas <robertmhaas@gmail.com> wrote: >>> +1. I think we're really abusing equalTupleDescs() for purposes for >>> which it was not invented. Instead of changing it, let's invent a new >>> function that tests for the thing partitioning cares about (same >>> ordering of the same columns with the same type information) and call >>> it logicallyEqualTupleDescs() or something like that. >> >> Why don't we just rely on the output of convert_tuples_by_name(), >> which it seems is always called right now? What's advantage of adding >> another tuple descriptor comparison? > > The patch I mentioned in my email above does more or less that (what > you're saying we should do). In fact it even modifies > convert_tuple_by_name and convert_tuple_by_name_map to remove some > redundant computation. See that patch here if you're interested: > > https://www.postgresql.org/message-id/825031be-942c-8c24-6163-13c27f217a3d%40lab.ntt.co.jp I spent some time looking at the patch. The patch clubs all kinds of refactoring together, making review a bit difficult. I think, it would be better to split the patch into multiple, each addressing one set of changes, it might become easier to review. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
On 2018/04/27 22:42, Ashutosh Bapat wrote: > On Thu, Apr 26, 2018 at 7:08 PM, Amit Langote <amitlangote09@gmail.com> wrote: >> On Thu, Apr 26, 2018 at 9:54 PM, Ashutosh Bapat >> <ashutosh.bapat@enterprisedb.com> wrote: >>> On Thu, Apr 26, 2018 at 1:08 AM, Robert Haas <robertmhaas@gmail.com> wrote: >>>> +1. I think we're really abusing equalTupleDescs() for purposes for >>>> which it was not invented. Instead of changing it, let's invent a new >>>> function that tests for the thing partitioning cares about (same >>>> ordering of the same columns with the same type information) and call >>>> it logicallyEqualTupleDescs() or something like that. >>> >>> Why don't we just rely on the output of convert_tuples_by_name(), >>> which it seems is always called right now? What's advantage of adding >>> another tuple descriptor comparison? >> >> The patch I mentioned in my email above does more or less that (what >> you're saying we should do). In fact it even modifies >> convert_tuple_by_name and convert_tuple_by_name_map to remove some >> redundant computation. See that patch here if you're interested: >> >> https://www.postgresql.org/message-id/825031be-942c-8c24-6163-13c27f217a3d%40lab.ntt.co.jp > > I spent some time looking at the patch. The patch clubs all kinds of > refactoring together, making review a bit difficult. I think, it would > be better to split the patch into multiple, each addressing one set of > changes, it might become easier to review. Thanks Ashutosh for looking at it. I will think of a way to break it up when I re-propose it for the next cycle. Regards, Amit