Thread: Properly mark NULL returns in numeric aggregates
Hi hackers,

We found that several functions -- namely numeric_combine, numeric_avg_combine, numeric_poly_combine, and int8_avg_combine -- are returning NULL without signaling the nullity of the datum in fcinfo.isnull. This is obscured by the fact that the only functions in core (the finalfuncs for various aggregates) that those return values feed into happen to tolerate (or rather, not quite distinguish) zero-but-not-NULL trans values.

In Greenplum, this behavior becomes problematic because Greenplum serializes internal trans values before spilling the hash table. The serial functions (numeric_serialize and friends) are strict functions that will blow up when given null (either in the C sense or the SQL sense) inputs. In Postgres, if we change hash aggregation in the future to spill the hash table (as opposed to the input tuples), this issue would manifest itself in the final aggregate, because we'll serialize the combined (and likely incorrectly null) trans values.

Please find attached a small patch fixing said issue. Originally reported by Denis Smirnov over at https://github.com/greenplum-db/gpdb/pull/9878

Cheers,
Jesse and Deep
Attachment
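For readers without the attachment, here is a rough sketch of the kind of change meant here, modeled on the general shape of numeric_combine in numeric.c. This is an illustration, not the attached patch itself, and the merge logic is elided with a comment. The problematic pattern is returning a NULL pointer wrapped in a Datum while leaving fcinfo->isnull unset; the fix is to report a proper SQL NULL.

    Datum
    numeric_combine(PG_FUNCTION_ARGS)
    {
        NumericAggState *state1;
        NumericAggState *state2;
        MemoryContext    agg_context;

        if (!AggCheckCallContext(fcinfo, &agg_context))
            elog(ERROR, "aggregate function called in non-aggregate context");

        state1 = PG_ARGISNULL(0) ? NULL : (NumericAggState *) PG_GETARG_POINTER(0);
        state2 = PG_ARGISNULL(1) ? NULL : (NumericAggState *) PG_GETARG_POINTER(1);

        if (state2 == NULL)
        {
            /*
             * Previously: PG_RETURN_POINTER(state1), which for state1 == NULL
             * hands back a NULL pointer Datum with fcinfo->isnull still false.
             * Signal the SQL NULL explicitly instead.
             */
            if (state1 == NULL)
                PG_RETURN_NULL();
            PG_RETURN_POINTER(state1);
        }

        /* ... merge state2 into state1 (copying into agg_context if needed) ... */

        PG_RETURN_POINTER(state1);
    }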
Hi,

On 2020-04-09 16:22:11 -0700, Jesse Zhang wrote:
> We found that several functions -- namely numeric_combine,
> numeric_avg_combine, numeric_poly_combine, and int8_avg_combine -- are
> returning NULL without signaling the nullity of the datum in
> fcinfo.isnull. This is obscured by the fact that the only functions in
> core (the finalfuncs for various aggregates) that those return values
> feed into happen to tolerate (or rather, not quite distinguish)
> zero-but-not-NULL trans values.

Shouldn't these just be marked as strict?

Greetings,

Andres Freund
Andres Freund <andres@anarazel.de> writes:
> On 2020-04-09 16:22:11 -0700, Jesse Zhang wrote:
>> We found that several functions -- namely numeric_combine,
>> numeric_avg_combine, numeric_poly_combine, and int8_avg_combine -- are
>> returning NULL without signaling the nullity of the datum in
>> fcinfo.isnull. This is obscured by the fact that the only functions in
>> core (the finalfuncs for various aggregates) that those return values
>> feed into happen to tolerate (or rather, not quite distinguish)
>> zero-but-not-NULL trans values.

> Shouldn't these just be marked as strict?

No, certainly not --- they need to be able to act on null inputs. The question is how careful do we need to be about representing null results as "real" nulls rather than NULL pointers.

        regards, tom lane
Hi Andres,

On Fri, Apr 10, 2020 at 12:14 PM Andres Freund <andres@anarazel.de> wrote:
>
> Shouldn't these just be marked as strict?
>

Are you suggesting that because none of the corresponding trans functions (int8_avg_accum, int2_accum, and friends) ever output NULL? That's what we thought, but then I realized that an input to a combine function is not necessarily the output of a trans function invocation: for example, when there is a "FILTER (WHERE ...)" clause that filters out every tuple in a group, the partial aggregate might just hand a NULL state to the final aggregate to combine.

On the other hand, we examined the corresponding final functions (numeric_stddev_pop and friends), and they all seem to carefully treat a NULL trans value the same as a "zero input" (as in, state.N == 0 && state.NaNcount == 0). That does suggest to me that it should be fine to declare those combine functions as strict (barring the restriction that they should not be STRICT -- does anybody remember why?).

Cheers,
Jesse and Deep
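For reference, this is roughly the check at the top of numeric_stddev_internal in numeric.c that makes those final functions tolerant of a NULL trans value (quoted from memory, so treat it as a sketch rather than the exact source):

    if (state == NULL || (state->N + state->NaNcount) == 0)
    {
        /* empty aggregation: report the result as SQL NULL */
        *is_null = true;
        return NULL;
    }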
Jesse Zhang <sbjesse@gmail.com> writes:
> On the other hand, we examined the corresponding final functions
> (numeric_stddev_pop and friends), and they all seem to carefully treat a
> NULL trans value the same as a "zero input" (as in, state.N == 0 &&
> state.NaNcount == 0). That does suggest to me that it should be fine to
> declare those combine functions as strict (barring the restriction that
> they should not be STRICT -- does anybody remember why?).

They can't be strict because the initial iteration needs to produce something from a null state and non-null input. nodeAgg's default behavior won't work for those because nodeAgg doesn't know how to copy a value of type "internal".

        regards, tom lane
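To make that concrete, here is roughly the shape of the state1 == NULL branch in numeric_combine (again a from-memory sketch, not an exact quote): the combine function itself, not nodeAgg, builds the long-lived copy of the incoming partial state in the aggregate memory context, which is why it has to accept a NULL first argument.

    if (state1 == NULL)
    {
        MemoryContext old_context;

        /* First call: build a long-lived copy of state2 in the agg context. */
        old_context = MemoryContextSwitchTo(agg_context);
        state1 = makeNumericAggStateCurrentContext(true);
        /* ... copy state2's counts and accumulated sums into state1 ... */
        MemoryContextSwitchTo(old_context);

        PG_RETURN_POINTER(state1);
    }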
On Fri, Apr 10, 2020 at 3:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> They can't be strict because the initial iteration needs to produce
> something from a null state and non-null input. nodeAgg's default
> behavior won't work for those because nodeAgg doesn't know how to
> copy a value of type "internal".
>
> regards, tom lane

Ah, I think I get it. A copy must happen because the input is likely in a shorter-lived memory context than the state, but nodeAgg's default behavior of copying a by-value datum won't really copy the object pointed to by the pointer wrapped in the datum of "internal" type, so we defer to the combine function. Am I right? Then it follows kinda naturally that those combine functions have been sloppy on arrival since commit 11c8669c0cc.

Cheers,
Jesse
Jesse Zhang <sbjesse@gmail.com> writes:
> On Fri, Apr 10, 2020 at 3:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> They can't be strict because the initial iteration needs to produce
>> something from a null state and non-null input. nodeAgg's default
>> behavior won't work for those because nodeAgg doesn't know how to
>> copy a value of type "internal".

> Ah, I think I get it. A copy must happen because the input is likely in
> a shorter-lived memory context than the state, but nodeAgg's default
> behavior of copying a by-value datum won't really copy the object
> pointed to by the pointer wrapped in the datum of "internal" type, so we
> defer to the combine function. Am I right? Then it follows kinda
> naturally that those combine functions have been sloppy on arrival since
> commit 11c8669c0cc.

Yeah, they're relying exactly on the assumption that nodeAgg is not going to try to copy a value declared "internal", and therefore they can be loosey-goosey about whether the value pointer is null or not. However, if you want to claim that that's wrong, you have to explain why it's okay for some other code to be accessing a value that's declared "internal". I'd say that the meaning of that is precisely "keepa u hands off".

In the case at hand, the current situation is that we only expect the values returned by these combine functions to be read by the associated final functions, which are on board with the null-pointer representation of an empty result. Your argument is essentially that it should be possible to feed the values to the aggregate's associated serialization function as well. But the core code never does that, so I'm not convinced that we should add it to the requirements; we'd be unable to test it.

        regards, tom lane
On Tue, 14 Apr 2020 at 06:14, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah, they're relying exactly on the assumption that nodeAgg is not
> going to try to copy a value declared "internal", and therefore they
> can be loosey-goosey about whether the value pointer is null or not.
> However, if you want to claim that that's wrong, you have to explain
> why it's okay for some other code to be accessing a value that's
> declared "internal". I'd say that the meaning of that is precisely
> "keepa u hands off".
>
> In the case at hand, the current situation is that we only expect the
> values returned by these combine functions to be read by the associated
> final functions, which are on board with the null-pointer representation
> of an empty result. Your argument is essentially that it should be
> possible to feed the values to the aggregate's associated serialization
> function as well. But the core code never does that, so I'm not convinced
> that we should add it to the requirements; we'd be unable to test it.

Casting my mind back to when I originally wrote that code, I attempted to write it in such a way that it could one day be used for 3-stage aggregation, e.g. Parallel Partial Aggregate -> Gather -> Combine Serial Aggregate on one node, then on some master node a Deserial Combine Finalize Aggregate. You're very right that we can't craft such a plan with today's master (we didn't even add a supporting enum for it in AggSplit). However, it does appear that there are extensions or forks out there which attempt to use the code in this way, so it would be good not to leave those people out in the cold regarding this.

For testing, can't we just have an Assert() in advance_transition_function that verifies isnull matches the nullability of the return value for INTERNAL-returning transfns? i.e., the attached.

I don't have a test case to hand that could cause this to fail, but it sounds like Jesse might.

David
Attachment
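For readers without the attachment, here is a sketch of the kind of Assert being proposed (an illustration of the idea, not the attached patch itself; newVal, fcinfo, and pertrans are the existing locals in advance_transition_function in nodeAgg.c):

    newVal = FunctionCallInvoke(fcinfo);

    /*
     * For INTERNAL-returning transition/combine functions, insist that the
     * isnull flag and a NULL pointer result agree with each other.
     */
    Assert(pertrans->aggtranstype != INTERNALOID ||
           fcinfo->isnull == (DatumGetPointer(newVal) == NULL));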
Hi David,

On Mon, Apr 13, 2020 at 10:46 PM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Tue, 14 Apr 2020 at 06:14, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Yeah, they're relying exactly on the assumption that nodeAgg is not
> > going to try to copy a value declared "internal", and therefore they
> > can be loosey-goosey about whether the value pointer is null or not.
> > However, if you want to claim that that's wrong, you have to explain
> > why it's okay for some other code to be accessing a value that's
> > declared "internal". I'd say that the meaning of that is precisely
> > "keepa u hands off".
> >
> > In the case at hand, the current situation is that we only expect the
> > values returned by these combine functions to be read by the associated
> > final functions, which are on board with the null-pointer representation
> > of an empty result. Your argument is essentially that it should be
> > possible to feed the values to the aggregate's associated serialization
> > function as well. But the core code never does that, so I'm not convinced
> > that we should add it to the requirements; we'd be unable to test it.
>
> Casting my mind back to when I originally wrote that code, I attempted
> to write it in such a way that it could one day be used for 3-stage
> aggregation, e.g. Parallel Partial Aggregate -> Gather -> Combine
> Serial Aggregate on one node, then on some master node a Deserial
> Combine Finalize Aggregate. You're very right that we can't craft
> such a plan with today's master (we didn't even add a supporting enum
> for it in AggSplit). However, it does appear that there are
> extensions or forks out there which attempt to use the code in this
> way, so it would be good not to leave those people out in the cold
> regarding this.

Greenplum plans split aggregation quite similarly to Postgres: while it doesn't pass partial results through an intra-cluster "Gather" -- it uses a reshuffle-by-hash type operation instead -- Greenplum _does_ split an aggregate into final and partial halves, running them on different nodes. In short, the relationship among the combine, serial, and deserial functions is similar to how they are in Postgres today (serial->deserial->combine) in the context of splitting aggregates.

The current problem arises because Greenplum spills the hash table in hash aggregation (a diff we're working actively to upstream), a process in which we have to touch (read: serialize and copy) the internal trans values. However, we are definitely eyeing what you described as something to move towards.

As a fork, we'd like to carry as thin a diff as possible, so the current situation is pretty much forcing us to diverge in the functions mentioned up-thread. In hindsight, "sloppy" might not have been a wise choice of words -- apologies for the possible offense, David!

> For testing, can't we just have an Assert() in
> advance_transition_function that verifies isnull matches the
> nullability of the return value for INTERNAL-returning transfns? i.e.,
> the attached.
>
> I don't have a test case to hand that could cause this to fail, but it
> sounds like Jesse might.

One easy way to cause this is "sum(x) FILTER (WHERE false)", which will for sure make the partial results NULL. Is that what you're looking for? I'll be happy to send in the SQL.

Cheers,
Jesse
David Rowley <dgrowleyml@gmail.com> writes:
> For testing, can't we just have an Assert() in
> advance_transition_function that verifies isnull matches the
> nullability of the return value for INTERNAL-returning transfns? i.e.,
> the attached.

FTR, I do not like this Assert one bit. nodeAgg.c has NO business inquiring into the contents of internal-type Datums. It has even less business enforcing a particular Datum value for a SQL null --- we have always, throughout the system, considered that if isnull is true then the contents of the Datum are unspecified. I think this is much more likely to cause problems than solve any.

        regards, tom lane
On Wed, 15 Apr 2020 at 03:41, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > For testing, can't we just have an Assert() in
> > advance_transition_function that verifies isnull matches the
> > nullability of the return value for INTERNAL-returning transfns? i.e.,
> > the attached.
>
> FTR, I do not like this Assert one bit. nodeAgg.c has NO business
> inquiring into the contents of internal-type Datums. It has even
> less business enforcing a particular Datum value for a SQL null ---
> we have always, throughout the system, considered that if isnull
> is true then the contents of the Datum are unspecified. I think
> this is much more likely to cause problems than solve any.

OK. The latter case could be ignored by adding an OR condition to the Assert to allow isnull == false cases to pass without any consideration of the Datum value, but it sounds like you don't want to insist that isnull == true returns a NULL pointer.

FWIW, I agree with Jesse that having numeric_combine() return a NULL pointer without properly setting the isnull flag is pretty bad and it should be fixed regardless. Not fixing it, even in the absence of a good way to test it, just seems like we're leaving something around that we're going to trip up on in the future. Serialization functions crashing after receiving input from a combine function seems pretty busted to me, regardless of whether there is a pathway for the functions to be called in that order in core or not. I'm not a fan of leaving it in just because testing for it might not be easy.

One problem with coming up with a way of testing this from the SQL level is that we'll need to pick some aggregate functions that currently have this issue and ensure they don't regress. There's not much we can do to ensure that any new aggregates we might create in the future don't go and break this rule. That's why I thought that the Assert might be more useful.

I don't think it would be impossible to test this using an extension and the create_upper_paths_hook. I see that test_rls_hooks, which runs during make check-world, hooks into the RLS hooks to test some behaviour. I don't think it would be too tricky to have a hook implement a 3-stage aggregate plan with the middle stage doing a deserial/combine/serial before passing to the Finalize Aggregate node. That would allow us to ensure serial functions can accept the results from combine functions, which is something nothing in core currently does.

David
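For illustration, a minimal skeleton of how such a test extension could latch onto create_upper_paths_hook. This is a hypothetical sketch: the module and function names are made up, and the actual path-building that would inject the 3-stage aggregate shape is only indicated by a comment.

    #include "postgres.h"
    #include "fmgr.h"
    #include "nodes/pathnodes.h"
    #include "optimizer/planner.h"

    PG_MODULE_MAGIC;

    void _PG_init(void);

    static create_upper_paths_hook_type prev_create_upper_paths_hook = NULL;

    /* Called for every upper relation the planner builds paths for. */
    static void
    test_agg_split_paths(PlannerInfo *root, UpperRelationKind stage,
                         RelOptInfo *input_rel, RelOptInfo *output_rel,
                         void *extra)
    {
        if (prev_create_upper_paths_hook)
            prev_create_upper_paths_hook(root, stage, input_rel, output_rel, extra);

        if (stage != UPPERREL_GROUP_AGG)
            return;

        /*
         * Here the test module would add paths of the shape
         *   Partial Aggregate -> serialize -> deserialize/combine/serialize
         *   -> Finalize Aggregate
         * so that the serialization functions are fed the output of the
         * combine functions, which no in-core plan shape currently does.
         */
    }

    void
    _PG_init(void)
    {
        prev_create_upper_paths_hook = create_upper_paths_hook;
        create_upper_paths_hook = test_agg_split_paths;
    }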