Re: WITHIN GROUP patch - Mailing list pgsql-hackers

From Tom Lane
Subject Re: WITHIN GROUP patch
Date
Msg-id 2945.1386123266@sss.pgh.pa.us
Whole thread Raw
In response to Re: WITHIN GROUP patch  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Responses Re: WITHIN GROUP patch
List pgsql-hackers
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>  Tom> 1. I really hate the way you've overloaded the transvalue to do
>  Tom> something that has approximately nothing to do with transition
>  Tom> state (and haven't updated catalogs.sgml to explain that,
>  Tom> either).  Seems like it'd be cleaner to just hardwire a bool
>  Tom> column that distinguishes regular and hypothetical input rows.

> The intention here is that while the provided functions all fit the
> spec's idea of how inverse distribution or hypothetical set functions
> work, the actual implementation mechanisms are more generally
> applicable than that and a user-supplied function could well find
> something else useful to do with them. Accordingly, hardcoding stuff
> seemed inappropriate.

>  Tom> And why do you need aggtranssortop for that?  I fail to see the
>  Tom> point of sorting on the flag column.

> It is convenient to the implementation to be able to rely on
> encountering the hypothetical row deterministically before (or in some
> cases after, as in cume_dist) its peers in the remainder of the sort
> order. Removing that sort would make the results of the functions
> incorrect.

Well, okay, but you've not said anything that wouldn't be handled just as
well by some logic that adds a fixed integer-constant-zero flag column to
the rows going into the tuplesort.  The function-specific code could then
inject hypothetical row(s) with other values, eg 1 or -1, to get the
results you describe.  If there's any flexibility improvement from
allowing a different constant value than zero, or a different datatype
than integer, I don't see what it'd be.

On the other side of the coin, repurposing the transition-value catalog
infrastructure to do this totally different thing is confusing.  And what
if someday somebody has a use for a regular transition value along with
this stuff?  What you've done is unorthogonal for no very good reason.

(Actually, I'm wondering whether it wouldn't be better if the tuplesort
object *were* the transition value, and were managed primarily by the
aggregate function's code; in particular expecting the agg's transition
function to insert rows into the tuplesort object.  We could provide
helper functions to largely negate any duplication-of-code objections,
I'd think.  In this view the WITHIN GROUP columns aren't so different from
regular aggregate arguments, but there'd need to be some way for the
aggregate function to get hold of the sorting-semantics decoration on
them.)

>  Tom> 2 I also don't care for the definition of aggordnargs, which is
>  Tom> the number of direct arguments to an ordered set function,
>  Tom> except when it isn't.  Rather than overloading it to be both a
>  Tom> count and a couple of flags, I wonder whether we shouldn't
>  Tom> expand aggisordsetfunc to be a three-way "aggregate kind" field
>  Tom> (ordinary agg, ordered set, or hypothetical set function).

> It would still be overloaded in some sense because a non-hypothetical
> ordered set function could still take an arbitrary number of args
> (using variadic "any") - there aren't any provided, but there's no
> good reason to disallow user-defined functions doing that - so you'd
> still need a special value like -1 for aggordnargs to handle that.

Sure.  But a -1 to indicate "not applicable" doesn't seem like it's
too much of a stretch.  It's the -2 business that's bothering me.
Again, that seems unnecessarily non-orthogonal --- who's to say which
functions would want to constrain the number of direct arguments and
which wouldn't?  (I wonder whether having this info in the catalogs
isn't the wrong thing anyhow, as opposed to expecting the functions
themselves to check the argument count at runtime.)
        regards, tom lane



pgsql-hackers by date:

Previous
From: KONDO Mitsumasa
Date:
Subject: Re: Time-Delayed Standbys
Next
From: KONDO Mitsumasa
Date:
Subject: Re: Time-Delayed Standbys