Thread: Status of DISTINCT-by-hashing work

Status of DISTINCT-by-hashing work

From
Tom Lane
Date:
I've pretty much finished the project I got a bee in my bonnet about
last week, which is to teach SELECT DISTINCT how to (optionally) use
hashing for grouping in the same way that GROUP BY has been able to do
for awhile.

There are still two places in the system that hard-wire the use of
sorting for duplicate elimination:

* Set operations (UNION/INTERSECT/EXCEPT)

* Aggregate functions with DISTINCT

I'm thinking of trying to fix set operations before I leave this topic,
but I'm not sure it's worth the trouble to change DISTINCT aggregates.
They'd be a lot more work (since there's no executor infrastructure
in place that could be used) and the return on investment seems low.

Comments?
        regards, tom lane


Re: Status of DISTINCT-by-hashing work

From
"Asko Oja"
Date:
<div dir="ltr">Sounds very much like 80% 20% story. 80% that was easy to do is done and now 20% that is complex and
progressis slow is left to be done. Sounds very familiar from the comment in plan cache invalidation :)<br /><br /><div
class="gmail_quote">OnTue, Aug 5, 2008 at 5:51 PM, Tom Lane <span dir="ltr"><<a
href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>></span>wrote:<br /><blockquote class="gmail_quote"
style="border-left:1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> I've pretty much
finishedthe project I got a bee in my bonnet about<br /> last week, which is to teach SELECT DISTINCT how to
(optionally)use<br /> hashing for grouping in the same way that GROUP BY has been able to do<br /> for awhile.<br /><br
/>There are still two places in the system that hard-wire the use of<br /> sorting for duplicate elimination:<br /><br
/>* Set operations (UNION/INTERSECT/EXCEPT)<br /><br /> * Aggregate functions with DISTINCT<br /><br /> I'm thinking of
tryingto fix set operations before I leave this topic,<br /> but I'm not sure it's worth the trouble to change DISTINCT
aggregates.<br/> They'd be a lot more work (since there's no executor infrastructure<br /> in place that could be used)
andthe return on investment seems low.<br /><br /> Comments?<br /><br />                        regards, tom lane<br
/><fontcolor="#888888"><br /> --<br /> Sent via pgsql-hackers mailing list (<a
href="mailto:pgsql-hackers@postgresql.org">pgsql-hackers@postgresql.org</a>)<br/> To make changes to your
subscription:<br/><a href="http://www.postgresql.org/mailpref/pgsql-hackers"
target="_blank">http://www.postgresql.org/mailpref/pgsql-hackers</a><br/></font></blockquote></div><br /></div> 

Re: Status of DISTINCT-by-hashing work

From
Gregory Stark
Date:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> I've pretty much finished the project I got a bee in my bonnet about
> last week, which is to teach SELECT DISTINCT how to (optionally) use
> hashing for grouping in the same way that GROUP BY has been able to do
> for awhile.
>
> There are still two places in the system that hard-wire the use of
> sorting for duplicate elimination:
>
> * Set operations (UNION/INTERSECT/EXCEPT)

Egads. Are you thinking to reimplement them more in line with the way other
nodes work? Or just have them choose between hashing and sorting themselves?

> * Aggregate functions with DISTINCT
>
> I'm thinking of trying to fix set operations before I leave this topic,
> but I'm not sure it's worth the trouble to change DISTINCT aggregates.
> They'd be a lot more work (since there's no executor infrastructure
> in place that could be used) and the return on investment seems low.
>
> Comments?

I recall being quite mystified by how distinct aggregates work when the sort
didn't appear anywhere in EXPLAIN output. If we could manage to expose that
info in the plan somehow it would be a great improvement even if we didn't
actually improve the plans available.

Any idea what would the needed executor infrastructure look like? Would it
have anything in common with the OLAP window functions infrastructure?

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about
EnterpriseDB'sPostgreSQL training!
 


Re: Status of DISTINCT-by-hashing work

From
Tom Lane
Date:
Gregory Stark <stark@enterprisedb.com> writes:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> There are still two places in the system that hard-wire the use of
>> sorting for duplicate elimination:
>> 
>> * Set operations (UNION/INTERSECT/EXCEPT)

> Egads. Are you thinking to reimplement them more in line with the way other
> nodes work? Or just have them choose between hashing and sorting themselves?

Well, actually, after looking closer I'm realizing that it's harder than
I thought.  I had been thinking that we could just have the planner
choose whether to generate grouping instead of sorting nodes, but that
only works for plain UNION.  For INTERSECT/EXCEPT (with or without ALL),
you really need to maintain counters in each hashtable entry so you know
how many matching rows you got from each side of the set operation.
So it'd be necessary to either duplicate a large chunk of nodeAgg.c, or
make that code handle hashed INTERSECT/EXCEPT along with all its
existing duties.  Neither of which seems particularly appealing :-(.
I'm going to look at whether nodeAgg can be refactored to avoid this,
but I'm feeling a bit discouraged about it at the moment.

> I recall being quite mystified by how distinct aggregates work when the sort
> didn't appear anywhere in EXPLAIN output. If we could manage to expose that
> info in the plan somehow it would be a great improvement even if we didn't
> actually improve the plans available.

The problem is that each DISTINCT aggregate needs its own sort (or
hash), which doesn't seem to fit into our plan tree structure.

> Any idea what would the needed executor infrastructure look like? Would it
> have anything in common with the OLAP window functions infrastructure?

Possibly; I haven't paid much attention to the OLAP work yet.
        regards, tom lane


Re: Status of DISTINCT-by-hashing work

From
"Hitoshi Harada"
Date:
2008/8/6 Tom Lane <tgl@sss.pgh.pa.us>:
> Gregory Stark <stark@enterprisedb.com> writes:
>> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>>> There are still two places in the system that hard-wire the use of
>>> sorting for duplicate elimination:
>>>
>>> * Set operations (UNION/INTERSECT/EXCEPT)
>
>> Egads. Are you thinking to reimplement them more in line with the way other
>> nodes work? Or just have them choose between hashing and sorting themselves?
>
> Well, actually, after looking closer I'm realizing that it's harder than
> I thought.  I had been thinking that we could just have the planner
> choose whether to generate grouping instead of sorting nodes, but that
> only works for plain UNION.  For INTERSECT/EXCEPT (with or without ALL),
> you really need to maintain counters in each hashtable entry so you know
> how many matching rows you got from each side of the set operation.
> So it'd be necessary to either duplicate a large chunk of nodeAgg.c, or
> make that code handle hashed INTERSECT/EXCEPT along with all its
> existing duties.  Neither of which seems particularly appealing :-(.
> I'm going to look at whether nodeAgg can be refactored to avoid this,
> but I'm feeling a bit discouraged about it at the moment.

In working on window functions, I also found that nodeWindow.c
duplicates much of nodeAgg.c, which contains not only aggregates but
reading ahead until next group.

Additionally, not having implemented but planned, frame concept that
slides aggregates within a partition will require multiple saved
positions of tuplestore. Up to now Tuplestore has functionality to
mark/restore pos but it is only one chance, which means when you mark
a pos the previous pos cannot be restore anymore. The window frame
will need to do mark multiple times and to restore older ones.

>> Any idea what would the needed executor infrastructure look like? Would it
>> have anything in common with the OLAP window functions infrastructure?
>
> Possibly; I haven't paid much attention to the OLAP work yet.
>
>                        regards, tom lane

In my patch nodeWindow.c, some functions reach for its parent state
node to get info of sort keys by using fcinfo->context. This works but
is completely ugly. At least, window functions need ability to reach
for key or whole tuple of current or of offset (preceding/following)
row. If another feature like DISTINCT needs similar one, I am
encouraged to give more opinion.

Regards,

-- 
Hitoshi Harada


Re: Status of DISTINCT-by-hashing work

From
Hans-Juergen Schoenig
Date:
Tom Lane wrote:
> I've pretty much finished the project I got a bee in my bonnet about
> last week, which is to teach SELECT DISTINCT how to (optionally) use
> hashing for grouping in the same way that GROUP BY has been able to do
> for awhile.
>
> There are still two places in the system that hard-wire the use of
> sorting for duplicate elimination:
>
> * Set operations (UNION/INTERSECT/EXCEPT)
>
> * Aggregate functions with DISTINCT
>
> I'm thinking of trying to fix set operations before I leave this topic,
> but I'm not sure it's worth the trouble to change DISTINCT aggregates.
> They'd be a lot more work (since there's no executor infrastructure
> in place that could be used) and the return on investment seems low.
>
> Comments?
>
>             regards, tom lane
>
>   

i feel it exactly the same way.
DISTINCT has been a place people wanted to see fixed for a while but set 
operations are nothing I would really worry about.
what we have now is absolutely fine.

given the list of more important issues, i'd vote for something else.
   best regards,
      hans

-- 
Cybertec Schönig & Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com



Re: Status of DISTINCT-by-hashing work

From
Tom Lane
Date:
"Hitoshi Harada" <umi.tanuki@gmail.com> writes:
> In my patch nodeWindow.c, some functions reach for its parent state
> node to get info of sort keys by using fcinfo->context. This works but
> is completely ugly.

Isn't that the same thing nodeAgg does: pass its AggState to aggregate
functions?  I don't see anything ugly about it --- at least, any
alternative you care to name is likely to be far worse.
        regards, tom lane


Re: Status of DISTINCT-by-hashing work

From
"Hitoshi Harada"
Date:
2008/8/6 Tom Lane <tgl@sss.pgh.pa.us>:
> "Hitoshi Harada" <umi.tanuki@gmail.com> writes:
>> In my patch nodeWindow.c, some functions reach for its parent state
>> node to get info of sort keys by using fcinfo->context. This works but
>> is completely ugly.
>
> Isn't that the same thing nodeAgg does: pass its AggState to aggregate
> functions?  I don't see anything ugly about it --- at least, any
> alternative you care to name is likely to be far worse.

Correct. I learned that method from nodeAgg. What I meant was that
window functions reach for plan node and its sort keys through
winstate->ss.ps. The mechanism is not ugly but some macros or exposed
API seem more comfortable, especially if thinking about formulating
window functions as user-defined functions.

Regards,


-- 
Hitoshi Harada


Re: Status of DISTINCT-by-hashing work

From
Tom Lane
Date:
"Hitoshi Harada" <umi.tanuki@gmail.com> writes:
> Correct. I learned that method from nodeAgg. What I meant was that
> window functions reach for plan node and its sort keys through
> winstate->ss.ps. The mechanism is not ugly but some macros or exposed
> API seem more comfortable, especially if thinking about formulating
> window functions as user-defined functions.

Agreed, I was about to suggest the same thing.  Have the window
functions pass the WindowState to some interface code that gets what
they need for them.
        regards, tom lane


Re: Status of DISTINCT-by-hashing work

From
Tom Lane
Date:
I wrote:
> ...  For INTERSECT/EXCEPT (with or without ALL),
> you really need to maintain counters in each hashtable entry so you know
> how many matching rows you got from each side of the set operation.
> So it'd be necessary to either duplicate a large chunk of nodeAgg.c, or
> make that code handle hashed INTERSECT/EXCEPT along with all its
> existing duties.  Neither of which seems particularly appealing :-(.
> I'm going to look at whether nodeAgg can be refactored to avoid this,
> but I'm feeling a bit discouraged about it at the moment.

Actually, it seems that most of what could be shared has already been
factored out into execGrouping.c.  I find that supporting hashing in
nodeSetOp.c will only roughly double its size (from 318 to 650 lines).
Although nodeAgg.c is about 1700 lines, most of its bulk comes from
managing the aggregate transition values and function calls.  There
might be some scope to save a few lines by refactoring, but it doesn't
look like it's worth the trouble.

The attached WIP patch compiles, but I've not tested it yet for lack
of planner support.  If some of the code looks suspiciously like
nodeAgg.c, it's because I started from nodeAgg and just stripped
everything that wasn't needed ...

If there are no objections, I'll push forward with persuading
the planner to support hashable set operations.

            regards, tom lane


Attachment