Thread: Ordered Append WIP patch v1

Ordered Append WIP patch v1

From
Gregory Stark
Date:
Here's the WIP patch I described on -hackers to implemented "ordered" append
nodes.


--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

Attachment

Re: Ordered Append WIP patch v1

From
Heikki Linnakangas
Date:
Gregory Stark wrote:
> Here's the WIP patch I described on -hackers to implemented "ordered" append
> nodes.

I think it would be better to create completely separate executor node
for this, rather than tack this feature on the regular Append node. The
two don't seem to have much in common. Looking at AppendState, for
example, as_firstplan and as_lastplan are not interesting for the
MergeAppend, and likewise MergeAppend needs a whole bunch of fields not
interesting to regular Append. The fact that the first statement in
ExecAppend is "if(!node->as_is_ordered)", with zero lines of shared
codes between them, also suggests that it would be a good idea to keep
them separate.

As you point out in the comments, it's quite confusing to have indexes
into the slots array and the heap array. I still can't get my head
around that. Would it help to define a struct along the lines of:

struct {
   TupleTableSlot *slot;
   PlanState *plan;
};

and store pointers to those in the heap?

> 1) I still haven't completely figured out what to do with equivalence classes.
>    My hack of just stuffing all the append subrel vars into there seems to
>    work fine. I need to understand what's going on to see if there's really a
>    problem with it or not.

I don't understand that well enough to comment... I presume the FIXME in
the patch is related to this?

> 4) I haven't handled mark/restore or random access. I think they could be
>    handled and they'll probably be worth the complexity but I'm not sure.

For Mark/restore, I suppose you would mark/restore all subnodes, and
store/restore the heap. For reversing direction, you would reverse the
heap. Doesn't sound too hard, but it's probably better to make it work
without them at first before adding any bells and whistles...

> 5) Is it considering too many paths? Are there some which maybe aren't worth
>    considering? For example, maybe it only makes sense to take best start_cost
>    paths since if that plan doesn't dominate then the best total_cost plan is
>    likely to be the sequential scans + unordered append + sort.

I don't understand this paragraph. Are you referring to this comment in
the patch:

> /* If we can't find any plans with the right order just add the
>  * cheapest total plan to both paths, we'll have to sort it
>  * anyways. Perhaps consider not generating a startup path for such
>  * pathkeys since it'll be unlikely to be the cheapest startup
>  * cost. */

Having to sort one or two of the sub-plans might not be to be expensive
at all. For example, having to sort the empty parent relation of a
partitioned table is essentially free. It's probably never cheaper to
use the Merge Append if you need to sort *all* of the children, but if
you can avoid sorting even one of them, it might very well be worthwhile.

> 6) I haven't looked at setops yet but this is particularly attractive for the
>    UNION (as opposed to UNION ALL) case.

Yes. And DISTINCT.

> 7) I copied/adapted a bunch of bits from tuplesort to maintain the heap and do
>    the scankey comparisons. I could refactor that code back into tuplesort but
>    it would mean twisting around tuplesort quite a bit. Essentially it would
>    mean introducing a new type of tuplesort which would start off in
>    FINAL_MERGE state only it would have to behave differently since we don't
>    want it prefetching lots of records like FINAL_MERGE does, I don't think.

Doesn't seem worthwhile. The heap management part of the patch is quite
short.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Ordered Append WIP patch v1

From
Alvaro Herrera
Date:
Gregory Stark wrote:
>
> Here's the WIP patch I described on -hackers to implemented "ordered" append
> nodes.

Did you ever publish an updated version of this patch?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Ordered Append WIP patch v1

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> Gregory Stark wrote:
> >
> > Here's the WIP patch I described on -hackers to implemented "ordered" append
> > nodes.
>
> Did you ever publish an updated version of this patch?

I don't think so.  I think we just need to tell Greg if he should
continue in this direction.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Ordered Append WIP patch v1

From
Gregory Stark
Date:
"Bruce Momjian" <bruce@momjian.us> writes:

> Alvaro Herrera wrote:
>> Gregory Stark wrote:
>> >
>> > Here's the WIP patch I described on -hackers to implemented "ordered" append
>> > nodes.
>>
>> Did you ever publish an updated version of this patch?

No, it's been kind of on the back burner.

> I don't think so.  I think we just need to tell Greg if he should
> continue in this direction.

I think the executor side of things is pretty straightforward. Where I'm
really uncertain is on the planner side of things.

To be honest I didn't follow at all what Tom was saying to do with the
equivalence classes. What it's doing now is basically just lying and saying
the child columns are equivalent to the parent columns -- I'm not sure what
the consequences of that are. Tom seemed to think that would be bad but I
don't see any real problems with it.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!