Re: [PATCH] Equivalence Class Filters - Mailing list pgsql-hackers
From: David Rowley
Subject: Re: [PATCH] Equivalence Class Filters
Msg-id: CAKJS1f8RJLCM44+WZP_YdOrg3v2kusaMbb-Bn8AFYuk-9oePOg@mail.gmail.com
In response to: Re: [PATCH] Equivalence Class Filters (Jim Nasby <Jim.Nasby@BlueTreble.com>)
List: pgsql-hackers
On 8 December 2015 at 04:35, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 12/6/15 10:38 AM, Tom Lane wrote:
>> I said "in most cases". You can find example cases to support almost any
>> weird planner optimization no matter how expensive and single-purpose;
>> but that is the wrong way to think about it. What you have to think about
>> is average cases, and in particular, not putting a drag on planning time
>> in cases where no benefit ensues. We're not committing any patches that
>> give one uncommon case an 1100X speedup by penalizing every other query 10%,
>> or even 1%; especially not when there may be other ways to fix it.
>
> This is a problem that seriously hurts Postgres in data warehousing
> applications. We can't keep ignoring optimizations that provide even as
> little as 10% execution improvements for 10x worse planner performance,
> because in a warehouse it's next to impossible for planning time to matter.
>
> Obviously it'd be great if there was a fast, easy way to figure out whether
> a query would be expensive enough to go the whole 9 yards on planning it,
> but at this point I suspect a simple GUC would be a big improvement.
I've certainly been here before [1], but the idea fell on deaf ears.
The biggest frustration for me is that Tom always seems to argue his point as if planning time were roughly as expensive as, or more expensive than, execution time. Likely that's true in many cases, but I would imagine that in more typical cases execution time is the longer of the two, although I've never had the guts to stand up and argue this, as I don't have any stats to back me up.
I was talking to Thomas Munro yesterday about this, and he mentioned that it would be quite nice to have some stats on how much time is spent in the planner vs. the executor. He came up with the idea of adding a column to pg_stat_statements to record the planning time.
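To make that concrete, the sort of thing I have in mind is a straw-man query like the one below; the plan_time column is purely hypothetical, sitting alongside the existing total_time column:

    -- Straw-man only: "plan_time" is a hypothetical new column; "total_time"
    -- already exists in pg_stat_statements. Per statement, this would show
    -- how much of the recorded time went into planning.
    SELECT query,
           plan_time,
           total_time - plan_time AS exec_time,
           round((plan_time / nullif(total_time, 0) * 100)::numeric, 2) AS plan_pct
    FROM pg_stat_statements
    ORDER BY total_time DESC
    LIMIT 20;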
If we could get real statistics from real-world cases and discovered, for example, that on average the total CPU time spent planning was only 1% of execution time, then adding 2% overhead to the planner would be much less of a worry, as that would amount to just 2% of 1% (0.02%) overall. Such information, fed back into the community, might help drive us in the right direction when it comes to deciding what needs to be done to resolve this recurring issue with accepting planner improvement patches.
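In aggregate, that same hypothetical column would answer the workload-wide question directly:

    -- Again assuming the hypothetical "plan_time" column: the planner's
    -- overall share of time across everything pg_stat_statements has tracked.
    SELECT round((sum(plan_time) / nullif(sum(total_time), 0) * 100)::numeric, 2)
             AS planner_share_pct
    FROM pg_stat_statements;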
I believe that with parallel query on the horizon for 9.6, we're now aiming to support bigger OLAP-type databases than ever before. If we ignore patches like this one, then it appears we have some conflicting goals in the community: we're willing to add the brawn, but not the brain. If that's the case then it's a shame, as I think we can have both. And I very much agree that we must also find a way to maintain support for, and the high performance of, small OLTP databases.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services