Re: In progress INSERT wrecks plans on table - Mailing list pgsql-performance

From mark.kirkwood@catalyst.net.nz
Subject Re: In progress INSERT wrecks plans on table
Date
Msg-id 916bff05a5ab130c76d7835abbaf7762.squirrel@mail.catalyst.net.nz
Whole thread Raw
In response to Re: In progress INSERT wrecks plans on table  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: In progress INSERT wrecks plans on table  (Thomas Kellerer <spam_eater@gmx.net>)
Re: In progress INSERT wrecks plans on table  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-performance
> On 2 May 2013 01:49, Mark Kirkwood <mark.kirkwood@catalyst.net.nz> wrote:
>
> I think we need a problem statement before we attempt a solution,
> which is what Tom is alluding to.
>

Actually no - I think Tom (quite correctly) was saying that the patch was
not a viable solution. With which I agree.

I believe the title of this thread is the problem statement.

> ISTM that you've got a case where the plan is very sensitive to a
> table load. Which is a pretty common situation and one that can be
> solved in various ways. I don't see much that Postgres can do because
> it can't know ahead of time you're about to load rows. We could
> imagine an optimizer that set thresholds on plans that caused the
> whole plan to be recalculated half way thru a run, but that would be a
> lot of work to design and implement and even harder to test. Having
> static plans at least allows us to discuss what it does after the fact
> with some ease.
>
> The plan is set using stats that are set when there are very few
> non-NULL rows, and those increase massively on load. The way to cope
> is to run the ANALYZE immediately after the load and then don't allow
> auto-ANALYZE to reset them later.

No. We do run analyze immediately after the load. The surprise was that
this was not sufficient - the (small) amount of time where non optimal
plans were being used due to the in progress row activity was enough to
cripple the system - that is the problem. The analysis of why not led to
the test case included in the original email. And sure it is deliberately
crafted to display the issue, and is therefore open to criticism for being
artificial. However it was purely meant to make it easy to see what I was
talking about.


Currently we are working around this by coercing one of the predicates in
the query to discourage the attractive looking but dangerous index.

I think the idea of telling postgres that we are doing a load is probably
the wrong way to go about this. We have a framework that tries to
automatically figure out the best plans...I think some more thought about
how to make that understand some of the more subtle triggers for a
time-to-do-new-plans moment is the way to go. I understand this is
probably hard - and may imply some radical surgery to how the stats
collector and planner interact.

Regards

Mark




pgsql-performance by date:

Previous
From: Claudio Freire
Date:
Subject: Re: "WHERE 1 = 2 OR ..." makes planner choose a very inefficient plan
Next
From: Thomas Kellerer
Date:
Subject: Re: In progress INSERT wrecks plans on table