On Tue, 2013-05-07 at 18:32 +1200, Mark Kirkwood wrote:
> On 07/05/13 18:10, Simon Riggs wrote:
> > On 7 May 2013 01:23, <mark.kirkwood@catalyst.net.nz> wrote:
> >
> >> I'm thinking that a variant of (2) might be simpler to implement:
> >>
> >> (I think Matt C essentially beat me to this suggestion - he originally
> >> discovered this issue). It is probably good enough for only *new* plans to
> >> react to the increased/increasing number of in progress rows. So this
> >> would require backends doing significant numbers of row changes to either
> >> directly update pg_statistic or report their in progress numbers to the
> >> stats collector. The key change here is the partial execution numbers
> >> would need to be sent. Clearly one would need to avoid doing this too
> >> often (!) - possibly only when the number of changed rows exceeds an
> >> autovacuum_analyze_scale_factor proportion of the relation concerned, or
> >> similar.
> >
> > Are you loading using COPY? Why not break down the load into chunks?
> >
>
> INSERT - but we could maybe work around it by chunking the INSERT. However,
> that *really* breaks the idea that in SQL you just say what you want,
> not how the database engine should do it! And, more practically, it means
> that the most obvious and clear way to add your new data has nasty side
> effects, and you have to tiptoe around muttering secret incantations to
> make things work well :-)
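
For what it's worth, a rough sketch of what that chunked workaround might
look like (purely illustrative - target_table, staging_table and id are
placeholder names), with an ANALYZE between batches so that new plans pick
up the extra rows:

-- Purely illustrative: batch the load and refresh stats between batches.
-- target_table, staging_table and id are placeholder names.
INSERT INTO target_table
SELECT * FROM staging_table WHERE id BETWEEN 1 AND 100000;

ANALYZE target_table;   -- make the new rows visible to the planner

INSERT INTO target_table
SELECT * FROM staging_table WHERE id BETWEEN 100001 AND 200000;

ANALYZE target_table;
-- ... and so on for the remaining batches
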
We also hit the same problem with an UPDATE that altered the data
distribution in such a way that trivial but frequently executed queries
caused massive server load until auto-analyze sorted out the stats.
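
As an illustration of the thresholds involved (a sketch only - 'mytable' is
a placeholder, and the cumulative counters in pg_stat_user_tables only
approximate what autovacuum actually tracks since the last analyze),
something like this shows how far a table is from the auto-analyze trigger
point of autovacuum_analyze_threshold + autovacuum_analyze_scale_factor *
reltuples:

SELECT s.relname,
       s.n_tup_ins + s.n_tup_upd + s.n_tup_del AS rows_changed,
       current_setting('autovacuum_analyze_threshold')::int
         + current_setting('autovacuum_analyze_scale_factor')::float8
           * c.reltuples AS analyze_trigger_point,
       s.last_analyze,
       s.last_autoanalyze
FROM pg_stat_user_tables s
JOIN pg_class c ON c.oid = s.relid
WHERE s.relname = 'mytable';   -- placeholder table name

Until something like the proposed mid-statement reporting exists, an
explicit ANALYZE straight after the bulk change is the obvious stopgap.
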
--
Matt Clarkson
Catalyst.Net Limited