Re: [HACKERS] Slow count(*) again... - Mailing list pgsql-performance

From david@lang.hm
Subject Re: [HACKERS] Slow count(*) again...
Msg-id alpine.DEB.2.00.1102030208440.8162@asgard.lang.hm
In response to Re: [HACKERS] Slow count(*) again...  (Vitalii Tymchyshyn <tivv00@gmail.com>)
Responses Re: [HACKERS] Slow count(*) again...  (Kenneth Marshall <ktm@rice.edu>)
Re: [HACKERS] Slow count(*) again...  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-performance
On Thu, 3 Feb 2011, Vitalii Tymchyshyn wrote:

> 02.02.11 20:32, Robert Haas wrote:
>>
>> Yeah.  Any kind of bulk load into an empty table can be a problem,
>> even if it's not temporary.  When you load a bunch of data and then
>> immediately plan a query against it, autoanalyze hasn't had a chance
>> to do its thing yet, so sometimes you get a lousy plan.
>
> Maybe introducing something like an 'AutoAnalyze' threshold would help? I mean
> that any insert/update/delete statement that changes more than x% of a table
> (and no fewer than y records) must run analyze right after it finishes.
> Defaults like x=50 y=10000 seem quite good to me.
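
For concreteness, the rule proposed above could be sketched roughly as
follows. This is only an illustration in Python, not anything PostgreSQL
implements: the function name and parameters are made up, and it assumes
the percentage is measured against the row count before the statement ran,
so a load into an empty table always exceeds x%.

```python
def should_analyze(rows_changed: int, rows_before: int,
                   pct_threshold: float = 50.0,
                   min_rows: int = 10_000) -> bool:
    """Decide whether a statement changed enough rows to analyze right away.

    Hypothetical helper: rows_before is the assumed table size before the
    statement; pct_threshold and min_rows are the x and y from the proposal.
    """
    # Rule 1: never trigger for small changes, whatever the percentage.
    if rows_changed < min_rows:
        return False
    # Rule 2: an empty table has no baseline, so a large load always counts.
    if rows_before <= 0:
        return True
    # Rule 3: trigger only when strictly more than x% of the table changed.
    return rows_changed * 100.0 / rows_before > pct_threshold
```

With the suggested defaults, loading 20000 rows into an empty table would
trigger an analyze, while changing 20000 rows of a 100000-row table (20%)
would not.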

If I understand things correctly, a full Analyze goes over all the data in
the table to figure out patterns.

If this is the case, wouldn't it make sense in the situation where you are
loading an entire table from scratch to run the Analyze as you are
processing the data? If you don't want to slow down the main thread that's
inserting the data, you could copy the data to a second thread and do the
analysis while it's still in RAM rather than having to read it off of disk
afterwards.
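
To make the idea a bit more concrete, here is a rough sketch in Python of
collecting statistics input during the load itself. It is not how
PostgreSQL works internally; it only shows that a fixed-size random sample
(classic reservoir sampling) can be built from rows as they stream past, so
nothing has to be read back from disk. The names ReservoirSampler and
load_rows are invented for the example, and for simplicity the sampling
runs in the loading thread rather than a second one.

```python
import random


class ReservoirSampler:
    """Keep a uniform random sample of at most `capacity` rows from a stream."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.sample = []      # rows currently held in the reservoir
        self.seen = 0         # total rows observed so far
        self._rng = random.Random(seed)

    def observe(self, row):
        self.seen += 1
        if len(self.sample) < self.capacity:
            # Reservoir not yet full: keep every row.
            self.sample.append(row)
        else:
            # Replace a random slot with probability capacity / seen.
            j = self._rng.randrange(self.seen)
            if j < self.capacity:
                self.sample[j] = row


def load_rows(rows, insert, sampler):
    """Insert each row and feed it to the sampler while it is still in memory."""
    for row in rows:
        insert(row)           # `insert` stands in for the real table write
        sampler.observe(row)


if __name__ == "__main__":
    table = []
    sampler = ReservoirSampler(capacity=100, seed=1)
    load_rows(range(10_000), table.append, sampler)
    print(len(table), sampler.seen, len(sampler.sample))
```

When the load finishes, sampler.sample holds the rows a statistics pass
could work from, and sampler.seen gives the row count for free.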

This doesn't make sense for updates to existing databases, but the use
case of loading a bunch of data and then querying it right away isn't
_that_ uncommon.

David Lang
