Thread: Query with limit goes from few ms to hours
On PG 9.1 and 9.2 I'm running the following query:
SELECT
*
FROM
stream_store
JOIN
(
SELECT
UNNEST(stream_store_ids) AS id
FROM
stream_store_version_index
WHERE
stream_id = 607106 AND
version = 11
) AS records USING (id)
ORDER BY
id DESC
This takes several (10 to 20) milliseconds at most.
When I add a LIMIT 1 to the end of the query, the query time goes to several hours(!).
The full version String of PG 9.1 is "PostgreSQL 9.1.5 on x86_64-unknown-linux-gnu, compiled by gcc-4.4.real (Debian 4.4.5-8) 4.4.5, 64-bit". The 9.1 machine is a socket 771 dual quad core at 3.16Ghz with 64GB memory and 10 Intel x25M SSDs in a RAID5 setup on 2 ARECA 1680 RAID controllers. The "stream_store" table has 122 million rows and is partitioned. The array that's being unnested for the join has 27 entries.
Any idea?
Hi,
For some reason the mailinglist software seems to block the email as soon as the planner details are in it, so I pasted those on pastebin.com: http://pastebin.com/T5JTwh5T
Kind regards
henk de wit <henk53602@hotmail.com> writes: > For some reason the mailinglist software seems to block the email as soon as the planner details are in it, so I pastedthose on pastebin.com: http://pastebin.com/T5JTwh5T You need a less horrid estimate for the join size. Possibly an ANALYZE on the parent table (stream_store) would help. regards, tom lane
Hi Henk, On Sun, Oct 14, 2012 at 9:04 AM, henk de wit <henk53602@hotmail.com> wrote: > Hi, > > For some reason the mailinglist software seems to block the email as soon as > the planner details are in it, so I pasted those on pastebin.com: > http://pastebin.com/T5JTwh5T Just an additional data point: for whatever reason your email was placed in my GMail spam folder. Kind regards robert -- remember.guy do |as, often| as.you_can - without end http://blog.rubybestpractices.com/
Hi,
> henk de wit <henk53602@hotmail.com> writes:
> > For some reason the mailinglist software seems to block the email as soon as the planner details are in it, so I pasted those on pastebin.com: http://pastebin.com/T5JTwh5T
>
> You need a less horrid estimate for the join size. Possibly an ANALYZE
> on the parent table (stream_store) would help.
> > For some reason the mailinglist software seems to block the email as soon as the planner details are in it, so I pasted those on pastebin.com: http://pastebin.com/T5JTwh5T
>
> You need a less horrid estimate for the join size. Possibly an ANALYZE
> on the parent table (stream_store) would help.
Well, what do you know! That did work indeed. Immediately after the ANALYZE on that parent table (taking only a few seconds) a fast plan was created and the query executed in ms again. Silly me, I should have tried that earlier.
Thanks!
Kind regards
henk de wit wrote: > Well, what do you know! That did work indeed. Immediately after the > ANALYZE on that parent table (taking only a few seconds) a fast > plan was created and the query executed in ms again. Silly me, I > should have tried that earlier. Of course, if your autovacuum settings are aggressive enough, you should gernerally not need to run ANALYZE explicitly. You should double-check that autovacuum is turned on and configured at least as aggressively as the default settings, or you will probably get little surprises like this when you least expect them. -Kevin
On 10/23/2012 11:33 AM, Kevin Grittner wrote: > henk de wit wrote: > >> Well, what do you know! That did work indeed. Immediately after the >> ANALYZE on that parent table (taking only a few seconds) a fast >> plan was created and the query executed in ms again. Silly me, I >> should have tried that earlier. > Of course, if your autovacuum settings are aggressive enough, you > should gernerally not need to run ANALYZE explicitly. You should > double-check that autovacuum is turned on and configured at least as > aggressively as the default settings, or you will probably get little > surprises like this when you least expect them. > > The exception I'd make to Kevin's good advice is for cases when a process makes substantial statistics-altering changes to your data (bulk insert/delete/update) immediately followed by a query against the updated table(s). In those cases there is a good possibility that the statistics will not have been automatically updated before the subsequent query is planned so an explicit ANALYZE between the update and the query can be of value. Cheers, Steve