Thread: Query with limit goes from few ms to hours

Query with limit goes from few ms to hours

From
henk de wit
Date:
On PG 9.1 and 9.2 I'm running the following query:

SELECT 
    *
FROM 
    stream_store 
JOIN
    (
        SELECT 
            UNNEST(stream_store_ids) AS id
        FROM 
            stream_store_version_index 
        WHERE 
            stream_id = 607106 AND
            version = 11
    ) AS records USING (id)
ORDER BY 
    id DESC

This takes several (10 to 20) milliseconds at most.

When I add a LIMIT 1 to the end of the query, the query time goes to several hours(!).

The full version String of PG 9.1 is "PostgreSQL 9.1.5 on x86_64-unknown-linux-gnu, compiled by gcc-4.4.real (Debian 4.4.5-8) 4.4.5, 64-bit". The 9.1 machine is a socket 771 dual quad core at 3.16Ghz with 64GB memory and 10 Intel x25M SSDs in a RAID5 setup on 2 ARECA 1680 RAID controllers. The "stream_store" table has 122 million rows and is partitioned. The array that's being unnested for the join has 27 entries.

Any idea?

Re: Query with limit goes from few ms to hours

From
henk de wit
Date:
Hi,

For some reason the mailinglist software seems to block the email as soon as the planner details are in it, so I pasted those on pastebin.com: http://pastebin.com/T5JTwh5T

Kind regards

Re: Query with limit goes from few ms to hours

From
Tom Lane
Date:
henk de wit <henk53602@hotmail.com> writes:
> For some reason the mailinglist software seems to block the email as soon as the planner details are in it, so I
pastedthose on pastebin.com: http://pastebin.com/T5JTwh5T 

You need a less horrid estimate for the join size.  Possibly an ANALYZE
on the parent table (stream_store) would help.

            regards, tom lane


Re: Query with limit goes from few ms to hours

From
Robert Klemme
Date:
Hi Henk,

On Sun, Oct 14, 2012 at 9:04 AM, henk de wit <henk53602@hotmail.com> wrote:
> Hi,
>
> For some reason the mailinglist software seems to block the email as soon as
> the planner details are in it, so I pasted those on pastebin.com:
> http://pastebin.com/T5JTwh5T

Just an additional data point: for whatever reason your email was
placed in my GMail spam folder.

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/


Re: Query with limit goes from few ms to hours

From
henk de wit
Date:
Hi,

> henk de wit <henk53602@hotmail.com> writes:
> > For some reason the mailinglist software seems to block the email as soon as the planner details are in it, so I pasted those on pastebin.com: http://pastebin.com/T5JTwh5T
>
> You need a less horrid estimate for the join size. Possibly an ANALYZE
> on the parent table (stream_store) would help.

Well, what do you know! That did work indeed. Immediately after the ANALYZE on that parent table (taking only a few seconds) a fast plan was created and the query executed in ms again. Silly me, I should have tried that earlier.

Thanks!

Kind regards

Re: Query with limit goes from few ms to hours

From
"Kevin Grittner"
Date:
henk de wit wrote:

> Well, what do you know! That did work indeed. Immediately after the
> ANALYZE on that parent table (taking only a few seconds) a fast
> plan was created and the query executed in ms again. Silly me, I
> should have tried that earlier.

Of course, if your autovacuum settings are aggressive enough, you
should gernerally not need to run ANALYZE explicitly. You should
double-check that autovacuum is turned on and configured at least as
aggressively as the default settings, or you will probably get little
surprises like this when you least expect them.

-Kevin


Re: Query with limit goes from few ms to hours

From
Steve Crawford
Date:
On 10/23/2012 11:33 AM, Kevin Grittner wrote:
> henk de wit wrote:
>
>> Well, what do you know! That did work indeed. Immediately after the
>> ANALYZE on that parent table (taking only a few seconds) a fast
>> plan was created and the query executed in ms again. Silly me, I
>> should have tried that earlier.
> Of course, if your autovacuum settings are aggressive enough, you
> should gernerally not need to run ANALYZE explicitly. You should
> double-check that autovacuum is turned on and configured at least as
> aggressively as the default settings, or you will probably get little
> surprises like this when you least expect them.
>
>
The exception I'd make to Kevin's good advice is for cases when a
process makes substantial statistics-altering changes to your data (bulk
insert/delete/update) immediately followed by a query against the
updated table(s). In those cases there is a good possibility that the
statistics will not have been automatically updated before the
subsequent query is planned so an explicit ANALYZE between the update
and the query can be of value.

Cheers,
Steve