Re: subselect requires offset 0 for good performance. - Mailing list pgsql-performance

From Scott Marlowe
Subject Re: subselect requires offset 0 for good performance.
Date
Msg-id CAOR=d=1noLyUHs9SZ6VqABK3VF6ECErbeLGJe0WfWZ_zDe7LKg@mail.gmail.com
Whole thread Raw
In response to Re: subselect requires offset 0 for good performance.  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: subselect requires offset 0 for good performance.  (Scott Marlowe <scott.marlowe@gmail.com>)
List pgsql-performance
On Fri, Aug 2, 2013 at 1:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Scott Marlowe <scott.marlowe@gmail.com> writes:
>> I extracted all the data like so:
>> select * into dba.pp_test_wide from original table;
>> and get this query plan from explain analyze:
>> http://explain.depesz.com/s/EPx which takes 20 minutes to run.
>> If I extract it this way:
>> select tree_sortkey, product_name, deleted_at into db.pp_test_3col
>> from original table;
>> I get this plan: http://explain.depesz.com/s/gru which gets a
>> materialize in it, and suddenly takes 106 ms.
>
> There's no reason why suppressing some unrelated columns would change the
> rowcount estimates, but those two plans show different rowcount estimates.
>
> I suspect the *actual* reason for the plan change was that autovacuum had
> had a chance to update statistics for the one table, and not yet for the
> other.  Please do a manual ANALYZE on both tables and see if there's
> still a plan difference.

Interesting. I ran analyze on both tables and sure enough the new test
table runs fast. Ran analyze on the old table and it runs slow. The
only thing the old table and its plan are missing is the materialize.
So what is likely to change from the old table to the new one? Here's
the explain analyze output from the old table and the same query
against it: http://explain.depesz.com/s/CtZ and here's the plan with
offset 0 in it: http://explain.depesz.com/s/Gug note that while the
estimates are a bit off, the really huge difference here says to me
some suboptimal method is getting deployed in the background
somewhere. Do we need a stack trace?


pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Looks like merge join planning time is too big, 55 seconds
Next
From: Scott Marlowe
Date:
Subject: Re: subselect requires offset 0 for good performance.