Re: Bad query plans for queries on partitioned table - Mailing list pgsql-performance

From Julian Mehnle
Subject Re: Bad query plans for queries on partitioned table
Date
Msg-id 200712042027.01921.julian@mehnle.net
Whole thread Raw
In response to Re: Bad query plans for queries on partitioned table  (Gregory Stark <stark@enterprisedb.com>)
Responses Re: Bad query plans for queries on partitioned table
List pgsql-performance
Gregory Stark wrote:
> There are a few things going on here.
>
> 1) The optimizer can't build a plan which ignores those partitions
> because the statistics are just approximations. You could insert into
> one of them at any time and the statistics won't update immediately. If
> you have a partition which is empty of some type of data you can put a
> constraint on it to promise the optimizer that that condition will stay
> true.

I actually do have constraints on all the partitions, e.g. for week 34:

  Check constraints [for email_2007_week34]:
    "email_2007_week34_ts_check" CHECK (ts >= '2007-08-20 00:00:00'::timestamp without time zone AND ts < '2007-08-27
00:00:00'::timestampwithout time zone) 

  Check constraints [for email_2007_week34_extras]:
    "email_2007_week34_extras_ts_check" CHECK (ts >= '2007-08-20 00:00:00'::timestamp without time zone AND ts <
'2007-08-2700:00:00'::timestamp without time zone) 

Shouldn't this be enough to give the query planner a clue that it only
has to join the "email" and "email_extras" tables' partitions pair-wise,
as opposed to cross-joining them?

For the record, I also have indexes:

  Indexes [for email_2007_week34]:
    "email_2007_week34_pkey" PRIMARY KEY, btree (id)
    "index_email_2007_week34_on_helo" btree (helo)
    "index_email_2007_week34_on_ip" btree (ip)
    "index_email_2007_week34_on_load_id" btree (load_id)
    "index_email_2007_week34_on_mfrom" btree (mfrom)
    "index_email_2007_week34_on_ts" btree (ts)

  Indexes [for for email_2007_week34_extras]:
    "email_2007_week34_extras_pkey" PRIMARY KEY, btree (id)

> 2) The optimizer is assuming that empty tables have a default 1,000
> records in them with no idea about their statistics. Otherwise you get
> terrible plans on tables which have just been created or never
> analyzed. In this case that's causing it to think there will be tons of
> matches on what is apparently a very selective criterion.

I see.  But this shouldn't matter under the assumption that constraint
exclusion works correctly, right?

> 3) The optimizer is a bit dumb about partitioned tables. But I'm not
> sure if that's actually the fault here.
>
> Try adding one record of data to each of those partitions or putting a
> constraint on them which will allow constraint_exclusion (I assume you
> have that enabled?) to kick in. You'll still be bitten by the parent
> table but hopefully that's not enough to cause a problem.

The parent table is empty.  How will adding one record to each of the
partitions make a difference given the above constraints?

Julian.

Attachment

pgsql-performance by date:

Previous
From: Gregory Stark
Date:
Subject: Re: Bad query plans for queries on partitioned table
Next
From: Mark Mielke
Date:
Subject: Re: RAID arrays and performance