Re: Problem with update on partitioned table - Mailing list pgsql-general

From Martin Gainty
Subject Re: Problem with update on partitioned table
Date
Msg-id BLU136-DAV79FF310AC13FFC96FA2FDAEFD0@phx.gbl
Whole thread Raw
In response to Problem with update on partitioned table  (Alex Solovey <a.solovey@gmail.com>)
Responses Re: Problem with update on partitioned table  (Alex Solovey <a.solovey@gmail.com>)
List pgsql-general
Alex-

http://www.postgresql.org/docs/8.2/static/runtime-config-query.html#GUC-CONS
TRAINT-EXCLUSION
postgresql.conf contains a constraint_exclusion parameter called
constraint_exclusion (boolean)
which if you dont want to scan ALL partitions must be set to 'on'
constraint_exclusion = on

HTH
Martin
----- Original Message -----
From: "Alex Solovey" <a.solovey@gmail.com>
To: <pgsql-general@postgresql.org>
Sent: Monday, March 24, 2008 2:15 PM
Subject: [GENERAL] Problem with update on partitioned table


> Hello,
>
> We have pretty big production database (running PostgreSQL 8.3.1) with
> many partitioned tables. In most cases, they work well (since 8.2.1 at
> least) -- constraint exclusion is able to select correct partitions.
> However, there is an exception: queries on partitioned tables using
> PostgreSQL 'UPDATE Foo ... FROM Bar' syntax extension.
>
> Here is a simple test case:
> ------------------
> CREATE TABLE bar ( bar_id INT NOT NULL PRIMARY KEY );
> INSERT INTO bar VALUES ( 1 ), ( 2 ), ( 3 );
>
> CREATE TABLE foo (
>       part     INT NOT NULL
>      ,foo_data INT
>      ,bar_id   INT NOT NULL REFERENCES bar( bar_id )
> );
>
> CREATE TABLE foo_1 ( CHECK ( part = 1 ) ) INHERITS ( foo );
> INSERT INTO  foo_1 ( part, bar_id ) VALUES ( 1, 1 ), ( 1, 3 );
>
> CREATE TABLE foo_2 ( CHECK ( part = 2 ) ) INHERITS ( foo );
> INSERT INTO  foo_2 ( part, bar_id ) VALUES ( 2, 2 ), ( 2, 3 );
>
> CREATE TABLE foo_3 ( CHECK ( part = 3 ) ) INHERITS ( foo );
> INSERT INTO  foo_3 ( part, bar_id ) VALUES ( 3, 1 ), ( 3, 2 );
> ------------------
>
> As you can see, table "Foo" is partitioned by column "part". If only
> "Foo" is referenced in update, query plan is fine:
>
> => EXPLAIN UPDATE foo SET foo_data = 10 WHERE part = 2;
>                              QUERY PLAN
> ------------------------------------------------------------------
>   Append  (cost=0.00..68.50 rows=20 width=14)
>     ->  Seq Scan on foo  (cost=0.00..34.25 rows=10 width=14)
>           Filter: (part = 2)
>     ->  Seq Scan on foo_2 foo  (cost=0.00..34.25 rows=10 width=14)
>           Filter: (part = 2)
> (5 rows)
>
> However, for this query it is far from being optimal:
>
> => EXPLAIN UPDATE foo SET foo_data = 10 FROM bar WHERE part = 2 AND
> foo.bar_id = bar.bar_id;
>                                     QUERY PLAN
>
> --------------------------------------------------------------------------
------
>   Append  (cost=0.00..nan rows=22 width=14)
>     ->  Nested Loop  (cost=0.00..73.05 rows=10 width=14)
>           ->  Seq Scan on foo  (cost=0.00..34.25 rows=10 width=14)
>                 Filter: (part = 2)
>           ->  Index Scan using bar_pkey on bar  (cost=0.00..3.87 rows=1
> width=4)
>                 Index Cond: (bar.bar_id = public.foo.bar_id)
>     ->  Merge Join  (cost=nan..nan rows=1 width=8)
>           Merge Cond: (public.foo.bar_id = bar.bar_id)
>           ->  Sort  (cost=0.02..0.03 rows=1 width=0)
>                 Sort Key: public.foo.bar_id
>                 ->  Result  (cost=0.00..0.01 rows=1 width=0)
>                       One-Time Filter: false
>           ->  Sort  (cost=168.75..174.75 rows=2400 width=4)
>                 Sort Key: bar.bar_id
>                 ->  Seq Scan on bar  (cost=0.00..34.00 rows=2400 width=4)
>     ->  Nested Loop  (cost=0.00..73.05 rows=10 width=14)
>           ->  Seq Scan on foo_2 foo  (cost=0.00..34.25 rows=10 width=14)
>                 Filter: (part = 2)
>           ->  Index Scan using bar_pkey on bar  (cost=0.00..3.87 rows=1
> width=4)
>                 Index Cond: (bar.bar_id = public.foo.bar_id)
>     ->  Merge Join  (cost=nan..nan rows=1 width=8)
>           Merge Cond: (public.foo.bar_id = bar.bar_id)
>           ->  Sort  (cost=0.02..0.03 rows=1 width=0)
>                 Sort Key: public.foo.bar_id
>                 ->  Result  (cost=0.00..0.01 rows=1 width=0)
>                       One-Time Filter: false
>           ->  Sort  (cost=168.75..174.75 rows=2400 width=4)
>                 Sort Key: bar.bar_id
>                 ->  Seq Scan on bar  (cost=0.00..34.00 rows=2400 width=4)
> (29 rows)
>
> Is there any way to avoid this anomaly? UPDATE ... FROM is very
> convenient if you have to update rows depending on conditions involving
> multiple tables. In addition, with partitioned tables,
> standard-conforming UPDATE foo ... WHERE pk IN (SELECT pk FROM foo,
> bar...) is even worse because query  planner cannot choose correct
> partitions without nested select results and so it does a complete scan
> of all partitions instead.
>
> Alex
>
> -
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


pgsql-general by date:

Previous
From: Erik Jones
Date:
Subject: Re: Checking if Aggregate exists
Next
From: Tom Lane
Date:
Subject: Re: Problem with update on partitioned table