Thread: Problem with update on partitioned table

Problem with update on partitioned table

From

Alex Solovey

Date:

24 March 2008, 16:17:20

Hello,

We have pretty big production database (running PostgreSQL 8.3.1) with
many partitioned tables. In most cases, they work well (since 8.2.1 at
least) -- constraint exclusion is able to select correct partitions.
However, there is an exception: queries on partitioned tables using
PostgreSQL 'UPDATE Foo ... FROM Bar' syntax extension.

Here is a simple test case:
------------------
CREATE TABLE bar ( bar_id INT NOT NULL PRIMARY KEY );
INSERT INTO bar VALUES ( 1 ), ( 2 ), ( 3 );

CREATE TABLE foo (
      part     INT NOT NULL
     ,foo_data INT
     ,bar_id   INT NOT NULL REFERENCES bar( bar_id )
);

CREATE TABLE foo_1 ( CHECK ( part = 1 ) ) INHERITS ( foo );
INSERT INTO  foo_1 ( part, bar_id ) VALUES ( 1, 1 ), ( 1, 3 );

CREATE TABLE foo_2 ( CHECK ( part = 2 ) ) INHERITS ( foo );
INSERT INTO  foo_2 ( part, bar_id ) VALUES ( 2, 2 ), ( 2, 3 );

CREATE TABLE foo_3 ( CHECK ( part = 3 ) ) INHERITS ( foo );
INSERT INTO  foo_3 ( part, bar_id ) VALUES ( 3, 1 ), ( 3, 2 );
------------------

As you can see, table "Foo" is partitioned by column "part". If only
"Foo" is referenced in update, query plan is fine:

=> EXPLAIN UPDATE foo SET foo_data = 10 WHERE part = 2;
                             QUERY PLAN
------------------------------------------------------------------
  Append  (cost=0.00..68.50 rows=20 width=14)
    ->  Seq Scan on foo  (cost=0.00..34.25 rows=10 width=14)
          Filter: (part = 2)
    ->  Seq Scan on foo_2 foo  (cost=0.00..34.25 rows=10 width=14)
          Filter: (part = 2)
(5 rows)

However, for this query it is far from being optimal:

=> EXPLAIN UPDATE foo SET foo_data = 10 FROM bar WHERE part = 2 AND
foo.bar_id = bar.bar_id;
                                    QUERY PLAN

--------------------------------------------------------------------------------
  Append  (cost=0.00..nan rows=22 width=14)
    ->  Nested Loop  (cost=0.00..73.05 rows=10 width=14)
          ->  Seq Scan on foo  (cost=0.00..34.25 rows=10 width=14)
                Filter: (part = 2)
          ->  Index Scan using bar_pkey on bar  (cost=0.00..3.87 rows=1
width=4)
                Index Cond: (bar.bar_id = public.foo.bar_id)
    ->  Merge Join  (cost=nan..nan rows=1 width=8)
          Merge Cond: (public.foo.bar_id = bar.bar_id)
          ->  Sort  (cost=0.02..0.03 rows=1 width=0)
                Sort Key: public.foo.bar_id
                ->  Result  (cost=0.00..0.01 rows=1 width=0)
                      One-Time Filter: false
          ->  Sort  (cost=168.75..174.75 rows=2400 width=4)
                Sort Key: bar.bar_id
                ->  Seq Scan on bar  (cost=0.00..34.00 rows=2400 width=4)
    ->  Nested Loop  (cost=0.00..73.05 rows=10 width=14)
          ->  Seq Scan on foo_2 foo  (cost=0.00..34.25 rows=10 width=14)
                Filter: (part = 2)
          ->  Index Scan using bar_pkey on bar  (cost=0.00..3.87 rows=1
width=4)
                Index Cond: (bar.bar_id = public.foo.bar_id)
    ->  Merge Join  (cost=nan..nan rows=1 width=8)
          Merge Cond: (public.foo.bar_id = bar.bar_id)
          ->  Sort  (cost=0.02..0.03 rows=1 width=0)
                Sort Key: public.foo.bar_id
                ->  Result  (cost=0.00..0.01 rows=1 width=0)
                      One-Time Filter: false
          ->  Sort  (cost=168.75..174.75 rows=2400 width=4)
                Sort Key: bar.bar_id
                ->  Seq Scan on bar  (cost=0.00..34.00 rows=2400 width=4)
(29 rows)

Is there any way to avoid this anomaly? UPDATE ... FROM is very
convenient if you have to update rows depending on conditions involving
multiple tables. In addition, with partitioned tables,
standard-conforming UPDATE foo ... WHERE pk IN (SELECT pk FROM foo,
bar...) is even worse because query  planner cannot choose correct
partitions without nested select results and so it does a complete scan
of all partitions instead.

Alex

Re: Problem with update on partitioned table

From

"Martin Gainty"

Date:

24 March 2008, 17:12:27

Alex-

http://www.postgresql.org/docs/8.2/static/runtime-config-query.html#GUC-CONS
TRAINT-EXCLUSION
postgresql.conf contains a constraint_exclusion parameter called
constraint_exclusion (boolean)
which if you dont want to scan ALL partitions must be set to 'on'
constraint_exclusion = on

HTH
Martin
----- Original Message -----
From: "Alex Solovey" <a.solovey@gmail.com>
To: <pgsql-general@postgresql.org>
Sent: Monday, March 24, 2008 2:15 PM
Subject: [GENERAL] Problem with update on partitioned table


> Hello,
>
> We have pretty big production database (running PostgreSQL 8.3.1) with
> many partitioned tables. In most cases, they work well (since 8.2.1 at
> least) -- constraint exclusion is able to select correct partitions.
> However, there is an exception: queries on partitioned tables using
> PostgreSQL 'UPDATE Foo ... FROM Bar' syntax extension.
>
> Here is a simple test case:
> ------------------
> CREATE TABLE bar ( bar_id INT NOT NULL PRIMARY KEY );
> INSERT INTO bar VALUES ( 1 ), ( 2 ), ( 3 );
>
> CREATE TABLE foo (
>       part     INT NOT NULL
>      ,foo_data INT
>      ,bar_id   INT NOT NULL REFERENCES bar( bar_id )
> );
>
> CREATE TABLE foo_1 ( CHECK ( part = 1 ) ) INHERITS ( foo );
> INSERT INTO  foo_1 ( part, bar_id ) VALUES ( 1, 1 ), ( 1, 3 );
>
> CREATE TABLE foo_2 ( CHECK ( part = 2 ) ) INHERITS ( foo );
> INSERT INTO  foo_2 ( part, bar_id ) VALUES ( 2, 2 ), ( 2, 3 );
>
> CREATE TABLE foo_3 ( CHECK ( part = 3 ) ) INHERITS ( foo );
> INSERT INTO  foo_3 ( part, bar_id ) VALUES ( 3, 1 ), ( 3, 2 );
> ------------------
>
> As you can see, table "Foo" is partitioned by column "part". If only
> "Foo" is referenced in update, query plan is fine:
>
> => EXPLAIN UPDATE foo SET foo_data = 10 WHERE part = 2;
>                              QUERY PLAN
> ------------------------------------------------------------------
>   Append  (cost=0.00..68.50 rows=20 width=14)
>     ->  Seq Scan on foo  (cost=0.00..34.25 rows=10 width=14)
>           Filter: (part = 2)
>     ->  Seq Scan on foo_2 foo  (cost=0.00..34.25 rows=10 width=14)
>           Filter: (part = 2)
> (5 rows)
>
> However, for this query it is far from being optimal:
>
> => EXPLAIN UPDATE foo SET foo_data = 10 FROM bar WHERE part = 2 AND
> foo.bar_id = bar.bar_id;
>                                     QUERY PLAN
>
> --------------------------------------------------------------------------
------
>   Append  (cost=0.00..nan rows=22 width=14)
>     ->  Nested Loop  (cost=0.00..73.05 rows=10 width=14)
>           ->  Seq Scan on foo  (cost=0.00..34.25 rows=10 width=14)
>                 Filter: (part = 2)
>           ->  Index Scan using bar_pkey on bar  (cost=0.00..3.87 rows=1
> width=4)
>                 Index Cond: (bar.bar_id = public.foo.bar_id)
>     ->  Merge Join  (cost=nan..nan rows=1 width=8)
>           Merge Cond: (public.foo.bar_id = bar.bar_id)
>           ->  Sort  (cost=0.02..0.03 rows=1 width=0)
>                 Sort Key: public.foo.bar_id
>                 ->  Result  (cost=0.00..0.01 rows=1 width=0)
>                       One-Time Filter: false
>           ->  Sort  (cost=168.75..174.75 rows=2400 width=4)
>                 Sort Key: bar.bar_id
>                 ->  Seq Scan on bar  (cost=0.00..34.00 rows=2400 width=4)
>     ->  Nested Loop  (cost=0.00..73.05 rows=10 width=14)
>           ->  Seq Scan on foo_2 foo  (cost=0.00..34.25 rows=10 width=14)
>                 Filter: (part = 2)
>           ->  Index Scan using bar_pkey on bar  (cost=0.00..3.87 rows=1
> width=4)
>                 Index Cond: (bar.bar_id = public.foo.bar_id)
>     ->  Merge Join  (cost=nan..nan rows=1 width=8)
>           Merge Cond: (public.foo.bar_id = bar.bar_id)
>           ->  Sort  (cost=0.02..0.03 rows=1 width=0)
>                 Sort Key: public.foo.bar_id
>                 ->  Result  (cost=0.00..0.01 rows=1 width=0)
>                       One-Time Filter: false
>           ->  Sort  (cost=168.75..174.75 rows=2400 width=4)
>                 Sort Key: bar.bar_id
>                 ->  Seq Scan on bar  (cost=0.00..34.00 rows=2400 width=4)
> (29 rows)
>
> Is there any way to avoid this anomaly? UPDATE ... FROM is very
> convenient if you have to update rows depending on conditions involving
> multiple tables. In addition, with partitioned tables,
> standard-conforming UPDATE foo ... WHERE pk IN (SELECT pk FROM foo,
> bar...) is even worse because query  planner cannot choose correct
> partitions without nested select results and so it does a complete scan
> of all partitions instead.
>
> Alex
>
> -
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

Re: Problem with update on partitioned table

From

Tom Lane

Date:

24 March 2008, 17:56:49

Alex Solovey <a.solovey@gmail.com> writes:
> We have pretty big production database (running PostgreSQL 8.3.1) with
> many partitioned tables. In most cases, they work well (since 8.2.1 at
> least) -- constraint exclusion is able to select correct partitions.
> However, there is an exception: queries on partitioned tables using
> PostgreSQL 'UPDATE Foo ... FROM Bar' syntax extension.

Hmm, the immediate problem is that cost_mergejoin is coming out with
a silly cost (NaN) because of division by zero.  The attached patch
should get it back to 8.2-equivalent behavior.  But really we're missing
a bet because the sub-joins ought to get discarded entirely when we know
they must be empty.  There are various places testing for this but it
looks like make_join_rel() needs to do it too.

            regards, tom lane


Index: src/backend/optimizer/path/costsize.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/optimizer/path/costsize.c,v
retrieving revision 1.191
diff -c -r1.191 costsize.c
*** src/backend/optimizer/path/costsize.c    1 Jan 2008 19:45:50 -0000    1.191
--- src/backend/optimizer/path/costsize.c    24 Mar 2008 20:55:42 -0000
***************
*** 1385,1390 ****
--- 1385,1396 ----
      Selectivity joininfactor;
      Path        sort_path;        /* dummy for result of cost_sort */

+     /* Protect some assumptions below that rowcounts aren't zero */
+     if (outer_path_rows <= 0)
+         outer_path_rows = 1;
+     if (inner_path_rows <= 0)
+         inner_path_rows = 1;
+
      if (!enable_mergejoin)
          startup_cost += disable_cost;

Re: Problem with update on partitioned table

From

Alex Solovey

Date:

24 March 2008, 18:17:58

Martin,

 > which if you dont want to scan ALL partitions must be set to 'on'
 > constraint_exclusion = on

It is 'ON'. The problem is that it does not work well for 'UPDATE foo
... FROM bar' queries, when partitioned table 'foo' is joined with
another table.

Martin Gainty wrote:
> Alex-
>
> http://www.postgresql.org/docs/8.2/static/runtime-config-query.html#GUC-CONS
> TRAINT-EXCLUSION
> postgresql.conf contains a constraint_exclusion parameter called
> constraint_exclusion (boolean)
> which if you dont want to scan ALL partitions must be set to 'on'
> constraint_exclusion = on
>
> HTH
> Martin

Re: Problem with update on partitioned table

From

Alex Solovey

Date:

24 March 2008, 19:02:34

Tom,

Thanks for the patch. We've tried it here, and it improved query plan
slightly (indeed, it looks exactly like the plan from 8.2.6 now).
But, as you've said, empty sub-joins are still not discarded, so query
execution time did not improve. And this is the same in both 8.2 and 8.3.

Note that only 'UPDATE FROM' does not discard sub-joins. 'SELECT' on the
same tables is just fine:

=> EXPLAIN SELECT * FROM foo, bar WHERE part = 2 AND foo.bar_id =
bar.bar_id;
                                  QUERY PLAN

-----------------------------------------------------------------------------
  Hash Join  (cost=64.47..104.08 rows=18 width=16)
    Hash Cond: (bar.bar_id = public.foo.bar_id)
    ->  Seq Scan on bar  (cost=0.00..31.40 rows=2140 width=4)
    ->  Hash  (cost=64.25..64.25 rows=18 width=12)
          ->  Append  (cost=0.00..64.25 rows=18 width=12)
                ->  Seq Scan on foo  (cost=0.00..32.12 rows=9 width=12)
                      Filter: (part = 2)
                ->  Seq Scan on foo_2 foo  (cost=0.00..32.12 rows=9
width=12)
                      Filter: (part = 2)
(9 rows)

-- Alex

Tom Lane wrote:
> Hmm, the immediate problem is that cost_mergejoin is coming out with
> a silly cost (NaN) because of division by zero.  The attached patch
> should get it back to 8.2-equivalent behavior.  But really we're missing
> a bet because the sub-joins ought to get discarded entirely when we know
> they must be empty.  There are various places testing for this but it
> looks like make_join_rel() needs to do it too.
>
>             regards, tom lane

Re: Problem with update on partitioned table

From

Tom Lane

Date:

24 March 2008, 19:05:13

Alex Solovey <a.solovey@gmail.com> writes:
> Thanks for the patch. We've tried it here, and it improved query plan
> slightly (indeed, it looks exactly like the plan from 8.2.6 now).
> But, as you've said, empty sub-joins are still not discarded, so query
> execution time did not improve. And this is the same in both 8.2 and 8.3.

If you're feeling brave, try the patch I just committed to CVS.

            regards, tom lane

Re: Problem with update on partitioned table

From

Alex Solovey

Date:

24 March 2008, 19:38:27

Tom,

 > If you're feeling brave, try the patch I just committed to CVS.

I just did it. It works! According to the query plan, only one partition
is being examined now.

Is this patch going to be included in 8.3 only, or in 8.2 as well?

Thanks!

Re: Problem with update on partitioned table

From

Tom Lane

Date:

24 March 2008, 19:40:42

Alex Solovey <a.solovey@gmail.com> writes:
> Is this patch going to be included in 8.3 only, or in 8.2 as well?

I thought I was already taking a chance by putting it in REL8_3_STABLE.
Since 8.2 doesn't have a regression compared to the previous release,
and is also very much more different from HEAD, I'm not inclined to
touch it.

            regards, tom lane