Thread: Problem with update on partitioned table
Hello, We have pretty big production database (running PostgreSQL 8.3.1) with many partitioned tables. In most cases, they work well (since 8.2.1 at least) -- constraint exclusion is able to select correct partitions. However, there is an exception: queries on partitioned tables using PostgreSQL 'UPDATE Foo ... FROM Bar' syntax extension. Here is a simple test case: ------------------ CREATE TABLE bar ( bar_id INT NOT NULL PRIMARY KEY ); INSERT INTO bar VALUES ( 1 ), ( 2 ), ( 3 ); CREATE TABLE foo ( part INT NOT NULL ,foo_data INT ,bar_id INT NOT NULL REFERENCES bar( bar_id ) ); CREATE TABLE foo_1 ( CHECK ( part = 1 ) ) INHERITS ( foo ); INSERT INTO foo_1 ( part, bar_id ) VALUES ( 1, 1 ), ( 1, 3 ); CREATE TABLE foo_2 ( CHECK ( part = 2 ) ) INHERITS ( foo ); INSERT INTO foo_2 ( part, bar_id ) VALUES ( 2, 2 ), ( 2, 3 ); CREATE TABLE foo_3 ( CHECK ( part = 3 ) ) INHERITS ( foo ); INSERT INTO foo_3 ( part, bar_id ) VALUES ( 3, 1 ), ( 3, 2 ); ------------------ As you can see, table "Foo" is partitioned by column "part". If only "Foo" is referenced in update, query plan is fine: => EXPLAIN UPDATE foo SET foo_data = 10 WHERE part = 2; QUERY PLAN ------------------------------------------------------------------ Append (cost=0.00..68.50 rows=20 width=14) -> Seq Scan on foo (cost=0.00..34.25 rows=10 width=14) Filter: (part = 2) -> Seq Scan on foo_2 foo (cost=0.00..34.25 rows=10 width=14) Filter: (part = 2) (5 rows) However, for this query it is far from being optimal: => EXPLAIN UPDATE foo SET foo_data = 10 FROM bar WHERE part = 2 AND foo.bar_id = bar.bar_id; QUERY PLAN -------------------------------------------------------------------------------- Append (cost=0.00..nan rows=22 width=14) -> Nested Loop (cost=0.00..73.05 rows=10 width=14) -> Seq Scan on foo (cost=0.00..34.25 rows=10 width=14) Filter: (part = 2) -> Index Scan using bar_pkey on bar (cost=0.00..3.87 rows=1 width=4) Index Cond: (bar.bar_id = public.foo.bar_id) -> Merge Join (cost=nan..nan rows=1 width=8) Merge Cond: (public.foo.bar_id = bar.bar_id) -> Sort (cost=0.02..0.03 rows=1 width=0) Sort Key: public.foo.bar_id -> Result (cost=0.00..0.01 rows=1 width=0) One-Time Filter: false -> Sort (cost=168.75..174.75 rows=2400 width=4) Sort Key: bar.bar_id -> Seq Scan on bar (cost=0.00..34.00 rows=2400 width=4) -> Nested Loop (cost=0.00..73.05 rows=10 width=14) -> Seq Scan on foo_2 foo (cost=0.00..34.25 rows=10 width=14) Filter: (part = 2) -> Index Scan using bar_pkey on bar (cost=0.00..3.87 rows=1 width=4) Index Cond: (bar.bar_id = public.foo.bar_id) -> Merge Join (cost=nan..nan rows=1 width=8) Merge Cond: (public.foo.bar_id = bar.bar_id) -> Sort (cost=0.02..0.03 rows=1 width=0) Sort Key: public.foo.bar_id -> Result (cost=0.00..0.01 rows=1 width=0) One-Time Filter: false -> Sort (cost=168.75..174.75 rows=2400 width=4) Sort Key: bar.bar_id -> Seq Scan on bar (cost=0.00..34.00 rows=2400 width=4) (29 rows) Is there any way to avoid this anomaly? UPDATE ... FROM is very convenient if you have to update rows depending on conditions involving multiple tables. In addition, with partitioned tables, standard-conforming UPDATE foo ... WHERE pk IN (SELECT pk FROM foo, bar...) is even worse because query planner cannot choose correct partitions without nested select results and so it does a complete scan of all partitions instead. Alex
Alex- http://www.postgresql.org/docs/8.2/static/runtime-config-query.html#GUC-CONS TRAINT-EXCLUSION postgresql.conf contains a constraint_exclusion parameter called constraint_exclusion (boolean) which if you dont want to scan ALL partitions must be set to 'on' constraint_exclusion = on HTH Martin ----- Original Message ----- From: "Alex Solovey" <a.solovey@gmail.com> To: <pgsql-general@postgresql.org> Sent: Monday, March 24, 2008 2:15 PM Subject: [GENERAL] Problem with update on partitioned table > Hello, > > We have pretty big production database (running PostgreSQL 8.3.1) with > many partitioned tables. In most cases, they work well (since 8.2.1 at > least) -- constraint exclusion is able to select correct partitions. > However, there is an exception: queries on partitioned tables using > PostgreSQL 'UPDATE Foo ... FROM Bar' syntax extension. > > Here is a simple test case: > ------------------ > CREATE TABLE bar ( bar_id INT NOT NULL PRIMARY KEY ); > INSERT INTO bar VALUES ( 1 ), ( 2 ), ( 3 ); > > CREATE TABLE foo ( > part INT NOT NULL > ,foo_data INT > ,bar_id INT NOT NULL REFERENCES bar( bar_id ) > ); > > CREATE TABLE foo_1 ( CHECK ( part = 1 ) ) INHERITS ( foo ); > INSERT INTO foo_1 ( part, bar_id ) VALUES ( 1, 1 ), ( 1, 3 ); > > CREATE TABLE foo_2 ( CHECK ( part = 2 ) ) INHERITS ( foo ); > INSERT INTO foo_2 ( part, bar_id ) VALUES ( 2, 2 ), ( 2, 3 ); > > CREATE TABLE foo_3 ( CHECK ( part = 3 ) ) INHERITS ( foo ); > INSERT INTO foo_3 ( part, bar_id ) VALUES ( 3, 1 ), ( 3, 2 ); > ------------------ > > As you can see, table "Foo" is partitioned by column "part". If only > "Foo" is referenced in update, query plan is fine: > > => EXPLAIN UPDATE foo SET foo_data = 10 WHERE part = 2; > QUERY PLAN > ------------------------------------------------------------------ > Append (cost=0.00..68.50 rows=20 width=14) > -> Seq Scan on foo (cost=0.00..34.25 rows=10 width=14) > Filter: (part = 2) > -> Seq Scan on foo_2 foo (cost=0.00..34.25 rows=10 width=14) > Filter: (part = 2) > (5 rows) > > However, for this query it is far from being optimal: > > => EXPLAIN UPDATE foo SET foo_data = 10 FROM bar WHERE part = 2 AND > foo.bar_id = bar.bar_id; > QUERY PLAN > > -------------------------------------------------------------------------- ------ > Append (cost=0.00..nan rows=22 width=14) > -> Nested Loop (cost=0.00..73.05 rows=10 width=14) > -> Seq Scan on foo (cost=0.00..34.25 rows=10 width=14) > Filter: (part = 2) > -> Index Scan using bar_pkey on bar (cost=0.00..3.87 rows=1 > width=4) > Index Cond: (bar.bar_id = public.foo.bar_id) > -> Merge Join (cost=nan..nan rows=1 width=8) > Merge Cond: (public.foo.bar_id = bar.bar_id) > -> Sort (cost=0.02..0.03 rows=1 width=0) > Sort Key: public.foo.bar_id > -> Result (cost=0.00..0.01 rows=1 width=0) > One-Time Filter: false > -> Sort (cost=168.75..174.75 rows=2400 width=4) > Sort Key: bar.bar_id > -> Seq Scan on bar (cost=0.00..34.00 rows=2400 width=4) > -> Nested Loop (cost=0.00..73.05 rows=10 width=14) > -> Seq Scan on foo_2 foo (cost=0.00..34.25 rows=10 width=14) > Filter: (part = 2) > -> Index Scan using bar_pkey on bar (cost=0.00..3.87 rows=1 > width=4) > Index Cond: (bar.bar_id = public.foo.bar_id) > -> Merge Join (cost=nan..nan rows=1 width=8) > Merge Cond: (public.foo.bar_id = bar.bar_id) > -> Sort (cost=0.02..0.03 rows=1 width=0) > Sort Key: public.foo.bar_id > -> Result (cost=0.00..0.01 rows=1 width=0) > One-Time Filter: false > -> Sort (cost=168.75..174.75 rows=2400 width=4) > Sort Key: bar.bar_id > -> Seq Scan on bar (cost=0.00..34.00 rows=2400 width=4) > (29 rows) > > Is there any way to avoid this anomaly? UPDATE ... FROM is very > convenient if you have to update rows depending on conditions involving > multiple tables. In addition, with partitioned tables, > standard-conforming UPDATE foo ... WHERE pk IN (SELECT pk FROM foo, > bar...) is even worse because query planner cannot choose correct > partitions without nested select results and so it does a complete scan > of all partitions instead. > > Alex > > - > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
Alex Solovey <a.solovey@gmail.com> writes: > We have pretty big production database (running PostgreSQL 8.3.1) with > many partitioned tables. In most cases, they work well (since 8.2.1 at > least) -- constraint exclusion is able to select correct partitions. > However, there is an exception: queries on partitioned tables using > PostgreSQL 'UPDATE Foo ... FROM Bar' syntax extension. Hmm, the immediate problem is that cost_mergejoin is coming out with a silly cost (NaN) because of division by zero. The attached patch should get it back to 8.2-equivalent behavior. But really we're missing a bet because the sub-joins ought to get discarded entirely when we know they must be empty. There are various places testing for this but it looks like make_join_rel() needs to do it too. regards, tom lane Index: src/backend/optimizer/path/costsize.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/optimizer/path/costsize.c,v retrieving revision 1.191 diff -c -r1.191 costsize.c *** src/backend/optimizer/path/costsize.c 1 Jan 2008 19:45:50 -0000 1.191 --- src/backend/optimizer/path/costsize.c 24 Mar 2008 20:55:42 -0000 *************** *** 1385,1390 **** --- 1385,1396 ---- Selectivity joininfactor; Path sort_path; /* dummy for result of cost_sort */ + /* Protect some assumptions below that rowcounts aren't zero */ + if (outer_path_rows <= 0) + outer_path_rows = 1; + if (inner_path_rows <= 0) + inner_path_rows = 1; + if (!enable_mergejoin) startup_cost += disable_cost;
Martin, > which if you dont want to scan ALL partitions must be set to 'on' > constraint_exclusion = on It is 'ON'. The problem is that it does not work well for 'UPDATE foo ... FROM bar' queries, when partitioned table 'foo' is joined with another table. Martin Gainty wrote: > Alex- > > http://www.postgresql.org/docs/8.2/static/runtime-config-query.html#GUC-CONS > TRAINT-EXCLUSION > postgresql.conf contains a constraint_exclusion parameter called > constraint_exclusion (boolean) > which if you dont want to scan ALL partitions must be set to 'on' > constraint_exclusion = on > > HTH > Martin
Tom, Thanks for the patch. We've tried it here, and it improved query plan slightly (indeed, it looks exactly like the plan from 8.2.6 now). But, as you've said, empty sub-joins are still not discarded, so query execution time did not improve. And this is the same in both 8.2 and 8.3. Note that only 'UPDATE FROM' does not discard sub-joins. 'SELECT' on the same tables is just fine: => EXPLAIN SELECT * FROM foo, bar WHERE part = 2 AND foo.bar_id = bar.bar_id; QUERY PLAN ----------------------------------------------------------------------------- Hash Join (cost=64.47..104.08 rows=18 width=16) Hash Cond: (bar.bar_id = public.foo.bar_id) -> Seq Scan on bar (cost=0.00..31.40 rows=2140 width=4) -> Hash (cost=64.25..64.25 rows=18 width=12) -> Append (cost=0.00..64.25 rows=18 width=12) -> Seq Scan on foo (cost=0.00..32.12 rows=9 width=12) Filter: (part = 2) -> Seq Scan on foo_2 foo (cost=0.00..32.12 rows=9 width=12) Filter: (part = 2) (9 rows) -- Alex Tom Lane wrote: > Hmm, the immediate problem is that cost_mergejoin is coming out with > a silly cost (NaN) because of division by zero. The attached patch > should get it back to 8.2-equivalent behavior. But really we're missing > a bet because the sub-joins ought to get discarded entirely when we know > they must be empty. There are various places testing for this but it > looks like make_join_rel() needs to do it too. > > regards, tom lane
Alex Solovey <a.solovey@gmail.com> writes: > Thanks for the patch. We've tried it here, and it improved query plan > slightly (indeed, it looks exactly like the plan from 8.2.6 now). > But, as you've said, empty sub-joins are still not discarded, so query > execution time did not improve. And this is the same in both 8.2 and 8.3. If you're feeling brave, try the patch I just committed to CVS. regards, tom lane
Tom, > If you're feeling brave, try the patch I just committed to CVS. I just did it. It works! According to the query plan, only one partition is being examined now. Is this patch going to be included in 8.3 only, or in 8.2 as well? Thanks!
Alex Solovey <a.solovey@gmail.com> writes: > Is this patch going to be included in 8.3 only, or in 8.2 as well? I thought I was already taking a chance by putting it in REL8_3_STABLE. Since 8.2 doesn't have a regression compared to the previous release, and is also very much more different from HEAD, I'm not inclined to touch it. regards, tom lane