Re: FW: Constraint exclusion in partitions - Mailing list pgsql-general

From Daniel Begin
Subject Re: FW: Constraint exclusion in partitions
Date
Msg-id COL129-DS8E893F1B0C162429C355B94CF0@phx.gbl
Whole thread Raw
In response to Re: FW: Constraint exclusion in partitions  (Francisco Olarte <folarte@peoplecall.com>)
Responses Re: FW: Constraint exclusion in partitions
Re: FW: Constraint exclusion in partitions
List pgsql-general
Following Francisco suggestion, I was able to do some tests earlier this morning when the partitioning process
completedand all the resulting tables analyzed. 

Here is what I got on both the original table and its partitioned counterpart while running the same queries. I tested
themonly for a couple of values but in summary... 

Using a constant id:
All the queries I tried took longer on the partitioned table! I got similar results for multiple records using IN (id
value1,id value2 ...) 

Using a range of ids:
Surprisingly again, all the queries I tried took longer on the partitioned table!

Using a list of ids from a select clause:
More surprisingly, the queries I tried took less time on the partitioned table at least when using the primary key.
Usingan indexed field took so long compared to the old table that I cancelled the execution for the new one! 

Guess what, I will get back to my old fat table unless someone tells me I missed something obvious!
Daniel

Note: Tables/indexes description, queries and execution plans are below.






Tables/indexes description ----------------------------------------------------------------------------------
The original table has 3870130000 records. Primary key/index on each partition queries are
ALTER TABLE oldtable ADD CONSTRAINT oldtable_idversion_pk PRIMARY KEY (id, version);
CREATE INDEX oldtable_groupid_idx ON oldtable USING btree (group_id);

The partitioned table has 3870130000 records distributed over 87 partitions. Primary key/index on each partition
queriesare 
ALTER TABLE newtable_xx ADD CONSTRAINT newtablexx_idversion_pk PRIMARY KEY (id, version);
CREATE INDEX newtablexx_groupid_idx ON newtable_xx USING btree (group_id);
Where xx is the partition's number suffix

constant id -------------------------------------------------------------------------------------------------------
select * from oldtable where id=123456789;
"Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..4437.15 rows=1682 width=66)"
"  Index Cond: (id = 123456789::bigint)"
--Total query runtime: 62 ms. 1 rows retrieved


select * from newtable where id=123456789;
"Append  (cost=0.00..20.19 rows=5 width=66)"
"  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: (id = 123456789::bigint)"
"  ->  Index Scan using newtable72_idversion_pk on newtable_72  (cost=0.56..20.19 rows=4 width=66)"
"        Index Cond: (id = 123456789::bigint)"
--Total query runtime: 156 ms. 1 rows retrieved

I got similar results for multiple records...
select * from oldtable where id IN(10000000,1000000000,2000000000,3000000000);
"Index Scan using oldtable_idversion_pk on oldtable  (cost=0.71..17739.18 rows=6726 width=66)"
"  Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
--Total query runtime: 187 ms. 4 rows retrieved

select * from newtable where id IN(10000000,1000000000,2000000000,3000000000);
"Append  (cost=0.00..933.40 rows=223 width=66)"
"  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
"  ->  Index Scan using newtable01_idversion_pk on newtable_01  (cost=0.57..622.78 rows=156 width=66)"
"        Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
...
"  ->  Index Scan using newtable85_idversion_pk on newtable_85  (cost=0.57..53.37 rows=9 width=66)"
"        Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
--Total query runtime: 421 ms. 4 rows retrieved


range of ids -------------------------------------------------------------------------------------------------------
select * from oldtable where id between 1522999949 and 1523000049;
"Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..383.51 rows=144 width=66)"
"  Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
Total query runtime: 47 ms. 53 rows retrieved.

select * from newtable where id between 1522999949 and 1523000049;
"Append  (cost=0.00..408.16 rows=104 width=66)"
"  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: ((id >= 1522999949) AND (id <= 1523000049))"
"  ->  Index Scan using newtable51_idversion_pk on newtable_51  (cost=0.56..183.52 rows=46 width=66)"
"        Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
"  ->  Index Scan using newtable52_idversion_pk on newtable_52  (cost=0.56..224.64 rows=57 width=66)"
"        Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
Total query runtime: 78 ms. 53 rows retrieved.


list of ids from a select clause
-------------------------------------------------------------------------------------------------------
--Subset provides 4 ids similar but not identical to the previous query
select * from oldtable where id IN (select * from subset);
"Nested Loop  (cost=37.45..886298.00 rows=2028512050 width=66)"
"  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
"  ->  Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..4414.37 rows=1693 width=66)"
"        Index Cond: (id = subset.id)"
Total query runtime: 171 ms. 4 rows retrieved.

select * from newtable where id IN (select * from subset)
"Nested Loop  (cost=36.75..1407672.76 rows=1935067087 width=66)"
"  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
"  ->  Append  (cost=0.00..7020.68 rows=1749 width=66)"
"        ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"              Filter: (subset.id = id)"
"        ->  Index Scan using newtable01_idversion_pk on newtable_01  (cost=0.56..151.97 rows=39 width=66)"
"              Index Cond: (id = subset.id)"
...
"        ->  Index Scan using newtable86_idversion_pk on newtable_86  (cost=0.56..12.42 rows=2 width=66)"
"              Index Cond: (id = subset.id)"
Total query runtime: 140 ms. 4 rows retrieved.


Using an index, not the primary key ------------------------------------------------------------------------------
--Subset provides 58 group_id pointing to 5978 records in the concerned tables
select * from oldtable where group_id IN (select * from subset)
"Nested Loop  (cost=37.33..21575715.89 rows=2028512050 width=66)"
"  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
"  ->  Index Scan using oldtable_groupid_idx on oldtable  (cost=0.58..107364.99 rows=51340 width=66)"
"        Index Cond: (group_id = subset.id)"
Total query runtime: 3986 ms. 5978 rows retrieved.


select * from newtable where group_id IN (select * from subset)
"Hash Join  (cost=41.25..138092255.85 rows=1935067087 width=66)"
"  Hash Cond: (newtable.group_id = subset.id)"
"  ->  Append  (cost=0.00..84877869.72 rows=3870134173 width=66)"
"        ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        ->  Seq Scan on newtable_01  (cost=0.00..946235.96 rows=46526896 width=66)"
...
"        ->  Seq Scan on newtable_86  (cost=0.00..986527.64 rows=44269664 width=66)"
"  ->  Hash  (cost=38.75..38.75 rows=200 width=8)"
"        ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"              ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
Execution Cancelled after 766702 ms !

I tried the same with "SET enable_seqscan = OFF" and got an index scan of all tables;



pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Enum in foreign table: error and correct way to handle.
Next
From: melvin6925
Date:
Subject: Re: FW: Constraint exclusion in partitions