Home > mailing lists

Re: incorrect row estimates for primary key join - Mailing list pgsql-performance

From	Julien Cigar
Subject	Re: incorrect row estimates for primary key join
Date	June 25, 2013 14:44:11
Msg-id	51C9ACAC.5080805@ulb.ac.be Whole thread Raw
In response to	incorrect row estimates for primary key join (Ben <midfield@gmail.com>)
List	pgsql-performance

Tree view

On 06/25/2013 00:18, Ben wrote:
> hello postgresql experts --
>
> i have a strange row estimate problem, which unfortunately i have trouble reproducing for anything but very large
tableswhich makes boiling down to a simple example hard.  i'm using version 9.1.1, all tables have had analyze run on
them.
>
> here's the example : i have a large table (billions of rows) with a five column primary key and a sixth value column.
for confidentiality purposes i can't reproduce the exact schema here, but it is essentially 
>
> create table bigtable (
>    id1 integer not null,
>    id2 date not null,
>    id3 integer not null,
>    id4 time not null,
>    id5 integer not null,
>    value real not null,
>    primary key (id1, id2, id3, id4, id5)
> );
>
> for various reasons there is only one id1 in the table right now, though in the future there will be more; also the
primarykey was added after table creation with alter table, though that shouldn't matter. 
>
> i need to select out a somewhat arbitrary collection of rows out of bigtable.  to do so i generate a temporary table
>
> create table jointable (
>    id1 integer not null,
>    id2 date not null,
>    id3 integer not null,
>    id4 time not null,
>    id5 integer not null
> );
>
> and then perform a join against this table.
>
> if jointable doesn't have many rows, the planner picks a nested loop over jointable and a primary key lookup on
bigtable. in the following, for expository purposes, jointable has 10 rows.  we can see the planner knows this. 
>
> explain select * from bigtable join jointable using (id1, id2, id3, id4, id5);
>
>   Nested Loop  (cost=0.00..6321.03 rows=145 width=28)
>     ->  Seq Scan on jointable  (cost=0.00..1.10 rows=10 width=24)
>     ->  Index Scan using bigtable_pkey on bigtable  (cost=0.00..631.97 rows=1 width=28)
>           Index Cond: ((id1 = jointable.id1) AND (id2 = jointable.id2) AND (id3 = jointable.id3) AND (id4 =
jointable.id4)AND (vid = foo.vid)) 
> (4 rows)
>
> as you increase the number of rows in jointable, the planner switches to a sort + merge.  in this case jointable has
roughly2 million rows. 
>
>   Merge Join  (cost=727807979.29..765482193.16 rows=18212633 width=28)
>    Merge Cond: ((bigtable.id1 = jointabe.id1) AND (bigtable.id2 = jointable.id2) AND (bigtable.id3 = jointable.id3)
AND(bigtable.id4 = bigtable.id4) AND (bigtable.id5 = bigtable.id5)) 
>     ->  Sort  (cost=727511795.16..735430711.00 rows=3167566336 width=28)
>           Sort Key: bigtable.id3, bigtable.id1, bigtable.id2, bigtable.id4, bigtable.id5
>           ->  Seq Scan on bigtable  (cost=0.00..76085300.36 rows=3167566336 width=28)
>     ->  Materialize  (cost=295064.70..305399.26 rows=2066911 width=24)
>           ->  Sort  (cost=295064.70..300231.98 rows=2066911 width=24)
>                 Sort Key: jointable.id3, jointable.id1, jointable.id2, jointable.id4, jointable.id5
>                 ->  Seq Scan on jointable  (cost=0.00..35867.11 rows=2066911 width=24)
> (9 rows)

can you show us the explain analyze version ?

> the choice of sort + merge is really bad here, given the size of bigtable (3 billion rows and counting.)
>
> some questions :
>
> 1 - first off, why isn't the sort happening on the primary key, so that bigtable does not have to be sorted?
>
> 2 - more importantly, since we are joining on the primary key, shouldn't the row estimate coming out of the join be
limitedby the number of rows in jointable? 
>
> for example, it is strange to see that if i put in a non-limiting limit statement (something bigger than the number
ofrows in jointable) it switches back to a nested loop + index scan : 
>
> explain select * from bigtable join jointable using (id1, id2, id3, id4, id5) limit 2500000;
>
>   Limit  (cost=0.00..178452647.11 rows=2500000 width=28)
>     ->  Nested Loop  (cost=0.00..1306127545.35 rows=18297957 width=28)
>           ->  Seq Scan on jointable  (cost=0.00..35867.11 rows=2066911 width=24)
>           ->  Index Scan using bigtable_pkey on bigtable  (cost=0.00..631.88 rows=1 width=28)
>                 Index Cond: ((id1 = jointable.id1) AND (id2 = jointable.id2) AND (id3 = jointable.id3) AND (id4 =
jointable.id4)AND (id5 = jointable.id5)) 
> (5 rows)
>
> am i not understanding the query planner, or is this a known issue in the query planner, or have i stumbled onto
somethingamiss?  unfortunately any synthetic examples i was able to make (up to 10 million rows) did not exhibit this
behavior,which makes it hard to test. 
>
> best regards, ben
>
>
>


--
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.

pgsql-performance by date:

From: Kevin Grittner
Date: 25 June 2013, 13:20:35
Subject: Re: incorrect row estimates for primary key join

From: Ben
Date: 25 June 2013, 18:27:46
Subject: Re: incorrect row estimates for primary key join

Re: incorrect row estimates for primary key join - Mailing list pgsql-performance

Previous

Next