Re: Inaccurate row count estimation - Mailing list pgsql-general

From Vyacheslav Kalinin
Subject Re: Inaccurate row count estimation
Date
Msg-id 9b1af80e0807150741u41968008rd8ccc66a475ac03@mail.gmail.com
Whole thread Raw
In response to Re: Inaccurate row count estimation  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Thanks for the reply, Tom.

After tracing through this I see that the problem is that we don't have
statistics for inheritance trees, and so you're getting a default
estimate for the selectivity of the join condition. 

I might be wrong but I suspect that the inheritance is not the only reason here. If I change the table definitions to:

create table pinfo_p00 (
  pid integer,
  constraint pk_pinfo_p00 primary key (pid),
  constraint cc_pinfo_p00_pid check(pid > 0 and pid < 100000)
);

create table pinfo_p01 (
  pid integer,
  constraint pk_pinfo_p01 primary key (pid),
  constraint cc_pinfo_p01_pid check(pid >= 100000 and pid < 200000)
);

and create a view pinfo, or just do a query with subselect:

explain analyze
select *
  from contacts c
  left join (
      select * from pinfo_p00
      union all
      select * from pinfo_p01
  ) pi on (pi.pid = c.cpid)
 where c.pid = 200 ;

the row-count assessment doesn't seem to be different:

QUERY PLAN
Nested Loop Left Join  (cost=4.56..514.25 rows=3896 width=16) (actual time=0.125..3.976 rows=40 loops=1)
  Join Filter: (pinfo_p00.pid = c.cpid)
  ->  Bitmap Heap Scan on contacts c  (cost=4.56..100.34 rows=39 width=12) (actual time=0.069..0.421 rows=40 loops=1)
        Recheck Cond: (pid = 200)
        ->  Bitmap Index Scan on ix_contacts_pid  (cost=0.00..4.55 rows=39 width=0) (actual time=0.042..0.042 rows=40 loops=1)
              Index Cond: (pid = 200)
  ->  Append  (cost=0.00..10.59 rows=2 width=4) (actual time=0.033..0.061 rows=1 loops=40)
        ->  Index Scan using pk_pinfo_p00 on pinfo_p00  (cost=0.00..5.29 rows=1 width=4) (actual time=0.011..0.015 rows=0 loops=40)
              Index Cond: (pinfo_p00.pid = c.cpid)
        ->  Index Scan using pk_pinfo_p01 on pinfo_p01  (cost=0.00..5.29 rows=1 width=4) (actual time=0.012..0.015 rows=0 loops=40)
              Index Cond: (pinfo_p01.pid = c.cpid)
Total runtime: 4.341 ms

It scares me a bit as it seems that innocent-looking combination of union's and join's could destroy the subsequent plan completely.

pgsql-general by date:

Previous
From: Josh Berkus
Date:
Subject: Re: [pgsql-advocacy] [pdxpug] Pg booth staffing at OSCON
Next
From: "Scott Marlowe"
Date:
Subject: Re: Installing PostgreSQL without using CygWin