Re: BUG #13817: Query planner strange choose while select/count small part of big table - complete - Mailing list pgsql-bugs

From Peter J. Holzer
Subject Re: BUG #13817: Query planner strange choose while select/count small part of big table - complete
Date
Msg-id 20151218182712.GD362@hjp.at
Whole thread Raw
In response to Re: BUG #13817: Query planner strange choose while select/count small part of big table - complete  (Marcin Sieńko <sienkomarcin@gmail.com>)
List pgsql-bugs
On 2015-12-17 16:19:24 +0100, Marcin Sie=C5=84ko wrote:
> I've done it already. I've also already check rewrited query. No changes.=
 But i
> have a idea. There is generated base and there are "normal" shipment_orde=
r_item
> with 3 - 100 maybe 10000 shipment_order_sub_item but one has 3 992 102. C=
ould
> it be a problem?

A very skewed distribution can be a problem, yes. If the planner doesn't
know whether you will select a shipment_order_item with 3 subitems or
one with 3 million subitems, it can only use the average number of
subitems - which may be bad for either case.

But your plan looks worse than a single outlier of 3992102.

The nested loop semi join in line 3 of http://explain.depesz.com/s/Rus
(I'm using that as the reference because it's easier to read than your
line-wrapped plan and because the lines are numbered and easier to
reference) is expected to return 1 row. But the nested loop at line 2 is
expected to return 3992515 rows. Unless I'm missing something this means
that planner thinks that there are *on average* 3992515 subitems per
item.=20

Can you post the output of these queries?

select * from pg_stats
where tablename=3D'shipment_order_sub_item' and attname=3D'shipment_order_i=
tem_id';

select count(*) from shipment_order_sub_item;

select count(distinct(shipment_order_item_id))
=66rom shipment_order_sub_item;


> If yes that is my bad to ask You because I'belive there will not occur in
> production and counting could bye match faster.

Yeah, if you have unrealistic test data, it will produce unrealistic
results and you can probably ignore them. Although in my experience it
is usually the other way around: The test data has a nice, regular
distribution and the production data is skewed a lot more than anyone
expected.

    hp

--=20
   _  | Peter J. Holzer    | I want to forget all about both belts and
|_|_) |                    | suspenders; instead, I want to buy pants=20
| |   | hjp@hjp.at         | that actually fit.
__/   | http://www.hjp.at/ |   -- http://noncombatant.org/

pgsql-bugs by date:

Previous
From: "Millepied, Pascal (GE Healthcare)"
Date:
Subject: Re: Known issues on PostgreSQL server 8.1.19
Next
From: "Peter J. Holzer"
Date:
Subject: Re: BUG #13817: Query planner strange choose while select/count small part of big table - complete