Re: BUG #15184: Planner overestimates number of rows in empty table - Mailing list pgsql-bugs

From Andrew Gierth
Subject Re: BUG #15184: Planner overestimates number of rows in empty table
Date
Msg-id 87k1sk95f2.fsf@news-spur.riddles.org.uk
Whole thread Raw
In response to BUG #15184: Planner overestimates number of rows in empty table  (PG Bug reporting form <noreply@postgresql.org>)
Responses Re: BUG #15184: Planner overestimates number of rows in empty table  (Alexey Ermakov <alexey.ermakov@dataegret.com>)
List pgsql-bugs
>>>>> "PG" == PG Bug reporting form <noreply@postgresql.org> writes:

 PG> if table is really empty then in first condition we set curpages =
 PG> 10 and second condition doesn't apply. so we estimate that empty
 PG> table has 10 pages and 2550 rows (for table with one int column)
 PG> which doesn't look good. is it intended behavior?

As the large comment immediately above explains, it is indeed intended
behavior. The reason is that over-estimating usually doesn't cause too
much harm, but under-estimating tends to blow things up in certain
critical cases (such as causing foreign-key checks to do sequential
scans on tables during a data-loading transaction).

It's actually still possible to trigger those kinds of pathological
cases, but in between the estimation hacks and the plan cache, you have
to work a lot harder at it. Consider for example:

create table tree (id integer primary key,
                   parent_id integer references tree);

insert into tree values (1, null);
vacuum analyze tree;  -- now relpages=1 reltuples=1
begin;
insert into tree select i, i-1 from generate_series(2,10) i;
insert into tree select i, i-1 from generate_series(11,100000) i;
commit;

That last insert could take maybe half an hour to run, because the FK
check has a query plan - established as a generic plan since the middle
insert ran it more than 5 times - with the small table size leading to a
sequential scan.

Without the vacuum analyze that I stuck in there, the code in plancat.c
avoids this problem by treating the table as large enough to require an
indexscan from the start.

As the comment says, this does mean we don't handle the case when the
table really is empty and stays empty. But this should be very rare
compared to the case where the table starts out empty but then has rows
added.

-- 
Andrew (irc:RhodiumToad)


pgsql-bugs by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: BUG #15185: pg_dump doesn't include statistics when specifying atable
Next
From: PG Bug reporting form
Date:
Subject: BUG #15186: how get data from db files