Re: Row estimates for empty tables - Mailing list pgsql-general

From Tom Lane
Subject Re: Row estimates for empty tables
Date
Msg-id 578094.1595624973@sss.pgh.pa.us
Whole thread Raw
In response to Re: Row estimates for empty tables  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: Row estimates for empty tables
List pgsql-general
Pavel Stehule <pavel.stehule@gmail.com> writes:
> pá 24. 7. 2020 v 16:38 odesílatel Christophe Pettus <xof@thebuild.com>
> napsal:
>> Since we already special-case parent tables for partition sets, would a
>> storage parameter that lets you either tell the planner "no, really, zero
>> is reasonable here" or sets a minimum number of rows to plan for be
>> reasonable?

> It is an issue for special, not typical applications (this situation is
> typical for some OLAP patterns)  - it is not too often - but some clean
> solution (instead hacking postgres) can be nice.

The core issue here is "how do we know whether the table is likely to stay
empty?".  I can think of a couple of more or less klugy solutions:

1. Arrange to send out a relcache inval when adding the first page to
a table, and then remove the planner hack for disbelieving relpages = 0.
I fear this'd be a mess from a system structural standpoint, but it might
work fairly transparently.

2. Establish the convention that vacuuming or analyzing an empty table
is what you do to tell the system that this state is going to persist.
That's more or less what the existing comments in plancat.c envision,
but we never made a definition for how the occurrence of that event
would be recorded in the catalogs, other than setting relpages > 0.
Rather than adding another pg_class column, I'm tempted to say that
vacuum/analyze should set relpages to a minimum of 1, even if the
relation has zero pages.  That does get the job done:

regression=# create table foo(f1 text);
CREATE TABLE
regression=# explain select * from foo;
                       QUERY PLAN                       
--------------------------------------------------------
 Seq Scan on foo  (cost=0.00..23.60 rows=1360 width=32)
(1 row)

regression=# vacuum foo;  -- doesn't help
VACUUM
regression=# explain select * from foo;
                       QUERY PLAN                       
--------------------------------------------------------
 Seq Scan on foo  (cost=0.00..23.60 rows=1360 width=32)
(1 row)
regression=# update pg_class set relpages = 1 where relname = 'foo';
UPDATE 1
regression=# explain select * from foo;
                     QUERY PLAN                     
----------------------------------------------------
 Seq Scan on foo  (cost=0.00..0.00 rows=1 width=32)
(1 row)

(We're still estimating one row, but that's as a result of different
decisions that I'm not nearly as willing to compromise on...)

            regards, tom lane



pgsql-general by date:

Previous
From: Scott Ribe
Date:
Subject: bad JIT decision
Next
From: Ted Toth
Date:
Subject: Re: when is RLS policy applied