Re: Row estimates for empty tables - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: Row estimates for empty tables
Date
Msg-id CAFj8pRCK3oeZLBLr=8Z3VXpyyk1gashW1_7Cg2sr9FX9v_0OmQ@mail.gmail.com
In response to Re: Row estimates for empty tables  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Row estimates for empty tables
List pgsql-hackers


On Sat, 25 Jul 2020 at 0:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
[ redirecting to -hackers ]

I wrote:
> The core issue here is "how do we know whether the table is likely to stay
> empty?".  I can think of a couple of more or less klugy solutions:

For these special cases it is probably possible to ensure that ANALYZE runs before any SELECT. The table is created, then analyzed, and only after that is it published and used for SELECT. Usually such a table is never modified afterwards.

Because it is a special case, the solution does not need to be very sophisticated. A built-in solution, however, could be designed to be more general.



> 1. Arrange to send out a relcache inval when adding the first page to
> a table, and then remove the planner hack for disbelieving relpages = 0.
> I fear this'd be a mess from a system structural standpoint, but it might
> work fairly transparently.

I experimented with doing this.  It's not hard to code, if you don't mind
having RelationGetBufferForTuple calling CacheInvalidateRelcache.  I'm not
sure whether that code path might cause any long-term problems, but it
seems to work OK right now.  However, this solution causes massive
"failures" in the regression tests as a result of plans changing.  I'm
sure that's partly because we use so many small tables in the tests.
Nonetheless, it's not promising from the standpoint of not causing
unexpected problems in the real world.

> 2. Establish the convention that vacuuming or analyzing an empty table
> is what you do to tell the system that this state is going to persist.
> That's more or less what the existing comments in plancat.c envision,
> but we never made a definition for how the occurrence of that event
> would be recorded in the catalogs, other than setting relpages > 0.
> Rather than adding another pg_class column, I'm tempted to say that
> vacuum/analyze should set relpages to a minimum of 1, even if the
> relation has zero pages.

I also tried this, and it seems a lot more promising: no existing
regression test cases change.  So perhaps we should do the attached
or something like it.
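
To make option 2 concrete, here is a minimal, self-contained sketch in C. It is a toy model, not PostgreSQL source: `ToyRel`, `toy_analyze`, and `toy_estimate_pages` are hypothetical names, and the 10-page fallback is an assumption standing in for the "planner hack for disbelieving relpages = 0" mentioned above. It only illustrates how clamping relpages to a minimum of 1 could let the planner tell "never vacuumed/analyzed" apart from "known to be empty".

```c
/* Toy stand-in for the pg_class fields under discussion (hypothetical). */
typedef struct {
    int relpages;      /* page count recorded by the last VACUUM/ANALYZE; 0 = never */
    int actual_pages;  /* current physical size of the relation, in pages */
} ToyRel;

/*
 * Option 2 as described in the thread: VACUUM/ANALYZE records a minimum
 * of 1 page, even when the relation physically has zero pages.
 */
static void toy_analyze(ToyRel *rel)
{
    rel->relpages = (rel->actual_pages > 0) ? rel->actual_pages : 1;
}

/*
 * Planner-side page estimate. Emptiness is disbelieved only while
 * relpages is still 0, i.e. the table has never been vacuumed/analyzed.
 * The fallback of 10 pages is an assumption made for this sketch.
 */
static int toy_estimate_pages(const ToyRel *rel)
{
    if (rel->relpages == 0 && rel->actual_pages < 10)
        return 10;
    return rel->actual_pages;
}
```

In this model, a freshly created empty table (`relpages = 0`, `actual_pages = 0`) is estimated at 10 pages; once `toy_analyze` has run, relpages becomes 1 and the estimate follows the actual size of 0 pages.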

I am sending a patch that has been used at GoodData for years.

I am not sure if the company uses 0 or 1, but I can ask.

Regards

Pavel



                        regards, tom lane

Attachment
