Re: Import Statistics in postgres_fdw before resorting to sampling. - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: Import Statistics in postgres_fdw before resorting to sampling.
Msg-id CAPmGK17KWkMTOMnB_qTy+8aJ9zz4rUPHLBAUyP4cHeUOxzM5sQ@mail.gmail.com
In response to Re: Import Statistics in postgres_fdw before resorting to sampling.  (Corey Huinker <corey.huinker@gmail.com>)
Responses Re: Import Statistics in postgres_fdw before resorting to sampling.
List pgsql-hackers
On Tue, Mar 31, 2026 at 5:04 AM Corey Huinker <corey.huinker@gmail.com> wrote:

>> postgres_fdw side:
>>
>> * In fetch_remote_statistics, if we get reltuples=0 for v14 or later,
>> I think we should update only the relation stats with that info, and
>> avoid resorting to analyzing, for efficiency, as I proposed before.  I
>> modified that function (and import_fetched_statistics) that way.
>
> This will miss out on the case where the remote table did get analyzed
> once, when empty, but now isn't empty. I realize that shouldn't happen
> very often, but the cost of row sampling a table that is empty is very low.

I think that would be the user's fault, since it's the user's
responsibility to ensure that the existing stats for the remote table
are up-to-date.  That said, not all users will be able to operate that
way, so I'm thinking of disabling this feature by default.
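
To make the case under discussion concrete, here is a minimal sketch of
the decision, not the code in the patch.  It relies on one fact: since
v14, pg_class.reltuples is -1 for a table that has never been vacuumed
or analyzed, so reltuples=0 there means the table was seen empty, while
on older servers 0 is ambiguous.  The enum, the function name, and the
handling of the other cases are assumptions made for illustration only.

```c
#include <stdbool.h>

/* Hypothetical names throughout; none of these come from the patch. */
typedef enum StatsImportAction
{
    IMPORT_FULL_STATS,      /* import relation and attribute stats */
    IMPORT_RELSTATS_ONLY,   /* store the empty-table relation stats, no sampling */
    FALL_BACK_TO_SAMPLING   /* no usable remote stats; sample rows as before */
} StatsImportAction;

/*
 * Decide what to do with the reltuples value fetched from the remote
 * server.  remote_version_num uses the server_version_num format,
 * e.g. 140000 for v14.
 */
StatsImportAction
choose_stats_import_action(int remote_version_num, double reltuples)
{
    /* From v14 on, -1 marks "never vacuumed or analyzed". */
    bool        zero_means_empty = (remote_version_num >= 140000);

    if (reltuples > 0)
        return IMPORT_FULL_STATS;   /* assumed: stats exist and are usable */

    if (zero_means_empty && reltuples == 0)
        return IMPORT_RELSTATS_ONLY;    /* the case proposed above */

    /* -1 on v14 or later, or an ambiguous 0 on older servers */
    return FALL_BACK_TO_SAMPLING;
}
```

With this sketch, a v14 remote reporting reltuples=0 yields
IMPORT_RELSTATS_ONLY, whereas a table that was analyzed while empty and
filled afterwards still reports 0 until it is analyzed again, which is
the stale-stats case raised above.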

> I see that remote_analyze didn't make it as a part of this patch. Is that
> something you'd repackaged as a follow-on patch, or are you just done with it?

Just reviewing/polishing the 0001/0002 patches was a lot of work, so I
didn't have time to look at the remote_analyze patch.  We are running
out of time, so I'm afraid that I won't have time for that.

I modified the patch further:

* Modified postgresImportStatistics to create RemoteAttributeMapping if needed.

* The query executed in fetch_relstats is almost the same as the one
executed in postgresGetAnalyzeInfoForForeignTable.  To avoid code
duplication, I modified it to use the latter query.  I also changed it
to use PQsendQuery, not PQsendQueryParams, for efficiency.

* Modified import_spi_query_ok to get the result of an import query by
using SPI_getbinval, not SPI_getvalue, for efficiency.

Attached is a new version of the patch.

Thanks for reviewing!

Best regards,
Etsuro Fujita
