Re: Import Statistics in postgres_fdw before resorting to sampling. - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: Import Statistics in postgres_fdw before resorting to sampling.
Date
Msg-id CAPmGK17+FS99X7t8_T2pL-seSOP_dYoQi7jOhshf8LKFzJgrwg@mail.gmail.com
In response to Re: Import Statistics in postgres_fdw before resorting to sampling.  (Corey Huinker <corey.huinker@gmail.com>)
Responses Re: Import Statistics in postgres_fdw before resorting to sampling.
List pgsql-hackers
On Sat, Mar 14, 2026 at 12:36 AM Corey Huinker <corey.huinker@gmail.com> wrote:
> On Fri, Mar 13, 2026 at 8:17 AM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
>> On Fri, Mar 13, 2026 at 4:42 AM Corey Huinker <corey.huinker@gmail.com> wrote:
>> >>
>> >> By the errors-to-warnings change, even when those libpq functions
>> >> fail, we fall back to the normal ANALYZE processing, but I don't think
>> >> that that is OK, because if those functions fail for some reasons
>> >> (network-related issues like network disconnection or memory-related
>> >> issues like out of memory), then libpq functions called later for the
>> >> normal processing would also be likely to fail for the same reasons,
>> >> causing the same failure again, which I don't think is good.  To avoid
>> >> that, when libpq functions in fetch_relstats() and fetch_attstats()
>> >> fail, shouldn't we just throw an error, as before?
>>
>> > Right now the code doesn't differentiate on the kind of error that was encountered. If it was a network error, then not only do we expect fallback to also fail, but simple fdw queries will fail as well. If, however, it were a permission issue (no SELECT permission on pg_stats, for instance, or pg_stats doesn't exist because the remote database is not a regular Postgres, like Redshift or something), then I definitely want to fall back. Currently, the code makes no judgement on the details of the error; it just trusts that fallback-analyze will either succeed (because it was permissions-related) or quickly encounter the same insurmountable error, and one extra PQsendQuery isn't that much overhead.
>>
>> I'm not sure that's really true; if the error is one that doesn't
>> take time to detect, the extra function call probably wouldn't be a
>> problem, but if the error is one that takes a long time to detect,
>> the extra call would make the situation worse.
>
> I'm struggling to think of what kind of error would take a long time to detect. Network timeouts could be one thing, but such a situation would affect normal query operations as well, not just analyzes. If you have a specific error in mind, please let me know.

I too was thinking of network timeouts (in such a case, I thought
that even if PQsendQuery succeeded, the following PQgetResult would
have to block for a long time).  And yes, that situation can occur
for normal queries too, but for such a query the extra function calls
are made only when aborting the remote transaction, in a way that the
libpq functions don't have to wait for a network timeout.

>> > If you think we should inspect the error we get and match it against a list of errors that should skip the retry or skip the fallback, I'd be in favor of that. It's probably easier to start with a list of errcodes that we feel are hopeless and should remain ERRORs, not WARNINGs.
>>
>> That's an improvement, but I think that that's nice-to-have, not
>> must-have.  As there's not much time left until the feature freeze,
>> I'd like to propose to make the first cut as simple as possible and
>> thus leave that for future work, if this is intended for v19.  If no
>> objections from you (or anyone else), I'd like to update the patch as
>> I proposed.
>
> I'd like to get this into 19 as well, so I'm likely to agree to your proposal.

Ok, will update the patch.
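For illustration only: the errcode filter discussed above might look something like the sketch below. This is not part of the patch; the function name and the particular SQLSTATE list are my own assumptions. The idea is that the real code would obtain the SQLSTATE via PQresultErrorField(res, PG_DIAG_SQLSTATE) and keep connection- or resource-level failures as ERRORs, while demoting everything else to WARNINGs so that fallback ANALYZE gets a chance:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: decide whether a remote error reported while
 * fetching stats is "hopeless", i.e. likely to hit the fallback
 * ANALYZE as well, so we should raise an ERROR rather than a WARNING.
 *
 * In real fdw code the sqlstate string would come from
 * PQresultErrorField(res, PG_DIAG_SQLSTATE); it may be NULL when
 * there is no server result at all (e.g. the connection dropped).
 */
static bool
stats_error_is_hopeless(const char *sqlstate)
{
	if (sqlstate == NULL)
		return true;			/* no result at all: connection is gone */

	/* Class 08 - connection exceptions: retrying won't help */
	if (strncmp(sqlstate, "08", 2) == 0)
		return true;

	/* Class 53 - insufficient resources (e.g. remote out of memory) */
	if (strncmp(sqlstate, "53", 2) == 0)
		return true;

	/* 57P01/57P02/57P03 - admin shutdown, crash shutdown, cannot connect */
	if (strncmp(sqlstate, "57P", 3) == 0)
		return true;

	/*
	 * Anything else (42501 insufficient privilege, 42P01 undefined
	 * table, etc.) is exactly the case where falling back to normal
	 * ANALYZE processing is worthwhile.
	 */
	return false;
}
```

This keeps the first cut simple: a short deny-list of classes rather than an exhaustive enumeration of individual errcodes.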

>> Another thing I noticed is related to this comment in fetch_relstats():
>>
>> +   /*
>> +    * Before v14, a reltuples value of 0 was ambiguous: it could either mean
>> +    * the relation is empty, or it could mean that it hadn't yet been
>> +    * vacuumed or analyzed.  (Newer versions use -1 for the latter case).
>> +    *
>> +    * We can ignore this change, because if the remote table wasn't analyzed,
>> +    * then it would have no attribute stats, and thus we wouldn't have stats
>> +    * that we would try to import. So we can take the reltuples value as-is.
>> +    */
>>
>> I might have misread the patch, but for v14 or later the patch tries
>> to import remote stats even when we have reltuples = -1.  Is that
>> intentional?  In that case the remote table has never yet been
>> vacuumed/analyzed, so I think we should just fall back to the normal
>> processing.
>
> For cases where we affirmatively have -1, it makes sense to immediately go to the ANALYZE option (if available) and fall back if not. I'll make that change.
>
>>   Also, for v13 or earlier it also tries to import remote
>> stats even when we have reltuples = 0, but in that case the remote
>> table might not have been vacuumed/analyzed, so to avoid possibly
>> useless processing, I think we should just fall back to it in that
>> case as well.
>
> +1

I forgot to mention this, but when we have reltuples = 0 for v14 or
later, the patch tries to import both relstats and attstats; in that
case I think it would be OK to import only the former, for
consistency with the normal processing.  What do you think?
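To summarize the reltuples handling the thread seems to be converging on, here is a sketch (the function name and shape are mine, not the patch's): on v14+, reltuples = -1 means the remote table was never vacuumed/analyzed, so fall back; reltuples = 0 means an analyzed but empty table, so import relstats only; before v14, 0 is ambiguous between those two cases, so fall back there as well.

```c
typedef enum
{
	STATS_IMPORT_BOTH,			/* import both relstats and attstats */
	STATS_IMPORT_RELSTATS_ONLY, /* empty analyzed table: no useful attstats */
	STATS_FALLBACK				/* fall back to normal ANALYZE processing */
} StatsImportDecision;

/*
 * Hypothetical decision function: given the remote server version
 * (PG_VERSION_NUM style, e.g. 140000 for v14) and the remote
 * pg_class.reltuples value, decide how to proceed.
 */
static StatsImportDecision
decide_stats_import(int remote_version_num, double reltuples)
{
	if (remote_version_num >= 140000)
	{
		if (reltuples < 0)		/* -1: never vacuumed/analyzed */
			return STATS_FALLBACK;
		if (reltuples == 0)		/* analyzed, but empty */
			return STATS_IMPORT_RELSTATS_ONLY;
		return STATS_IMPORT_BOTH;
	}

	/*
	 * Before v14, reltuples = 0 could mean either "empty" or "never
	 * analyzed", so treat it as unknown and fall back.
	 */
	if (reltuples <= 0)
		return STATS_FALLBACK;
	return STATS_IMPORT_BOTH;
}
```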

Best regards,
Etsuro Fujita


