Re: Import Statistics in postgres_fdw before resorting to sampling. - Mailing list pgsql-hackers
| From | Ashutosh Bapat |
|---|---|
| Subject | Re: Import Statistics in postgres_fdw before resorting to sampling. |
| Date | |
| Msg-id | CAExHW5u6ue1hMqjAubLAbz_ZQqZRdnwJAQtWUw=b+3NXTzYy-A@mail.gmail.com |
| In response to | Re: Import Statistics in postgres_fdw before resorting to sampling. (Corey Huinker <corey.huinker@gmail.com>) |
| Responses | Re: Import Statistics in postgres_fdw before resorting to sampling. |
| List | pgsql-hackers |
On Fri, Jan 23, 2026 at 3:50 AM Corey Huinker <corey.huinker@gmail.com> wrote:
>
> On Thu, Jan 22, 2026 at 5:16 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
>>
>> On Thu, Jan 22, 2026 at 2:21 AM Corey Huinker <corey.huinker@gmail.com> wrote:
>> >>
>> >> Changes in this release, aside from rebasing:
>> >>
>> >> - The generic analyze and fdw.h changes are in their own patch (0001) that ignores contrib/postgres_fdw entirely.
>> >> - The option for remote_analyze has been moved to its own patch (0003).
>> >> - The errors raised are now warnings, to ensure that we can always fall back to row sampling.
>> >> - All local attributes with attstattarget > 0 must get matching remote statistics or the import is considered a failure.
>> >> - The pg_restore_attribute_stats() call has been turned into a prepared statement, for clarity and some minor parsing savings.
>> >> - The calls to pg_restore_relation_stats() are parameterized, but not prepared as this is rarely called more than once.
>> >> - postgresStatisticsAreImportable will now disqualify a table if it has extended statistics objects, because we can't compute those without a row sample.
>> >
>>
>> Thanks Corey for breaking down these patches. It makes reviewing easier.
>>
>> analyze_rel() and acquire_inherited_sample_rows() both call
>> fdwroutine->AnalyzeForeignTable() but only the first one uses the
>> statistics import facility. Is that intentional? Typical use case of
>> sharding will create a partitioned table with foreign tables as
>> partitions. The partitions will be analyzed by the second function.
>> Thus a big use case of postgres_fdw won't be able to use the import
>> statistics facility. That seems like a major drawback of this patch.
>> Thinking more about it, acquire_inherited_sample_rows() accumulates
>> the sample rows from the child tables and extracts statistics from
>> those rows and then updates corresponding pg_statistic rows. Doing
>> that through import statistics seems a bit tricky since we need to be
>> able to combine statistics from multiple relations. Can we do that?
>
>
> We can't synthesize sample rows from imported statistics, no.
>
>>
>> There's an advantage if we can combine stats across multiple relations
>> - we don't have to sample children twice when analyzing the parent
>> without ONLY. Instead we could produce parent statistics by combining
>> statistics across children and the parent. To me this looks like
>> altogether a different beast just like partial aggregates.
>
>
> I think this patch is only ever going to get us out of 1 of the 2 samples, which isn't ideal but it is a savings.
>

I am not suggesting that we synthesize sample rows. I am suggesting
that we calculate the statistics of the parent table from those of its
children.

>>
>>
>> It will be good to fix this drawback. If not, at least we should
>> figure out (plan/POC) how to deal with the child tables? We need to at
>> least document this drawback - the documentation in the current patch
>> reads as if all foreign tables will use this facility when available.
>
>
> Yes, we will have to note the limitation. I have made that note, as well as the documentation fix attached.

The note mentions only partitioned tables, but the limitation applies
to any foreign child table.

--
Best Wishes,
Ashutosh Bapat
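
To make the suggestion of deriving parent statistics from child statistics more concrete, here is a rough Python sketch of the simplest case: combining per-child row counts, null fractions and average widths for one column by weighting each child by its row count. `ColumnStats` and `combine_children` are hypothetical names made up for illustration; they are not part of the patch or of PostgreSQL. The sketch assumes that avg_width is averaged over non-null values only, and it deliberately leaves out n_distinct, MCV lists and histograms, which cannot be merged exactly this way and are where the real difficulty lies.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ColumnStats:
    """Hypothetical per-relation statistics for a single column."""
    reltuples: float  # estimated number of rows in the relation
    null_frac: float  # fraction of rows in which the column is NULL
    avg_width: float  # average width in bytes (assumed: non-null values only)


def combine_children(children: Iterable[ColumnStats]) -> ColumnStats:
    """Derive parent-level statistics from child statistics.

    Row counts add up, null_frac is weighted by each child's row count,
    and avg_width is weighted by each child's non-null row count.
    """
    children = list(children)
    total_rows = sum(c.reltuples for c in children)
    if total_rows == 0:
        # No rows anywhere: nothing meaningful to combine.
        return ColumnStats(0.0, 0.0, 0.0)

    null_rows = sum(c.reltuples * c.null_frac for c in children)
    non_null_rows = total_rows - null_rows

    if non_null_rows > 0:
        total_width = sum(
            c.reltuples * (1.0 - c.null_frac) * c.avg_width for c in children
        )
        avg_width = total_width / non_null_rows
    else:
        # Every value is NULL, so there is no width to average.
        avg_width = 0.0

    return ColumnStats(total_rows, null_rows / total_rows, avg_width)


if __name__ == "__main__":
    parent = combine_children([
        ColumnStats(reltuples=100, null_frac=0.1, avg_width=8),
        ColumnStats(reltuples=300, null_frac=0.5, avg_width=4),
    ])
    print(parent)
```

These three values merge cleanly because they are plain weighted averages. For n_distinct, the child values only bound the parent value (at least the largest child value, at most the sum of the child values), so anything beyond this sketch needs extra information or an approximation.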