Re: Statistics Import and Export - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Statistics Import and Export
Date
Msg-id CAExHW5uc7+Ed2J3jv1vQu063pSC7S7uP+HycCcPtrOZVLFYg1g@mail.gmail.com
Whole thread Raw
In response to Re: Statistics Import and Export  (Corey Huinker <corey.huinker@gmail.com>)
List pgsql-hackers
On Tue, Oct 31, 2023 at 12:55 PM Corey Huinker <corey.huinker@gmail.com> wrote:
>>
>>
>> Yeah, that use makes sense as well, and if so then postgres_fdw would likely need to be aware of the appropriate
queryfor several versions back - they change, not by much, but they do change. So now we'd have each query text in
threeplaces: a system view, postgres_fdw, and the bin/scripts pre-upgrade program. So I probably should consider the
bestway to share those in the codebase. 
>>
>
> Attached is v2 of this patch.
>
> New features:
> * imports index statistics. This is not strictly accurate: it re-computes index statistics the same as ANALYZE does,
whichis to say it derives those stats entirely from table column stats, which are imported, so in that sense we're
gettingindex stats without touching the heap. 
> * now support extended statistics except for MCV, which is currently serialized as an difficult-to-decompose bytea
field.
> * bare-bones CLI script pg_export_stats, which extracts stats on databases back to v12 (tested) and could work back
tov10. 
> * bare-bones CLI script pg_import_stats, which obviously only works on current devel dbs, but can take exports from
olderversions. 
>

I did a small experiment with your patches. In a separate database
"fdw_dst" I created a table t1 and populated it with 100K rows
#create table t1 (a int, b int);
#insert into t1 select i, i + 1 from generate_series(1, 100000) i;
#analyse t1;

In database "postgres" on the same server, I created a foreign table
pointing to t1
#create server fdw_dst_server foreign data wrapper postgres_fdw
OPTIONS ( dbname 'fdw_dst', port '5432');
#create user mapping for public server fdw_dst_server ;
#create foreign table t1 (a int, b int) server fdw_dst_server;

The estimates are off
#explain select * from t1 where a = 100;
                        QUERY PLAN
-----------------------------------------------------------
 Foreign Scan on t1  (cost=100.00..142.26 rows=13 width=8)
(1 row)

Export and import stats for table t1
$ pg_export_stats -d fdw_dst | pg_import_stats -d postgres

gives accurate estimates
#explain select * from t1 where a = 100;
                        QUERY PLAN
-----------------------------------------------------------
 Foreign Scan on t1  (cost=100.00..1793.02 rows=1 width=8)
(1 row)

In this simple case it's working like a charm.

Then I wanted to replace all ANALYZE commands in postgres_fdw.sql with
import and export of statistics. But I can not do that since it
requires table names to match. Foreign table metadata stores the
mapping between local and remote table as well as column names. Import
can use that mapping to install the statistics appropriately. We may
want to support a command or function in postgres_fdw to import
statistics of all the tables that point to a given foreign server.
That may be some future work based on your current patches.

I have not looked at the code though.

--
Best Wishes,
Ashutosh Bapat



pgsql-hackers by date:

Previous
From: Jelte Fennema-Nio
Date:
Subject: Re: Add PQsendSyncMessage() to libpq
Next
From: Alvaro Herrera
Date:
Subject: Re: GUC names in messages