Re: Statistics Import and Export - Mailing list pgsql-hackers

From Corey Huinker
Subject Re: Statistics Import and Export
Date
Msg-id CADkLM=dUGyQMK8Rhz1gVLeW8u2nBWOLvH7-j82MAAN_PmcSAQw@mail.gmail.com
Whole thread Raw
In response to Re: Statistics Import and Export  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Statistics Import and Export
Re: Statistics Import and Export
List pgsql-hackers

That’s certainly a fair point and my initial reaction (which could certainly be wrong) is that it’s unlikely to be an issue- but also, if you feel you could make it work with an array and passing all the attribute info in with one call, which I suspect would be possible but just a bit more complex to build, then sure, go for it. If it ends up being overly unwieldy then perhaps the  per-attribute call would be better and we could perhaps acquire the lock before the function calls..?  Doing a check to see if we have already locked it would be cheaper than trying to acquire a new lock, I’m fairly sure.

Well the do_analyze() code was already ok with acquiring the lock once for non-inherited stats and again for inherited stats, so the locks were already not the end of the world. However, that's at most a 2x of the locking required, and this would natts * x, quite a bit more. Having the procedures check for a pre-existing lock seems like a good compromise.
 
Also per our prior discussion- this makes sense to include in post-data section, imv, and also because then we have the indexes we may wish to load stats for, but further that also means it’ll be in the paralleliziable part of the process, making me a bit less concerned overall about the individual timing. 

The ability to parallelize is pretty persuasive. But is that per-statement parallelization or do we get transaction blocks? i.e. if we ended up importing stats like this:

BEGIN;
LOCK TABLE schema.relation IN SHARE UPDATE EXCLUSIVE MODE;
LOCK TABLE pg_catalog.pg_statistic IN ROW UPDATE EXCLUSIVE MODE;
SELECT pg_import_rel_stats('schema.relation', ntuples, npages);
SELECT pg_import_pg_statistic('schema.relation', 'id', ...);
SELECT pg_import_pg_statistic('schema.relation', 'name', ...);
SELECT pg_import_pg_statistic('schema.relation', 'description', ...);
...
COMMIT;

pgsql-hackers by date:

Previous
From: "Zhijie Hou (Fujitsu)"
Date:
Subject: RE: Synchronizing slots from primary to standby
Next
From: Michael Paquier
Date:
Subject: Re: Improve readability by using designated initializers when possible