On Tue, 2025-04-01 at 22:21 -0500, Nathan Bossart wrote:
> It certainly feels risky. I was able to avoid executing the queries twice
> in all cases by saving the definition length in the TOC entry and skipping
> that many bytes the second time round.
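
For anyone following along, the saved-length approach can be sketched
roughly as below. This is a toy model in Python, not the pg_dump code:
the TocEntry class, its fields, the write_toc function and the on-disk
layout (8-byte offset, 4-byte length, definition text) are all
assumptions made up for illustration.

```python
import io
import struct


class TocEntry:
    """Hypothetical TOC entry; not pg_dump's actual struct."""

    def __init__(self, tag, defn_fn):
        self.tag = tag
        self.defn_fn = defn_fn    # callback that runs the expensive query
        self.defn_len = None      # remembered after the first pass
        self.data_offset = 0      # unknown until the data area is written


def write_toc(buf, entries):
    """Write (or rewrite) the TOC at the buffer's current position."""
    for te in entries:
        # The offset is the only field that changes between passes.
        buf.write(struct.pack("<q", te.data_offset))
        if te.defn_len is None:
            # First pass: generate the definition and remember its length.
            defn = te.defn_fn().encode("utf-8")
            te.defn_len = len(defn)
            buf.write(struct.pack("<i", te.defn_len))
            buf.write(defn)
        else:
            # Second pass: the length prefix and definition are already
            # in place, so skip over them instead of regenerating.
            buf.seek(4 + te.defn_len, io.SEEK_CUR)


calls = []


def stats_defn():
    calls.append(1)  # stands in for running the stats queries
    return "SELECT pg_restore_relation_stats(...);"


entries = [TocEntry("STATISTICS DATA t1", stats_defn)]
buf = io.BytesIO()
write_toc(buf, entries)           # first pass, offsets still unknown
first = buf.getvalue()

entries[0].data_offset = 1234     # learned after writing the data area
buf.seek(0)
write_toc(buf, entries)           # second pass, only the offset changes
```

The point of the sketch is only that the second pass never calls
stats_defn again; whether skipping bytes like this is safe in the real
archive formats is exactly the part that feels risky.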

Another idea that was under-discussed is whether the stats commands
should be in the TOC at all, or if they should be written as data
chunks.

Being in the TOC creates these issues with rewriting the TOC. Also, the
stats can be fairly large, especially for a wide table with a high
stats target, so the stats commands can increase the size of the TOC by
a lot.

But putting them in the data area doesn't seem quite right either,
because the data is just data, whereas the stats are a list of SQL
commands ("SELECT pg_restore_relation_stats(...); ..."). Also, if we
went down that road, we'd have to consider parallelism, which might
defeat the batching work that we're trying to do.

Regards,
Jeff Davis