Re: Statistics Import and Export - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Statistics Import and Export
Date
Msg-id ec43b87247fd8700c6fcf4fe0b68cdb2fafecf2d.camel@j-davis.com
In response to Re: Statistics Import and Export  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: Statistics Import and Export
Re: Statistics Import and Export
List pgsql-hackers
On Tue, 2025-04-01 at 22:21 -0500, Nathan Bossart wrote:
> It certainly feels risky.  I was able to avoid executing the queries twice
> in all cases by saving the definition length in the TOC entry and skipping
> that many bytes the second time round.

That feels like a better approach.

>   That's simple enough, but it relies
> on various assumptions such as fseeko() being available (IIUC the file will
> only be open for writing so we cannot fall back on fread()) and WriteStr()
> returning an accurate value (which I'm skeptical of because some formats
> compress this data).  But AFAICT custom format is the only format that does
> a second WriteToc() pass at the moment, and it only does so when fseeko()
> is usable.

Even with those assumptions, I think it's much better than querying
twice and assuming that the results are the same.
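
For anyone following along, here is a minimal sketch of the "save the length,
skip it on the second pass" idea. It is not the pg_dump code: DemoEntry and
demo_write_entry() are hypothetical names, the definition is written as raw
bytes rather than through WriteStr(), and the file is assumed to be seekable
with fseeko().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical, simplified stand-in for a TOC entry (not pg_dump's TocEntry). */
typedef struct DemoEntry
{
    const char *defn;        /* definition text; expensive to regenerate */
    size_t      defnLen;     /* bytes written for defn on the first pass */
    long        dataOffset;  /* header field that the second pass rewrites */
} DemoEntry;

/*
 * Write one entry at the current file position.
 *
 * First pass: write the offset field and the definition, and remember how
 * many definition bytes were written.
 * Second pass: rewrite only the offset field, then seek past the definition
 * using the saved length instead of producing the definition again.
 *
 * Returns 0 on success, -1 on failure.
 */
static int
demo_write_entry(FILE *fp, DemoEntry *te, int second_pass)
{
    assert(fp != NULL && te != NULL);

    if (fwrite(&te->dataOffset, sizeof(te->dataOffset), 1, fp) != 1)
        return -1;

    if (!second_pass)
    {
        size_t len = strlen(te->defn);

        if (fwrite(te->defn, 1, len, fp) != len)
            return -1;
        te->defnLen = len;      /* saved for the second pass */
        return 0;
    }

    /* Skip exactly the bytes written the first time round. */
    return fseeko(fp, (off_t) te->defnLen, SEEK_CUR);
}
```

The skip is only correct if the saved length equals the number of bytes that
actually reached the file, which is the assumption about WriteStr() and
compression discussed in this thread.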

>   Plus, custom format doesn't appear to compress anything written
> via WriteStr().

If WriteStr() were doing compression, the second WriteToc() pass that
updates the data offsets would be scary even in the existing code.

> We might be able to improve this by inventing a new callback that fails for
> all formats except for custom with fseeko() available.  That would at least
> ensure hard failures if these assumptions change.  That probably wouldn't
> be terribly invasive.  I'm curious what you think.

That sounds fine. I'd say do that if it feels reasonable; if the extra
callbacks get too messy, we can just document the assumptions instead.
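
To make the callback idea concrete, a rough sketch of what I understand the
proposal to be. All names here (DemoArchive, skip_defn, and so on) are made up
for illustration and are not the existing archive-handle API; a real version
would presumably raise a fatal error rather than return false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical format tags; not pg_dump's ArchiveFormat enum. */
typedef enum DemoFormat
{
    DEMO_FMT_CUSTOM,
    DEMO_FMT_DIRECTORY,
    DEMO_FMT_TAR,
    DEMO_FMT_PLAIN
} DemoFormat;

typedef struct DemoArchive DemoArchive;

/* Callback that skips 'len' already-written definition bytes. */
typedef bool (*DemoSkipDefnFn) (DemoArchive *ah, size_t len);

struct DemoArchive
{
    DemoFormat      format;
    bool            has_seek;   /* stands in for "fseeko() is usable" */
    size_t          pos;        /* simulated file position */
    DemoSkipDefnFn  skip_defn;
};

/* Default for every format that never does a second TOC pass. */
static bool
demo_skip_unsupported(DemoArchive *ah, size_t len)
{
    (void) ah;
    (void) len;
    return false;               /* hard failure in a real implementation */
}

/* Custom format: allowed only when seeking works. */
static bool
demo_skip_custom(DemoArchive *ah, size_t len)
{
    if (!ah->has_seek)
        return false;
    ah->pos += len;
    return true;
}

/* Install the callback according to the format. */
static void
demo_init_archive(DemoArchive *ah, DemoFormat fmt, bool has_seek)
{
    assert(ah != NULL);
    ah->format = fmt;
    ah->has_seek = has_seek;
    ah->pos = 0;
    ah->skip_defn = (fmt == DEMO_FMT_CUSTOM) ? demo_skip_custom
                                             : demo_skip_unsupported;
}
```

The point of routing this through a callback is that a format which starts
doing a second pass later fails loudly instead of silently skipping the wrong
number of bytes.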

>
> Hm.  One thing we could do is to send the TocEntry to the callback and
> verify that it matches the one we were expecting to see next (as set by a
> previous call).  Does that sound like a strong enough check?

Again, I'd just be practical here and do the check if it feels natural,
and if not, improve the comments so that someone modifying the code
would know where to look.
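
If the check does turn out to be natural, I imagine something along these
lines. This is only a sketch with invented names (DemoTocEntry,
DemoCallbackState, demo_defn_callback); the real TocEntry list and callback
signature differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal TOC entry: an id plus a link to the next entry. */
typedef struct DemoTocEntry
{
    int                  dumpId;
    struct DemoTocEntry *next;
} DemoTocEntry;

/* State carried between callback invocations. */
typedef struct DemoCallbackState
{
    DemoTocEntry *expected;     /* entry the next call must be for, or NULL */
} DemoCallbackState;

/*
 * Verify that 'te' is the entry the previous call said would come next,
 * then record the entry expected after it.  A NULL expectation (first call,
 * or end of list reached) accepts any entry.  Returns false on a mismatch
 * and leaves the state untouched.
 */
static bool
demo_defn_callback(DemoCallbackState *st, DemoTocEntry *te)
{
    assert(st != NULL && te != NULL);

    if (st->expected != NULL && st->expected != te)
        return false;

    st->expected = te->next;
    return true;
}
```

This only catches out-of-order calls; it says nothing about whether the data
produced for the entry is the same as last time.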


Regards,
    Jeff Davis



