Re: [PATCHES] Post-special page storage TDE support - Mailing list pgsql-hackers
From | David Christensen |
---|---|
Subject | Re: [PATCHES] Post-special page storage TDE support |
Date | |
Msg-id | CAOxo6XKgNkMPfWZPJYeFgd-qTFxBBAv1QbvySXDLjY7Sr-6KLw@mail.gmail.com Whole thread Raw |
In response to | Re: [PATCHES] Post-special page storage TDE support (Matthias van de Meent <boekewurm+postgres@gmail.com>) |
Responses |
Re: [PATCHES] Post-special page storage TDE support
|
List | pgsql-hackers |
Hi Matthias, > Did you read the related thread with related discussion from last June, "Re: better page-level checksums" [0]? In thatI argued that space at the end of a page is already allocated for the AM, and that reserving variable space at the endof the page for non-AM usage is wasting the AM's performance potential. Yes, I had read parts of that thread among others, but have given it a re-read. I can see the point you're making here, and agree that if we can allocate between pd_special and pd_upper that could make sense. I am a little unclear as to what performance impacts for the AM there would be if this additional space were ahead or behind the page special area; it seems like if this is something that needs to live on the page *somewhere* just being aligned correctly would be sufficient from the AM's standpoint. Considering that I am trying to make this have zero storage impact if these features are not active, the impact on a cluster with no additional features would be moot from a storage perspective, no? > Apart from that: Is this variable-sized 'metadata' associated with smgr infrastructure only, or is it also available forAM features? If not; then this is a strong -1. The amount of tasks smgr needs to do on a page is generally much less thanthe amount of tasks an AM needs to do; so in my view the AM has priority in prime page real estate, not smgr or relatedinfrastructure. I will confess to a slightly wobbly understanding of the delineation of responsibility here. I was under the impression that by modifying any consumer of PageHeaderData this would be sufficient to cover all AMs for the types of cluster-wide options we'd be concerned about (say extended checksums, multiple page encryption schemes, or other per-page information we haven't yet anticipated). Reading smgr/README and the various access/*/README has not made the distinction clear to me yet. > re: PageFeatures > I'm not sure I understand the goal, nor the reasoning. Shouldn't this be part of the storage manager (smgr) implementation/ can't this be part of the smgr of the relation? For at least the feature cases I'm anticipating, this would apply to any disk page that may have user data, set (at least initially) at initdb time, so should apply to any pages in the cluster, regardless of AM. > re: use of pd_checksum > I mentioned this in the above-mentioned thread too, in [1], that we could use pd_checksum as an extra area marker for thisstorage-specific data, which would be located between pd_upper and pd_special. I do think that we could indeed use this as an additional in-page pointer, but at least for this version was keeping things backwards-compatible. Peter G (I think) also made some good points about how to include the various status bits on the page somehow in terms of making a page completely self-contained. > Re: patch contents > > 0001: > >+ specialSize = MAXALIGN(specialSize) + reserved_page_size; > > This needs to be aligned, so MAXALIGN(specialSize + reserved_page_size), or an assertion that reserved_page_size is MAXALIGNED,would be better. It is currently aligned via the space calculation return value but agree that folding it into an assert or reworking it explicitly is clearer. > > PageValidateSpecialPointer(Page page) > > { > > Assert(page); > > - Assert(((PageHeader) page)->pd_special <= BLCKSZ); > > + Assert((((PageHeader) page)->pd_special - reserved_page_size) <= BLCKSZ); > > This check is incorrect. With your code it would allow pd_special past the end of the block. If you want to put the reserved_space_sizeeffectively inside the special area, this check should instead be: > > + Assert(((PageHeader) page)->pd_special <= (BLCKSZ - reserved_page_size)); > > Or, equally valid > > + Assert((((PageHeader) page)->pd_special + reserved_page_size) <= BLCKSZ); Yup, I think I inverted my logic there; thanks. > > + * +-------------+-----+------------+-----------------+ > > + * | ... tuple2 tuple1 | "special space" | "reserved" | > > + * +-------------------+------------+-----------------+ > > Could you fix the table display if / when you revise the patchset? It seems to me that the corners don't line up with thecolumn borders. Sure thing. > 0002: > > Make the output of "select_views" test stable > > Changing the reserved_page_size has resulted in non-stable results for this test. > > This makes sense, what kind of instability are we talking about? Are there different results for runs with the same binary,or is this across compilations? When running with the same compilation/initdb settings, the test results are stable, but differ depending what options you chose, so `make installcheck` output will fail when testing a cluster with different options vs upstream HEAD without these patches, etc. > 0003 and up were not yet reviewed in depth. Thanks, I appreciate the feedback so far.
pgsql-hackers by date:
Previous
From: Andres FreundDate:
Subject: Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?
Next
From: Justin PryzbyDate:
Subject: Re: explain_regress, explain(MACHINE), and default to explain(BUFFERS) (was: BUFFERS enabled by default in EXPLAIN (ANALYZE))