Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits - Mailing list pgsql-hackers
From | Anastasia Lubennikova |
---|---|
Subject | Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits |
Date | |
Msg-id | 1d2e0f4a-6ab8-10f1-c177-af97475df001@postgrespro.ru Whole thread Raw |
In response to | Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
List | pgsql-hackers |
On 30.10.2020 03:42, Tomas Vondra wrote: > Hi, > > I might be somewhat late to the party, but I've done a bit of > benchmarking too ;-) I used TPC-H data from a 100GB test, and tried > different combinations of COPY [FREEZE] and VACUUM [FREEZE], both on > current master and with the patch. > > So I don't really observe any measurable slowdowns in the COPY part (in > fact there seems to be a tiny speedup, but it might be just noise). In > the VACUUM part, there's clear speedup when the data was already frozen > by COPY (Yes, those are zeroes, because it took less than 1 second.) > > So that looks pretty awesome, I guess. > > For the record, these tests were run on a server with NVMe SSD, so > hopefully reliable and representative numbers. > Thank you for the review. > A couple minor comments about the code: > > 2) I find the "if (all_frozen_set)" block a bit strange. It's a matter > of personal preference, but I'd just use a single level of nesting, i.e. > something like this: > > /* if everything frozen, the whole page has to be visible */ > Assert(!(all_frozen_set && !PageIsAllVisible(page))); > > /* > * If we've frozen everything on the page, and if we're already > * holding pin on the vmbuffer, record that in the visibilitymap. > * If we're not holding the pin, it's OK to skip this - it's just > * an optimization. > * > * It's fine to use InvalidTransactionId here - this is only used > * when HEAP_INSERT_FROZEN is specified, which intentionally > * violates visibility rules. > */ > if (all_frozen_set && > visibilitymap_pin_ok(BufferGetBlockNumber(buffer), vmbuffer)) > visibilitymap_set(...); > > IMHO it's easier to read, but YMMV. I've also reworded the comment a bit > to say what we're doing etc. And I've moved the comment from inside the > if block into the main comment - that was making it harder to read too. > I agree that it's a matter of taste. I've updated comments and left nesting unchanged to keep assertions simple. > > 3) I see RelationGetBufferForTuple does this: > > /* > * This is for COPY FREEZE needs. If page is empty, > * pin vmbuffer to set all_frozen bit > */ > if ((options & HEAP_INSERT_FROZEN) && > (PageGetMaxOffsetNumber(BufferGetPage(buffer)) == 0)) > visibilitymap_pin(relation, BufferGetBlockNumber(buffer), > vmbuffer); > > so is it actually possible to get into the (all_frozen_set) without > holding a pin on the visibilitymap? I haven't investigated this so > maybe there are other ways to get into that code block. But the new > additions to hio.c get the pin too. > I was thinking that GetVisibilityMapPins() can somehow unset the pin. I gave it a second look. And now I don't think it's possible to get into this code block without a pin. So, I converted this check into an assertion. > > 4) In heap_xlog_multi_insert we now have this: > > if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED) > PageClearAllVisible(page); > if (xlrec->flags & XLH_INSERT_ALL_FROZEN_SET) > PageSetAllVisible(page); > > IIUC it makes no sense to have both flags at the same time, right? So > what about adding > > Assert(!(XLH_INSERT_ALL_FROZEN_SET && > XLH_INSERT_ALL_VISIBLE_CLEARED)); > > to check that? > Agree. I placed this assertion to the very beginning of the function. It also helped to simplify the code a bit. I also noticed, that we were not updating visibility map for all_frozen from heap_xlog_multi_insert. Fixed. > > I wonder what to do about the heap_insert - I know there are concerns it > would negatively impact regular insert, but is it really true? I suppose > this optimization would be valuable even for cases where multi-insert is > not possible. > Do we have something like INSERT .. FREEZE? I only see TABLE_INSERT_FROZEN set for COPY FREEZE and for matview operations. Can you explain, what use-case are we trying to optimize by extending this patch to heap_insert()? The new version is attached. I've also fixed a typo in the comment by Tatsuo Ishii suggestion. Also, I tested this patch with replication and found no issues. -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
pgsql-hackers by date: