Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits - Mailing list pgsql-hackers

From Ibrar Ahmed
Subject Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits
Date
Msg-id CALtqXTfLBVf=y=LZ=z7tWELq5m+L4sm7z1txB1QkMgMC=koW4g@mail.gmail.com
Whole thread Raw
In response to Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits  (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
Responses Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits  (Hamid Akhtar <hamid.akhtar@gmail.com>)
Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers


On Mon, Aug 3, 2020 at 2:29 PM Anastasia Lubennikova <a.lubennikova@postgrespro.ru> wrote:
On 31.07.2020 23:28, Robert Haas wrote:
> On Tue, Jul 14, 2020 at 1:51 PM Anastasia Lubennikova
> <a.lubennikova@postgrespro.ru> wrote:
>> Questions from the first review pass:
>>
>> 1) Do we need XLH_INSERT_ALL_VISIBLE_SET ? IIUC, in the patch it is always
>> implied by XLH_INSERT_ALL_FROZEN_SET.
> I agree that it looks unnecessary to have two separate bits.
>
>> 2) What does this comment mean? Does XXX refers to the lsn comparison?
>> Since it
>> is definitely necessary to update the VM.
>>
>> +             * XXX: This seems entirely unnecessary?
>> +             *
>> +             * FIXME: Theoretically we should only do this after we've
>> +             * modified the heap - but it's safe to do it here I think,
>> +             * because this means that the page previously was empty.
>> +             */
>> +            if (lsn > PageGetLSN(vmpage))
>> +                visibilitymap_set(reln, blkno, InvalidBuffer, lsn,
>> vmbuffer,
>> +                                  InvalidTransactionId, vmbits);
> I wondered about that too. The comment which precedes it was, I
> believe, originally written by me, and copied here from
> heap_xlog_visible(). But it's not clear very good practice to just
> copy the comment like this. If the same logic applies, the code should
> say that we're doing the same thing here as in heap_xlog_visible() for
> consistency, or some such thing; after all, that's the primary place
> where that happens. But it looks like the XXX might have been added by
> a second person who thought that we didn't need this logic at all, and
> the FIXME by a third person who thought it was in the wrong place, so
> the whole thing is really confusing at this point.
>
> I'm pretty worried about this, too:
>
> +             * FIXME: setting recptr here is a dirty dirty hack, to prevent
> +             * visibilitymap_set() from WAL logging.
>
> That is indeed a dirty hack, and something needs to be done about it.
>
> I wonder if it was really all that smart to try to make the
> HEAP2_MULTI_INSERT do this instead of just issuing separate WAL
> records to mark it all-visible afterwards, but I don't see any reason
> why this can't be made to work. It needs substantially more polishing
> than it's had, though, I think.
>
New version of the patch is in the attachment.

New design is more conservative and simpler:
- pin the visibility map page in advance;
- set PageAllVisible;
- call visibilitymap_set() with its own XlogRecord, just like in other
places.

It allows to remove most of the "hacks" and keep code clean.
The patch passes tests added in previous versions.

I haven't tested performance yet, though. Maybe after tests, I'll bring
some optimizations back.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Here are some performance results with a patched and unpatched master branch. 
The table used for the test contains three columns (integer, text, varchar). 
The total number of rows is 10000000 in total. 

Unpatched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
COPY: 9069.432 ms vacuum; 2567.961ms
COPY: 9004.533 ms vacuum: 2553.075ms
COPY: 8832.422 ms vacuum: 2540.742ms

Patched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
COPY: 10031.723 ms vacuum: 127.524 ms
COPY: 9985.109  ms vacuum: 39.953 ms
COPY: 9283.373  ms vacuum: 37.137 ms

Time to take the copy slightly increased but the vacuum time significantly decrease.  

--
Ibrar Ahmed

pgsql-hackers by date:

Previous
From: mohand oubelkacem makhoukhene
Date:
Subject: Implement a new data type
Next
From: Tom Lane
Date:
Subject: Re: Implement a new data type