Thread: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Pavan Deolasee
Date:
Hi,

Jeff Janes raised an issue [1] about PD_ALL_VISIBLE not being set correctly while loading data via COPY FREEZE and had also posted a draft patch.

I now have what I think is a more complete patch. I took a slightly different approach and instead of setting PD_ALL_VISIBLE bit initially and then not clearing it during insertion, we now recheck the page for all-frozen, all-visible tuples just before switching to a new page. This allows us to then also mark set the visibility map bit, like we do in vacuumlazy.c

Some special treatment is required to handle the last page before bulk insert it shutdown. We could have chosen not to do anything special for the last page and let it remain unfrozen, but I thought it makes sense to take that extra effort so that we can completely freeze the table and set all VM bits at the end of COPY FREEZE.

Let me know what you think.

Thanks,
Pavan  


--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Attachment

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Kuntal Ghosh
Date:
Hello Pavan,

Thank you for the patch. It seems to me that while performing COPY
FREEZE, if we've copied tuples in a previously emptied page, we can
set the PageSetAllVisible(page) in heap_muli_insert only. Something
like,

bool init = (ItemPointerGetOffsetNumber(&(heaptuples[ndone]->t_self))
== FirstOffsetNumber &&
         PageGetMaxOffsetNumber(page) == FirstOffsetNumber + nthispage - 1);
if (init && (options & HEAP_INSERT_FROZEN))
 PageSetAllVisible(page);

Later, you can skip the pages for
CheckAndSetAllVisibleBulkInsertState() where PD_ALL_VISIBLE flag is
already set. Do you think it's correct?


On Thu, Feb 21, 2019 at 11:35 AM Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:
>
> Hi,
>
> Jeff Janes raised an issue [1] about PD_ALL_VISIBLE not being set correctly while loading data via COPY FREEZE and
hadalso posted a draft patch. 
>
> I now have what I think is a more complete patch. I took a slightly different approach and instead of setting
PD_ALL_VISIBLEbit initially and then not clearing it during insertion, we now recheck the page for all-frozen,
all-visibletuples just before switching to a new page. This allows us to then also mark set the visibility map bit,
likewe do in vacuumlazy.c 
>
> Some special treatment is required to handle the last page before bulk insert it shutdown. We could have chosen not
todo anything special for the last page and let it remain unfrozen, but I thought it makes sense to take that extra
effortso that we can completely freeze the table and set all VM bits at the end of COPY FREEZE. 
>
> Let me know what you think.
>
> Thanks,
> Pavan
>
> [1] https://www.postgresql.org/message-id/CAMkU%3D1w3osJJ2FneELhhNRLxfZitDgp9FPHee08NT2FQFmz_pQ%40mail.gmail.com
>
> --
>  Pavan Deolasee                   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services



--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Simon Riggs
Date:
On Thu, 21 Feb 2019 at 15:38, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote:
 
Thank you for the patch. It seems to me that while performing COPY
FREEZE, if we've copied tuples in a previously emptied page

There won't be any previously emptied pages because of the pre-conditions required for FREEZE. 

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Kuntal Ghosh
Date:
On Tue, Feb 26, 2019 at 6:46 PM Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On Thu, 21 Feb 2019 at 15:38, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote:
>
>>
>> Thank you for the patch. It seems to me that while performing COPY
>> FREEZE, if we've copied tuples in a previously emptied page
>
>
> There won't be any previously emptied pages because of the pre-conditions required for FREEZE.
>
Right, I missed that part. Thanks for pointing that out. But, this
optimization is still possible for copying frozen tuples in a newly
created page, right? If current backend allocates a new page, copies a
bunch of frozen tuples in that page, it can set the PD_ALL_VISIBLE in
the same operation.



-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Jeff Janes
Date:
On Thu, Feb 21, 2019 at 1:05 AM Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
Hi,

Jeff Janes raised an issue [1] about PD_ALL_VISIBLE not being set correctly while loading data via COPY FREEZE and had also posted a draft patch.

I now have what I think is a more complete patch. I took a slightly different approach and instead of setting PD_ALL_VISIBLE bit initially and then not clearing it during insertion, we now recheck the page for all-frozen, all-visible tuples just before switching to a new page. This allows us to then also mark set the visibility map bit, like we do in vacuumlazy.c

Some special treatment is required to handle the last page before bulk insert it shutdown. We could have chosen not to do anything special for the last page and let it remain unfrozen, but I thought it makes sense to take that extra effort so that we can completely freeze the table and set all VM bits at the end of COPY FREEZE.

Let me know what you think.

Hi Pavan, thanks for picking this up.

After doing a truncation and '\copy ... with (freeze)' of a table with long data, I find that the associated toast table has a handful of unfrozen blocks.  I don't know if that is an actual problem, but it does seem a bit odd, and thus suspicious.

perl -le 'print join "", map rand(), 1..500 foreach 1..1000000' > foo

create table foobar1 (x text);
begin;
truncate foobar1;
\copy foobar1 from foo with (freeze)
commit;
select all_visible,all_frozen,pd_all_visible, count(*) from pg_visibility('pg_toast.pg_toast_25085') group by 1,2,3;
 all_visible | all_frozen | pd_all_visible |  count  
-------------+------------+----------------+---------
 f           | f          | f              |      18
 t           | t          | t              | 530,361
(2 rows)

Cheers,

Jeff 

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Pavan Deolasee
Date:


On Wed, Feb 27, 2019 at 7:05 AM Jeff Janes <jeff.janes@gmail.com> wrote:


After doing a truncation and '\copy ... with (freeze)' of a table with long data, I find that the associated toast table has a handful of unfrozen blocks.  I don't know if that is an actual problem, but it does seem a bit odd, and thus suspicious.


Hi Jeff, thanks for looking at it and the test. I can reproduce the problem and quite curiously block number 1 and then every 32672th block is getting skipped.

postgres=# select * from pg_visibility('pg_toast.pg_toast_16384') where all_visible = 'f';
 blkno  | all_visible | all_frozen | pd_all_visible 
--------+-------------+------------+----------------
      1 | f           | f          | f
  32673 | f           | f          | f
  65345 | f           | f          | f
  98017 | f           | f          | f
 130689 | f           | f          | f
 163361 | f           | f          | f
 <snip>

Having investigated this a bit, I see that a relcache invalidation arrives after 1st and then after every 32672th block is filled. That clears the rel->rd_smgr field and we lose the information about the saved target block. The code then moves to extend the relation again and thus skips the previously less-than-half filled block, losing the free space in that block.

postgres=# SELECT * FROM page_header(get_raw_page('pg_toast.pg_toast_16384', 0));
    lsn     | checksum | flags | lower | upper | special | pagesize | version | prune_xid 
------------+----------+-------+-------+-------+---------+----------+---------+-----------
 1/15B37748 |        0 |     4 |    40 |    64 |    8192 |     8192 |       4 |         0
(1 row)

postgres=# SELECT * FROM page_header(get_raw_page('pg_toast.pg_toast_16384', 1));
    lsn     | checksum | flags | lower | upper | special | pagesize | version | prune_xid 
------------+----------+-------+-------+-------+---------+----------+---------+-----------
 1/15B39A28 |        0 |     4 |    28 |  7640 |    8192 |     8192 |       4 |         0
(1 row)

postgres=# SELECT * FROM page_header(get_raw_page('pg_toast.pg_toast_16384', 2));
    lsn     | checksum | flags | lower | upper | special | pagesize | version | prune_xid 
------------+----------+-------+-------+-------+---------+----------+---------+-----------
 1/15B3BE08 |        0 |     4 |    40 |    64 |    8192 |     8192 |       4 |         0
(1 row)

So the block 1 has a large amount of free space (upper - lower), which never gets filled.

I am not yet sure what causes the relcache invalidation at regular intervals. But if I have to guess, it could be because of a new VM (or FSM?) page getting allocated. I am bit puzzled because this issue seems to only occur with toast tables since I tested the patch while writing it on a regular table and did not see any block remaining unfrozen. I tested only upto 450 blocks, but that shouldn't matter because with your test, we see the problem with block 1 as well. So something to look into in more detail.

While we could potentially fix this by what you'd done in the original patch and what Kuntal also suggested, i.e. by setting the PD_ALL_VISIBLE bit during page initialisation itself, I am a bit circumspect about that approach for two reasons:

1. It requires us to then add extra logic to avoid clearing the bit during insertions
2. It requires us to also update the VM bit during page init or risk having divergent views on the page-level bit and the VM bit.

And even if we do that, this newly discovered problem of less-than-half filled intermediate blocks remain. I wonder if we should instead track the last used block in BulkInsertState and if the relcache invalidation flushes smgr, start inserting again from the last saved block. In fact, we already track the last used buffer in BulkInsertState and that's enough to know the last used block.

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Masahiko Sawada
Date:
On Thu, Feb 21, 2019 at 3:05 PM Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>
> Hi,
>
> Jeff Janes raised an issue [1] about PD_ALL_VISIBLE not being set correctly while loading data via COPY FREEZE and
hadalso posted a draft patch. 
>
> I now have what I think is a more complete patch. I took a slightly different approach and instead of setting
PD_ALL_VISIBLEbit initially and then not clearing it during insertion, we now recheck the page for all-frozen,
all-visibletuples just before switching to a new page. This allows us to then also mark set the visibility map bit,
likewe do in vacuumlazy.c 

I might be missing something but why do we need to recheck whether
each pages is all-frozen after insertion? I wonder if we can set
all-frozen without checking all tuples again in this case.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Pavan Deolasee
Date:

On Mon, Mar 11, 2019 at 1:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:


I might be missing something but why do we need to recheck whether
each pages is all-frozen after insertion? I wonder if we can set
all-frozen without checking all tuples again in this case.

It's possible that the user may have inserted unfrozen rows (via regular INSERT or COPY without FREEZE option) before inserting frozen rows. So we can't set the all-visible/all-frozen flag unconditionally. I also find it safer to do an explicit check to ensure we never accidentally mark a page as all-frozen. Since the check is performed immediately after the page becomes full and only once per page, there shouldn't be any additional IO cost and the check should be quite fast. 

Thanks,
Pavan
--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Masahiko Sawada
Date:
On Tue, Mar 12, 2019 at 4:54 PM Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>
>
> On Mon, Mar 11, 2019 at 1:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>>
>>
>> I might be missing something but why do we need to recheck whether
>> each pages is all-frozen after insertion? I wonder if we can set
>> all-frozen without checking all tuples again in this case.
>
>
> It's possible that the user may have inserted unfrozen rows (via regular INSERT or COPY without FREEZE option) before
insertingfrozen rows. 

I think that since COPY FREEZE can be executed only when the table is
created or truncated within the transaction other users cannot insert
any rows during COPY FREEZE.

> So we can't set the all-visible/all-frozen flag unconditionally. I also find it safer to do an explicit check to
ensurewe never accidentally mark a page as all-frozen. Since the check is performed immediately after the page becomes
fulland only once per page, there shouldn't be any additional IO cost and the check should be quite fast. 

I'd suggest to measure performance overhead. I can imagine one use
case of COPY FREEZE is the loading a very large table. Since in
addition to set visibility map bits this patch could scan a very large
table I'm concerned that how much performance is degraded by this
patch.
Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Pavan Deolasee
Date:


On Wed, Mar 13, 2019 at 11:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:


I think that since COPY FREEZE can be executed only when the table is
created or truncated within the transaction other users cannot insert
any rows during COPY FREEZE.


Right. But the truncating transaction can insert unfrozen rows into the table before inserting more rows via COPY FREEZE.

postgres=# CREATE EXTENSION pageinspect ;
CREATE EXTENSION
postgres=# BEGIN;
BEGIN
postgres=# TRUNCATE testtab ;
TRUNCATE TABLE
postgres=# INSERT INTO testtab VALUES (100, 200);
INSERT 0 1
postgres=# COPY testtab FROM STDIN WITH (FREEZE);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1 2
>> 2 3
>> \.
COPY 2
postgres=# COMMIT;

postgres=# SELECT lp, to_hex(t_infomask) FROM heap_page_items(get_raw_page('testtab', 0));
 lp | to_hex 
----+--------
  1 | 800
  2 | b00
  3 | b00
(3 rows)

The first row in inserted by regular insert and it's not frozen. The next 2 are frozen. We can't mark such as page all-visible, all-frozen.
 

I'd suggest to measure performance overhead. I can imagine one use
case of COPY FREEZE is the loading a very large table. Since in
addition to set visibility map bits this patch could scan a very large
table I'm concerned that how much performance is degraded by this
patch.

Ok. I will run some tests. But please note that this patch is a bug fix to address the performance issue that is caused by having to rewrite the entire table when all-visible bit is set on the page during first vacuum. So while we may do some more work during COPY FREEZE, we're saving a lot of page writes during next vacuum. Also, since the scan that we are doing in this patch is done on a page that should be in the buffer cache, we will pay a bit in terms of CPU cost, but not anything in terms of IO cost.

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Masahiko Sawada
Date:
On Thu, Mar 14, 2019 at 5:17 PM Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>
>
>
> On Wed, Mar 13, 2019 at 11:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>>
>>
>> I think that since COPY FREEZE can be executed only when the table is
>> created or truncated within the transaction other users cannot insert
>> any rows during COPY FREEZE.
>>
>
> Right. But the truncating transaction can insert unfrozen rows into the table before inserting more rows via COPY
FREEZE.
>
> postgres=# CREATE EXTENSION pageinspect ;
> CREATE EXTENSION
> postgres=# BEGIN;
> BEGIN
> postgres=# TRUNCATE testtab ;
> TRUNCATE TABLE
> postgres=# INSERT INTO testtab VALUES (100, 200);
> INSERT 0 1
> postgres=# COPY testtab FROM STDIN WITH (FREEZE);
> Enter data to be copied followed by a newline.
> End with a backslash and a period on a line by itself, or an EOF signal.
> >> 1 2
> >> 2 3
> >> \.
> COPY 2
> postgres=# COMMIT;
>
> postgres=# SELECT lp, to_hex(t_infomask) FROM heap_page_items(get_raw_page('testtab', 0));
>  lp | to_hex
> ----+--------
>   1 | 800
>   2 | b00
>   3 | b00
> (3 rows)
>
> The first row in inserted by regular insert and it's not frozen. The next 2 are frozen. We can't mark such as page
all-visible,all-frozen. 

Understood. Thank you for explanation!

>
>>
>>
>> I'd suggest to measure performance overhead. I can imagine one use
>> case of COPY FREEZE is the loading a very large table. Since in
>> addition to set visibility map bits this patch could scan a very large
>> table I'm concerned that how much performance is degraded by this
>> patch.
>
>
> Ok. I will run some tests. But please note that this patch is a bug fix to address the performance issue that is
causedby having to rewrite the entire table when all-visible bit is set on the page during first vacuum. So while we
maydo some more work during COPY FREEZE, we're saving a lot of page writes during next vacuum. Also, since the scan
thatwe are doing in this patch is done on a page that should be in the buffer cache, we will pay a bit in terms of CPU
cost,but not anything in terms of IO cost. 

Agreed. I had been misunderstanding this patch. The page scan during
COPY FREEZE is necessary and it's very cheaper than doing in the first
vacuum.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
David Steele
Date:
Hi Pavan,

On 3/14/19 2:20 PM, Masahiko Sawada wrote:
> On Thu, Mar 14, 2019 at 5:17 PM Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
>>
>> Ok. I will run some tests. But please note that this patch is a bug fix to address the performance issue that is
causedby having to rewrite the entire table when all-visible bit is set on the page during first vacuum. So while we
maydo some more work during COPY FREEZE, we're saving a lot of page writes during next vacuum. Also, since the scan
thatwe are doing in this patch is done on a page that should be in the buffer cache, we will pay a bit in terms of CPU
cost,but not anything in terms of IO cost.
 
> 
> Agreed. I had been misunderstanding this patch. The page scan during
> COPY FREEZE is necessary and it's very cheaper than doing in the first
> vacuum.

I have removed Ibrar as a reviewer since there has been no review from 
them in three weeks, and too encourage others to have a look.

Regards,
-- 
-David
david@pgmasters.net


Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Pavan Deolasee
Date:


On Thu, Mar 14, 2019 at 3:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

>
>
> Ok. I will run some tests. But please note that this patch is a bug fix to address the performance issue that is caused by having to rewrite the entire table when all-visible bit is set on the page during first vacuum. So while we may do some more work during COPY FREEZE, we're saving a lot of page writes during next vacuum. Also, since the scan that we are doing in this patch is done on a page that should be in the buffer cache, we will pay a bit in terms of CPU cost, but not anything in terms of IO cost.

Agreed. I had been misunderstanding this patch. The page scan during
COPY FREEZE is necessary and it's very cheaper than doing in the first
vacuum.

Thanks for agreeing to the need of this bug fix. I ran some simple tests anyways and here are the results.

The test consists of a simple table with three columns, two integers and one char(100). I then ran COPY (FREEZE), loading 7M rows, followed by a VACUUM. The total size of the raw data is about 800MB and the table size in Postgres is just under 1GB. The results for 3 runs in milliseconds are:

Master:
 COPY FREEZE: 40243.725   40309.675    40783.836
 VACUUM: 2685.871  2517.445    2508.452 

Patched:
 COPY FREEZE: 40942.410  40495.303   40638.075
 VACUUM: 25.067  35.793   25.390

So there is a slight increase in the time to run COPY FREEZE, but a significant reduction in time to VACUUM the table. The benefits will only go up if the table is vacuumed much  later when most of the pages are already written to the disk and removed from shared buffers and/or kernel cache.

I hope this satisfies your doubts regarding performance implications of the patch.

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Darafei Praliaskouski
Date:
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:       not tested
Spec compliant:           not tested
Documentation:            not tested

This patch is particularly helpful in processing OpenStreetMap Data in PostGIS.
OpenStreetMap is imported as a stream of 300-900 (depending on settings) gigabytes, that are needing a VACUUM after a
COPYFREEZE.
 
With this patch, the first and usually the last transforming query is performed much faster after initial load.

I have read this patch and have no outstanding comments on it.
Pavan Deolasee demonstrates the expected speed improvement.

The new status of this patch is: Ready for Committer

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Masahiko Sawada
Date:
On Thu, Mar 21, 2019 at 11:27 PM Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:
>
>
>
> On Thu, Mar 14, 2019 at 3:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>>
>> >
>> >
>> > Ok. I will run some tests. But please note that this patch is a bug fix to address the performance issue that is
causedby having to rewrite the entire table when all-visible bit is set on the page during first vacuum. So while we
maydo some more work during COPY FREEZE, we're saving a lot of page writes during next vacuum. Also, since the scan
thatwe are doing in this patch is done on a page that should be in the buffer cache, we will pay a bit in terms of CPU
cost,but not anything in terms of IO cost. 
>>
>> Agreed. I had been misunderstanding this patch. The page scan during
>> COPY FREEZE is necessary and it's very cheaper than doing in the first
>> vacuum.
>
>
> Thanks for agreeing to the need of this bug fix. I ran some simple tests anyways and here are the results.
>
> The test consists of a simple table with three columns, two integers and one char(100). I then ran COPY (FREEZE),
loading7M rows, followed by a VACUUM. The total size of the raw data is about 800MB and the table size in Postgres is
justunder 1GB. The results for 3 runs in milliseconds are: 
>
> Master:
>  COPY FREEZE: 40243.725   40309.675    40783.836
>  VACUUM: 2685.871  2517.445    2508.452
>
> Patched:
>  COPY FREEZE: 40942.410  40495.303   40638.075
>  VACUUM: 25.067  35.793   25.390
>
> So there is a slight increase in the time to run COPY FREEZE, but a significant reduction in time to VACUUM the
table.The benefits will only go up if the table is vacuumed much  later when most of the pages are already written to
thedisk and removed from shared buffers and/or kernel cache. 
>
> I hope this satisfies your doubts regarding performance implications of the patch.

Thank you for the performance testing, that's a great improvement!

I've looked at the patch and have comments and questions.

+    /*
+     * While we are holding the lock on the page, check if all tuples
+     * in the page are marked frozen at insertion. We can safely mark
+     * such page all-visible and set visibility map bits too.
+     */
+    if (CheckPageIsAllFrozen(relation, buffer))
+        PageSetAllVisible(page);
+
+    MarkBufferDirty(buffer);

Maybe we don't need to mark the buffer dirty if the page is not set all-visible.

-----
+    if (PageIsAllVisible(page))
+    {
+        uint8       vm_status = visibilitymap_get_status(relation,
+                targetBlock, &myvmbuffer);
+        uint8       flags = 0;
+
+        /* Set the VM all-frozen bit to flag, if needed */
+        if ((vm_status & VISIBILITYMAP_ALL_VISIBLE) == 0)
+            flags |= VISIBILITYMAP_ALL_VISIBLE;
+        if ((vm_status & VISIBILITYMAP_ALL_FROZEN) == 0)
+            flags |= VISIBILITYMAP_ALL_FROZEN;
+
+        Assert(BufferIsValid(myvmbuffer));
+        if (flags != 0)
+            visibilitymap_set(relation, targetBlock, buffer, InvalidXLogRecPtr,
+                    myvmbuffer, InvalidTransactionId, flags);

Since CheckPageIsAllFrozen() is used only when inserting frozen tuples
CheckAndSetPageAllVisible() seems to be implemented for the same
situation. If we have CheckAndSetPageAllVisible() for only this
situation we would rather need to check that the VM status of the page
should be 0 and then set two flags to the page? The 'flags' will
always be (VISIBILITYMAP_ALL_FROZEN | VISIBILITYMAP_ALL_VISIBLE) in
copy freeze case. I'm confused that this function has both code that
assumes some special situations and code that can be used in generic
situations.

-----
Perhaps we can add some tests for this feature to pg_visibility module.


Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Pavan Deolasee
Date:

On Fri, Mar 22, 2019 at 12:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
I've looked at the patch and have comments and questions.

+    /*
+     * While we are holding the lock on the page, check if all tuples
+     * in the page are marked frozen at insertion. We can safely mark
+     * such page all-visible and set visibility map bits too.
+     */
+    if (CheckPageIsAllFrozen(relation, buffer))
+        PageSetAllVisible(page);
+
+    MarkBufferDirty(buffer);

Maybe we don't need to mark the buffer dirty if the page is not set all-visible.


Yeah, makes sense. Fixed.
 
 If we have CheckAndSetPageAllVisible() for only this
situation we would rather need to check that the VM status of the page
should be 0 and then set two flags to the page? The 'flags' will
always be (VISIBILITYMAP_ALL_FROZEN | VISIBILITYMAP_ALL_VISIBLE) in
copy freeze case. I'm confused that this function has both code that
assumes some special situations and code that can be used in generic
situations.

If a second COPY FREEZE is run within the same transaction and if starts inserting into the page used by the previous COPY FREEZE, then the page will already be marked all-visible/all-frozen. So we can skip repeating the operation again. While it's quite unlikely that someone will do that and I can't think of a situation where only one of those flags will be set, I don't see a harm in keeping the code as is. This code is borrowed from vacuumlazy.c and at some point we can even move it to some common location.


Perhaps we can add some tests for this feature to pg_visibility module.


That's a good idea. Please see if the tests included in the attached patch are enough.

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Attachment

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Pavan Deolasee
Date:


On Wed, Mar 27, 2019 at 9:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

The patch looks good to me. There is no comment from me.


Thanks for your review! Updated patch attached since patch failed to apply after recent changes in the master.

Thanks,
Pavan
 
--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Attachment

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Tomas Vondra
Date:
Hi,

I've been looking at this patch for a while, and it seems pretty much RFC,
so barring objections I'll take care of that once I do a bit more testing
and review. Unless someone else wants to take care of that.

FWIW I wonder if we should add the code for partitioned tables to
CopyFrom, considering that's unsupported and so can't be tested etc. It's
not a huge amount of code, of course.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Andres Freund
Date:
Hi,

On 2019-04-03 10:19:17 +0530, Pavan Deolasee wrote:
> diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
> index c1fd7b78ce..09df70a3ac 100644
> --- a/src/backend/commands/copy.c
> +++ b/src/backend/commands/copy.c
> @@ -2833,6 +2833,15 @@ CopyFrom(CopyState cstate)
>                      !has_instead_insert_row_trig &&
>                      resultRelInfo->ri_FdwRoutine == NULL;
>  
> +                /*
> +                 * Note: As of PG12, COPY FREEZE is not supported on
> +                 * partitioned table. Nevertheless have this check in place so
> +                 * that we do the right thing if it ever gets supported.
> +                 */
> +                if (ti_options & TABLE_INSERT_FROZEN)
> +                    CheckAndSetAllVisibleBulkInsertState(resultRelInfo->ri_RelationDesc,
> +                            bistate);
> +
>                  /*
>                   * We'd better make the bulk insert mechanism gets a new
>                   * buffer when the partition being inserted into changes.
> @@ -3062,6 +3071,15 @@ CopyFrom(CopyState cstate)
>                                  firstBufferedLineNo);
>      }
>  
> +    /*
> +     * If we are inserting frozen tuples, check if the last page used can also
> +     * be marked as all-visible and all-frozen. This ensures that a table can
> +     * be fully frozen when the data is loaded.
> +     */
> +    if (ti_options & TABLE_INSERT_FROZEN)
> +        CheckAndSetAllVisibleBulkInsertState(resultRelInfo->ri_RelationDesc,
> +                bistate);
> +
>      /* Done, clean up */
>      error_context_stack = errcallback.previous;

I'm totally not OK with this from a layering
POV. CheckAndSetAllVisibleBulkInsertState is entirely heap specific
(without being named such), whereas all the heap specific bits are
getting moved below tableam.

Greetings,

Andres Freund



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Alvaro Herrera
Date:
On 2019-Apr-04, Andres Freund wrote:

> I'm totally not OK with this from a layering
> POV. CheckAndSetAllVisibleBulkInsertState is entirely heap specific
> (without being named such), whereas all the heap specific bits are
> getting moved below tableam.

This is a fair complaint, but on the other hand the COPY changes for
table AM are still being developed, so there's no ground on which to
rebase this patch.  Do you have a timeline on getting the COPY one
committed?

I think it's fair to ask the RMT for an exceptional extension of a
couple of working days for this patch.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Andres Freund
Date:
Hi,

On 2019-04-04 16:15:54 -0300, Alvaro Herrera wrote:
> On 2019-Apr-04, Andres Freund wrote:
> 
> > I'm totally not OK with this from a layering
> > POV. CheckAndSetAllVisibleBulkInsertState is entirely heap specific
> > (without being named such), whereas all the heap specific bits are
> > getting moved below tableam.
> 
> This is a fair complaint, but on the other hand the COPY changes for
> table AM are still being developed, so there's no ground on which to
> rebase this patch.  Do you have a timeline on getting the COPY one
> committed?

~2h. Just pondering the naming of some functions etc.  Don't think
there's a large interdependency though.

But even if tableam weren't committed, I'd still argue that it's
structurally done wrong in the patch right now.  FWIW, I actually think
this whole approach isn't quite right - this shouldn't be done as a
secondary action after we'd already inserted, with a separate
lock-unlock cycle etc.

Also, how is this code even close to correct?
CheckAndSetPageAllVisible() modifies the buffer in a crucial way, and
there's no WAL logging? Without even a comment arguing why that's OK (I
don't think it is)?

Greetings,

Andres Freund



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Andres Freund
Date:
Hi,

On 2019-04-04 12:23:08 -0700, Andres Freund wrote:
> Also, how is this code even close to correct?
> CheckAndSetPageAllVisible() modifies the buffer in a crucial way, and
> there's no WAL logging? Without even a comment arguing why that's OK (I
> don't think it is)?

Peter Geoghegan just reminded me over IM that there's actually logging
inside log_heap_visible(), called from visibilitymap_set(). Still lacks
a critical section though.

I still think this is the wrong architecture.

Greetings,

Andres Freund

PS: We're going to have to revamp visibilitymap_set() soon-ish - the
fact that it directly calls heap routines inside is bad, means that
additional AMs e.g. zheap has to reimplement that routine.



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Alvaro Herrera
Date:
On 2019-Apr-04, Andres Freund wrote:

> On 2019-04-04 12:23:08 -0700, Andres Freund wrote:
> > Also, how is this code even close to correct?
> > CheckAndSetPageAllVisible() modifies the buffer in a crucial way, and
> > there's no WAL logging? Without even a comment arguing why that's OK (I
> > don't think it is)?
> 
> Peter Geoghegan just reminded me over IM that there's actually logging
> inside log_heap_visible(), called from visibilitymap_set(). Still lacks
> a critical section though.

Hmm, isn't there already a critical section in visibilitymap_set itself?

> I still think this is the wrong architecture.

Hmm.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Andres Freund
Date:
Hi,

On 2019-04-05 00:06:04 -0300, Alvaro Herrera wrote:
> On 2019-Apr-04, Andres Freund wrote:
> 
> > On 2019-04-04 12:23:08 -0700, Andres Freund wrote:
> > > Also, how is this code even close to correct?
> > > CheckAndSetPageAllVisible() modifies the buffer in a crucial way, and
> > > there's no WAL logging? Without even a comment arguing why that's OK (I
> > > don't think it is)?
> > 
> > Peter Geoghegan just reminded me over IM that there's actually logging
> > inside log_heap_visible(), called from visibilitymap_set(). Still lacks
> > a critical section though.
> 
> Hmm, isn't there already a critical section in visibilitymap_set itself?

There is, but the proposed code sets all visible on the page, and marks
the buffer dirty, before calling visibilitymap_set.

Greetings,

Andres Freund



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Andres Freund
Date:
Hi,

On 2019-04-05 00:06:04 -0300, Alvaro Herrera wrote:
> On 2019-Apr-04, Andres Freund wrote:
> > I still think this is the wrong architecture.
> 
> Hmm.

I think the right approach would be to do all of this in heap_insert and
heap_multi_insert. Whenever starting to work on a page, if INSERT_FROZEN
is specified, remember whether it is either currently empty, or is
already marked as all-visible. If previously empty, mark it as all
visible at the end. If already all visible, there's no need to change
that. If not yet all-visible, no need to do anything, since it can't
have been inserted with COPY FREEZE.  Do you see any problem doing it
that way?

Greetings,

Andres Freund



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Pavan Deolasee
Date:
Hi,

On Fri, Apr 5, 2019 at 9:05 AM Andres Freund <andres@anarazel.de> wrote:


I think the right approach would be to do all of this in heap_insert and
heap_multi_insert. Whenever starting to work on a page, if INSERT_FROZEN
is specified, remember whether it is either currently empty, or is
already marked as all-visible. If previously empty, mark it as all
visible at the end. If already all visible, there's no need to change
that. If not yet all-visible, no need to do anything, since it can't
have been inserted with COPY FREEZE. 

We're doing roughly the same. If we are running INSERT_FROZEN, whenever we're about to switch to a new page, we check if the previous page should be marked all-frozen and do it that way. The special code in copy.c is necessary to take care of the last page which we don't get to handle in the regular code path.

Or are you suggesting that we don't even rescan the page for all-frozen tuples at the end and just simply mark it all-frozen at the start, when the first tuple is inserted and then don't touch the PD_ALL_VISIBLE/visibility map bit as we go on inserting more tuples in the page?

Anyways, if major architectural changes are required then it's probably too late to consider this for PG12, even though it's more of a bug fix and a candidate for back-patching too.

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Pavan Deolasee
Date:


On Fri, Apr 5, 2019 at 8:37 AM Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2019-04-05 00:06:04 -0300, Alvaro Herrera wrote:

>
> Hmm, isn't there already a critical section in visibilitymap_set itself?

There is, but the proposed code sets all visible on the page, and marks
the buffer dirty, before calling visibilitymap_set.

How's it any different than what we're doing at vacuumlazy.c:1322? We set the page-level bit, mark the buffer dirty and then call visibilitymap_set(), all outside a critical section.

1300         /* mark page all-visible, if appropriate */
1301         if (all_visible && !all_visible_according_to_vm)
1302         {
1303             uint8       flags = VISIBILITYMAP_ALL_VISIBLE;
1304 
1305             if (all_frozen)
1306                 flags |= VISIBILITYMAP_ALL_FROZEN;
1307 
1308             /*
1309              * It should never be the case that the visibility map page is set
1310              * while the page-level bit is clear, but the reverse is allowed
1311              * (if checksums are not enabled).  Regardless, set the both bits
1312              * so that we get back in sync.
1313              *
1314              * NB: If the heap page is all-visible but the VM bit is not set,
1315              * we don't need to dirty the heap page.  However, if checksums
1316              * are enabled, we do need to make sure that the heap page is
1317              * dirtied before passing it to visibilitymap_set(), because it
1318              * may be logged.  Given that this situation should only happen in
1319              * rare cases after a crash, it is not worth optimizing.
1320              */
1321             PageSetAllVisible(page);
1322             MarkBufferDirty(buf);
1323             visibilitymap_set(onerel, blkno, buf, InvalidXLogRecPtr,
1324                               vmbuffer, visibility_cutoff_xid, flags);
1325         } 

As the first para in that comment says, I thought it's ok for page-level bit to be set but the visibility bit to be clear, but not the vice versa. The proposed code does not introduce any  new behaviour AFAICS. But I might be missing something.

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> I think the right approach would be to do all of this in heap_insert and
> heap_multi_insert. Whenever starting to work on a page, if INSERT_FROZEN
> is specified, remember whether it is either currently empty, or is
> already marked as all-visible. If previously empty, mark it as all
> visible at the end. If already all visible, there's no need to change
> that. If not yet all-visible, no need to do anything, since it can't
> have been inserted with COPY FREEZE.  Do you see any problem doing it
> that way?

Do we want to add overhead to these hot-spot routines for this purpose?

            regards, tom lane



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Andres Freund
Date:
Hi,

On 2019-04-05 09:20:36 +0530, Pavan Deolasee wrote:
> On Fri, Apr 5, 2019 at 9:05 AM Andres Freund <andres@anarazel.de> wrote:
> > I think the right approach would be to do all of this in heap_insert and
> > heap_multi_insert. Whenever starting to work on a page, if INSERT_FROZEN
> > is specified, remember whether it is either currently empty, or is
> > already marked as all-visible. If previously empty, mark it as all
> > visible at the end. If already all visible, there's no need to change
> > that. If not yet all-visible, no need to do anything, since it can't
> > have been inserted with COPY FREEZE.
> 
> 
> We're doing roughly the same. If we are running INSERT_FROZEN, whenever
> we're about to switch to a new page, we check if the previous page should
> be marked all-frozen and do it that way. The special code in copy.c is
> necessary to take care of the last page which we don't get to handle in the
> regular code path.

Well, it's not the same, because you need extra code from copy.c, extra
lock cycles, and extra WAL logging.


> Or are you suggesting that we don't even rescan the page for all-frozen
> tuples at the end and just simply mark it all-frozen at the start, when the
> first tuple is inserted and then don't touch the PD_ALL_VISIBLE/visibility
> map bit as we go on inserting more tuples in the page?

Correct. If done right that should be cheaper (no extra scans, less WAL
logging), without requiring some new dispatch logic from copy.c.


> Anyways, if major architectural changes are required then it's probably too
> late to consider this for PG12, even though it's more of a bug fix and a
> candidate for back-patching too.

Let's just see how bad it looks? I don't feel like we'd need to be super
strict about it. If it looks simple enough I'd e.g. be ok to merge this
soon after freeze, and backpatch around maybe 12.1 or such.

Greetings,

Andres Freund



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Andres Freund
Date:
Hi,

On 2019-04-04 23:57:58 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I think the right approach would be to do all of this in heap_insert and
> > heap_multi_insert. Whenever starting to work on a page, if INSERT_FROZEN
> > is specified, remember whether it is either currently empty, or is
> > already marked as all-visible. If previously empty, mark it as all
> > visible at the end. If already all visible, there's no need to change
> > that. If not yet all-visible, no need to do anything, since it can't
> > have been inserted with COPY FREEZE.  Do you see any problem doing it
> > that way?
> 
> Do we want to add overhead to these hot-spot routines for this purpose?

For heap_multi_insert I can't see it being a problem - it's only used
from copy.c, and the cost should be "smeared" over many tuples.  I'd
assume that compared with locking a page, WAL logging, etc, it'd not
even meaningfully show up for heap_insert. Especially because we already
have codepaths for options & HEAP_INSERT_FROZEN in
heap_prepare_insert(), and I'd assume those could be combined.

I think we should measure it, but I don't think that one or two
additional, very well predictd, branches are going to be measurable in
in those routines.

The patch, as implemented, has modifications in
RelationGetBufferForTuple(), that seems like they'd be more expensive.

Greetings,

Andres Freund



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Darafei "Komяpa" Praliaskouski
Date:


On Fri, Apr 5, 2019 at 6:58 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@anarazel.de> writes:
> I think the right approach would be to do all of this in heap_insert and
> heap_multi_insert. Whenever starting to work on a page, if INSERT_FROZEN
> is specified, remember whether it is either currently empty, or is
> already marked as all-visible. If previously empty, mark it as all
> visible at the end. If already all visible, there's no need to change
> that. If not yet all-visible, no need to do anything, since it can't
> have been inserted with COPY FREEZE.  Do you see any problem doing it
> that way?

Do we want to add overhead to these hot-spot routines for this purpose?

Sizing the overhead: workflows right now don't end with COPY FREEZE - you need another VACUUM to set maps. 
Anything that lets you skip that VACUUM (and faster than that VACUUM itself) is helping. You specifically asked for it to be skippable with FREEZE anyway.


--
Darafei Praliaskouski

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Andres Freund
Date:
Hi,

On 2019-04-05 08:38:34 +0300, Darafei "Komяpa" Praliaskouski wrote:
> On Fri, Apr 5, 2019 at 6:58 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> > Andres Freund <andres@anarazel.de> writes:
> > > I think the right approach would be to do all of this in heap_insert and
> > > heap_multi_insert. Whenever starting to work on a page, if INSERT_FROZEN
> > > is specified, remember whether it is either currently empty, or is
> > > already marked as all-visible. If previously empty, mark it as all
> > > visible at the end. If already all visible, there's no need to change
> > > that. If not yet all-visible, no need to do anything, since it can't
> > > have been inserted with COPY FREEZE.  Do you see any problem doing it
> > > that way?
> >
> > Do we want to add overhead to these hot-spot routines for this purpose?
> >
> 
> Sizing the overhead: workflows right now don't end with COPY FREEZE - you
> need another VACUUM to set maps.
> Anything that lets you skip that VACUUM (and faster than that VACUUM
> itself) is helping. You specifically asked for it to be skippable with
> FREEZE anyway.

Tom's point was that the routines I was suggesting to adapt above aren't
just used for COPY FREEZE.

Greetings,

Andres Freund



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Andres Freund
Date:
Hi,

On 2019-04-04 21:04:49 -0700, Andres Freund wrote:
> On 2019-04-04 23:57:58 -0400, Tom Lane wrote:
> > Andres Freund <andres@anarazel.de> writes:
> > > I think the right approach would be to do all of this in heap_insert and
> > > heap_multi_insert. Whenever starting to work on a page, if INSERT_FROZEN
> > > is specified, remember whether it is either currently empty, or is
> > > already marked as all-visible. If previously empty, mark it as all
> > > visible at the end. If already all visible, there's no need to change
> > > that. If not yet all-visible, no need to do anything, since it can't
> > > have been inserted with COPY FREEZE.  Do you see any problem doing it
> > > that way?
> > 
> > Do we want to add overhead to these hot-spot routines for this purpose?
> 
> For heap_multi_insert I can't see it being a problem - it's only used
> from copy.c, and the cost should be "smeared" over many tuples.  I'd
> assume that compared with locking a page, WAL logging, etc, it'd not
> even meaningfully show up for heap_insert. Especially because we already
> have codepaths for options & HEAP_INSERT_FROZEN in
> heap_prepare_insert(), and I'd assume those could be combined.
> 
> I think we should measure it, but I don't think that one or two
> additional, very well predictd, branches are going to be measurable in
> in those routines.
> 
> The patch, as implemented, has modifications in
> RelationGetBufferForTuple(), that seems like they'd be more expensive.

Here's a *prototype* patch for this.  It only implements what I
described for heap_multi_insert, not for plain inserts. I wanted to see
what others think before investing additional time.

I don't think it's possible to see the overhead of the changed code in
heap_multi_insert(), and probably - with less confidence - that it's
also going to be ok for heap_insert(). But we gotta measure that.


This avoids an extra WAL record for setting empty pages to all visible,
by adding XLH_INSERT_ALL_VISIBLE_SET & XLH_INSERT_ALL_FROZEN_SET, and
setting those when appropriate in heap_multi_insert.  Unfortunately
currently visibilitymap_set() doesn't really properly allow to do this,
as it has embedded WAL logging for heap.

I think we should remove the WAL logging from visibilitymap_set(), and
move it to a separate, heap specific, function. Right now different
tableams e.g. would have to reimplement visibilitymap_set(), so that's a
second need to separate that functionality.  Let me try to come up with
a proposal.


The patch currently does a vmbuffer_pin() while holding an exclusive
lwlock on the page. That's something we normally try to avoid - but I
think it's probably OK here, because INSERT_FROZEN can only be used to
insert into a new relfilenode (which thus no other session can see). I
think it's preferrable to have this logic in specific to the
INSERT_FROZEN path, rather than adding nontrivial complications to
RelationGetBufferForTuple().

I noticed that, before this patch, we do a
    if (vmbuffer != InvalidBuffer)
        ReleaseBuffer(vmbuffer);
after every filled page - that doesn't strike me as particularly smart -
it's pretty likely that the next heap page to be filled is going to be
on the same vm page as the previous iteration.


I noticed one small oddity that I think is common to all the approaches
presented in this thread so far. After

BEGIN;
TRUNCATE foo;
COPY foo(i) FROM '/tmp/foo' WITH FREEZE;
COPY foo(i) FROM '/tmp/foo' WITH FREEZE;
COPY foo(i) FROM '/tmp/foo' WITH FREEZE;
COMMIT;

we currently end up with pages like:
┌───────┬───────────┬──────────┬───────┬───────┬───────┬─────────┬──────────┬─────────┬───────────┐
│ blkno │    lsn    │ checksum │ flags │ lower │ upper │ special │ pagesize │ version │ prune_xid │
├───────┼───────────┼──────────┼───────┼───────┼───────┼─────────┼──────────┼─────────┼───────────┤
│     0 │ 0/50B5488 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│     1 │ 0/50B6360 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│     2 │ 0/50B71B8 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│     3 │ 0/50B8028 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│     4 │ 0/50B8660 │        0 │     4 │   408 │  5120 │    8192 │     8192 │       4 │         0 │
│     5 │ 0/50B94B8 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│     6 │ 0/50BA328 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│     7 │ 0/50BB180 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│     8 │ 0/50BBFD8 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│     9 │ 0/50BCF88 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│    10 │ 0/50BDDE0 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│    11 │ 0/50BEC50 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│    12 │ 0/50BFAA8 │        0 │     4 │   928 │   960 │    8192 │     8192 │       4 │         0 │
│    13 │ 0/50C06F8 │        0 │     4 │   792 │  2048 │    8192 │     8192 │       4 │         0 │
└───────┴───────────┴──────────┴───────┴───────┴───────┴─────────┴──────────┴─────────┴───────────┘
(14 rows)

Note how block 4 has more space available. That's because the
visibilitymap_pin() called in the first COPY has to vm_extend(), which
in turn does:

    /*
     * Send a shared-inval message to force other backends to close any smgr
     * references they may have for this rel, which we are about to change.
     * This is a useful optimization because it means that backends don't have
     * to keep checking for creation or extension of the file, which happens
     * infrequently.
     */
    CacheInvalidateSmgr(rel->rd_smgr->smgr_rnode);

which invalidates ->rd_smgr->smgr_targblock *after* the first COPY,
because that's when the pending smgr invalidations are sent out.  That's
far from great, but it doesn't seem to be this patch's fault.

It seems to me we need a separate invalidation that doesn't close the
whole smgr relation, but just invalidates the VM specific fields.

Greetings,

Andres Freund

Attachment

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Andres Freund
Date:
Hi,

On 2019-04-07 18:04:27 -0700, Andres Freund wrote:
> Here's a *prototype* patch for this.  It only implements what I
> described for heap_multi_insert, not for plain inserts. I wanted to see
> what others think before investing additional time.

Pavan, are you planning to work on this for v13 CF1? Or have you lost
interest on the topic?

- Andres



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Pavan Deolasee
Date:


On Tue, 28 May 2019 at 4:36 PM, Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2019-04-07 18:04:27 -0700, Andres Freund wrote:
> Here's a *prototype* patch for this.  It only implements what I
> described for heap_multi_insert, not for plain inserts. I wanted to see
> what others think before investing additional time.

Pavan, are you planning to work on this for v13 CF1? Or have you lost
interest on the topic?

Yes, I plan to work on it.

Thanks,
Pavan
--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Pavan Deolasee
Date:
Hi Andres,

On Wed, May 29, 2019 at 1:50 PM Pavan Deolasee <pavan.deolasee@gmail.com> wrote:


On Tue, 28 May 2019 at 4:36 PM, Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2019-04-07 18:04:27 -0700, Andres Freund wrote:
> Here's a *prototype* patch for this.  It only implements what I
> described for heap_multi_insert, not for plain inserts. I wanted to see
> what others think before investing additional time.

Pavan, are you planning to work on this for v13 CF1? Or have you lost
interest on the topic?

Yes, I plan to work on it.


I am sorry, but I am not able to find time to get back to this because of other high priority items. If it still remains unaddressed in the next few weeks, I will pick it up again. But for now, I am happy if someone wants to pick and finish the work. 

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Amit Kapila
Date:
On Thu, Jun 27, 2019 at 11:02 AM Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:
>
>>> On 2019-04-07 18:04:27 -0700, Andres Freund wrote:
>>> > Here's a *prototype* patch for this.  It only implements what I
>>> > described for heap_multi_insert, not for plain inserts. I wanted to see
>>> > what others think before investing additional time.
>>>
>>> Pavan, are you planning to work on this for v13 CF1? Or have you lost
>>> interest on the topic?
>>
>>
>> Yes, I plan to work on it.
>>
>
> I am sorry, but I am not able to find time to get back to this because of other high priority items. If it still
remainsunaddressed in the next few weeks, I will pick it up again. But for now, I am happy if someone wants to pick and
finishthe work.
 
>

Fair enough, I have marked the entry [1] in the coming CF as "Returned
with Feedback".  I hope that is okay with you.

[1] - https://commitfest.postgresql.org/23/2009/

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Ibrar Ahmed
Date:


On Sat, Jun 29, 2019 at 12:56 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Jun 27, 2019 at 11:02 AM Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:
>
>>> On 2019-04-07 18:04:27 -0700, Andres Freund wrote:
>>> > Here's a *prototype* patch for this.  It only implements what I
>>> > described for heap_multi_insert, not for plain inserts. I wanted to see
>>> > what others think before investing additional time.
>>>
>>> Pavan, are you planning to work on this for v13 CF1? Or have you lost
>>> interest on the topic?
>>
>>
>> Yes, I plan to work on it.
>>
>
> I am sorry, but I am not able to find time to get back to this because of other high priority items. If it still remains unaddressed in the next few weeks, I will pick it up again. But for now, I am happy if someone wants to pick and finish the work.
>

Fair enough, I have marked the entry [1] in the coming CF as "Returned
with Feedback".  I hope that is okay with you.

[1] - https://commitfest.postgresql.org/23/2009/



Hi,

As Pavan mentioned in the last email that he is no longer interested in that, so I want to
take the lead and want to finish that. It is a bug and needs to be fixed.
I have rebased and the patch with the latest master and added some test
cases (borrowed from Pavan's patch), and did some performance testing with a table size of
700MB (10Millions rows)

COPY WITH FREEZE took 21406.692ms and VACUUM took 2478.666ms
COPY WITH FREEZE took 23095.985ms and VACUUM took 26.309ms

The performance decrease in copy with the patch is only 7%, but we get
quite adequate performance in VACUUM. In any case, this is a bug fix, so we can ignore
the performance hit.

There are two issues left to address.

1 - Andres: It only implements what I described for heap_multi_insert, not for plain inserts.
I wanted to see what others think before investing additional time.

In which condition we need that for plain inserts?

2 - Andres:  I think we should remove the WAL logging from visibilitymap_set(), and
move it to a separate, heap specific, function. Right now different
tableams e.g. would have to reimplement visibilitymap_set(), so that's a
second need to separate that functionality.  Let me try to come up with
a proposal.




--
Ibrar Ahmed
Attachment

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Ibrar Ahmed
Date:


On Fri, Mar 13, 2020 at 6:58 AM Justin Pryzby <pryzby@telsasoft.com> wrote:

Thanks for picking up this patch.  There's a minor typo:

+                        * readable outside of this sessoin. Therefore doing IO here isn't

=> session

--
Justin

Thanks, please see the updated and rebased patch. (master 17a28b03645e27d73bf69a95d7569b61e58f06eb)

--
Ibrar Ahmed
Attachment

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Ibrar Ahmed
Date:


On Tue, Mar 24, 2020 at 10:06 PM Ibrar Ahmed <ibrar.ahmad@gmail.com> wrote:


On Fri, Mar 13, 2020 at 6:58 AM Justin Pryzby <pryzby@telsasoft.com> wrote:

Thanks for picking up this patch.  There's a minor typo:

+                        * readable outside of this sessoin. Therefore doing IO here isn't

=> session

--
Justin

Thanks, please see the updated and rebased patch. (master 17a28b03645e27d73bf69a95d7569b61e58f06eb)

--
Ibrar Ahmed

Andres while fixing the one FIXME in the patch

"            visibilitymap_pin(relation, BufferGetBlockNumber(buffer), &vmbuffer);


            /*

             * FIXME: setting recptr here is a dirty dirty hack, to prevent

             * visibilitymap_set() from WAL logging.

             *
"
I am not able to see any scenario where recptr is not set before reaching to that statement. Can you clarify why you think recptr will not be set at that statement?


--
Ibrar Ahmed

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Daniel Gustafsson
Date:
This patch incurs a compiler warning, which probably is quite simple to fix:

heapam.c: In function ‘heap_multi_insert’:
heapam.c:2349:4: error: ‘recptr’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
    visibilitymap_set(relation, BufferGetBlockNumber(buffer), buffer,
    ^
heapam.c:2136:14: note: ‘recptr’ was declared here
   XLogRecPtr recptr;
              ^

Please fix and submit a new version, I'm marking the entry Waiting on Author in
the meantime.

cheers ./daniel


Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Anastasia Lubennikova
Date:
On 01.07.2020 12:38, Daniel Gustafsson wrote:
> This patch incurs a compiler warning, which probably is quite simple to fix:
>
> heapam.c: In function ‘heap_multi_insert’:
> heapam.c:2349:4: error: ‘recptr’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
>      visibilitymap_set(relation, BufferGetBlockNumber(buffer), buffer,
>      ^
> heapam.c:2136:14: note: ‘recptr’ was declared here
>     XLogRecPtr recptr;
>                ^
>
> Please fix and submit a new version, I'm marking the entry Waiting on Author in
> the meantime.
>
> cheers ./daniel
>
This patch looks very useful to me, so I want to pick it up.


The patch that fixes the compiler warning is in the attachment. Though, 
I'm not
entirely satisfied with this fix. Also, the patch contains some FIXME 
comments.
I'll test it more and send fixes this week.

Questions from the first review pass:

1) Do we need XLH_INSERT_ALL_VISIBLE_SET ? IIUC, in the patch it is always
implied by XLH_INSERT_ALL_FROZEN_SET.

2) What does this comment mean? Does XXX refers to the lsn comparison? 
Since it
is definitely necessary to update the VM.

+             * XXX: This seems entirely unnecessary?
+             *
+             * FIXME: Theoretically we should only do this after we've
+             * modified the heap - but it's safe to do it here I think,
+             * because this means that the page previously was empty.
+             */
+            if (lsn > PageGetLSN(vmpage))
+                visibilitymap_set(reln, blkno, InvalidBuffer, lsn, 
vmbuffer,
+                                  InvalidTransactionId, vmbits);

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Robert Haas
Date:
On Tue, Jul 14, 2020 at 1:51 PM Anastasia Lubennikova
<a.lubennikova@postgrespro.ru> wrote:
> Questions from the first review pass:
>
> 1) Do we need XLH_INSERT_ALL_VISIBLE_SET ? IIUC, in the patch it is always
> implied by XLH_INSERT_ALL_FROZEN_SET.

I agree that it looks unnecessary to have two separate bits.

> 2) What does this comment mean? Does XXX refers to the lsn comparison?
> Since it
> is definitely necessary to update the VM.
>
> +             * XXX: This seems entirely unnecessary?
> +             *
> +             * FIXME: Theoretically we should only do this after we've
> +             * modified the heap - but it's safe to do it here I think,
> +             * because this means that the page previously was empty.
> +             */
> +            if (lsn > PageGetLSN(vmpage))
> +                visibilitymap_set(reln, blkno, InvalidBuffer, lsn,
> vmbuffer,
> +                                  InvalidTransactionId, vmbits);

I wondered about that too. The comment which precedes it was, I
believe, originally written by me, and copied here from
heap_xlog_visible(). But it's not clear very good practice to just
copy the comment like this. If the same logic applies, the code should
say that we're doing the same thing here as in heap_xlog_visible() for
consistency, or some such thing; after all, that's the primary place
where that happens. But it looks like the XXX might have been added by
a second person who thought that we didn't need this logic at all, and
the FIXME by a third person who thought it was in the wrong place, so
the whole thing is really confusing at this point.

I'm pretty worried about this, too:

+             * FIXME: setting recptr here is a dirty dirty hack, to prevent
+             * visibilitymap_set() from WAL logging.

That is indeed a dirty hack, and something needs to be done about it.

I wonder if it was really all that smart to try to make the
HEAP2_MULTI_INSERT do this instead of just issuing separate WAL
records to mark it all-visible afterwards, but I don't see any reason
why this can't be made to work. It needs substantially more polishing
than it's had, though, I think.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Anastasia Lubennikova
Date:
On 31.07.2020 23:28, Robert Haas wrote:
> On Tue, Jul 14, 2020 at 1:51 PM Anastasia Lubennikova
> <a.lubennikova@postgrespro.ru> wrote:
>> Questions from the first review pass:
>>
>> 1) Do we need XLH_INSERT_ALL_VISIBLE_SET ? IIUC, in the patch it is always
>> implied by XLH_INSERT_ALL_FROZEN_SET.
> I agree that it looks unnecessary to have two separate bits.
>
>> 2) What does this comment mean? Does XXX refers to the lsn comparison?
>> Since it
>> is definitely necessary to update the VM.
>>
>> +             * XXX: This seems entirely unnecessary?
>> +             *
>> +             * FIXME: Theoretically we should only do this after we've
>> +             * modified the heap - but it's safe to do it here I think,
>> +             * because this means that the page previously was empty.
>> +             */
>> +            if (lsn > PageGetLSN(vmpage))
>> +                visibilitymap_set(reln, blkno, InvalidBuffer, lsn,
>> vmbuffer,
>> +                                  InvalidTransactionId, vmbits);
> I wondered about that too. The comment which precedes it was, I
> believe, originally written by me, and copied here from
> heap_xlog_visible(). But it's not clear very good practice to just
> copy the comment like this. If the same logic applies, the code should
> say that we're doing the same thing here as in heap_xlog_visible() for
> consistency, or some such thing; after all, that's the primary place
> where that happens. But it looks like the XXX might have been added by
> a second person who thought that we didn't need this logic at all, and
> the FIXME by a third person who thought it was in the wrong place, so
> the whole thing is really confusing at this point.
>
> I'm pretty worried about this, too:
>
> +             * FIXME: setting recptr here is a dirty dirty hack, to prevent
> +             * visibilitymap_set() from WAL logging.
>
> That is indeed a dirty hack, and something needs to be done about it.
>
> I wonder if it was really all that smart to try to make the
> HEAP2_MULTI_INSERT do this instead of just issuing separate WAL
> records to mark it all-visible afterwards, but I don't see any reason
> why this can't be made to work. It needs substantially more polishing
> than it's had, though, I think.
>
New version of the patch is in the attachment.

New design is more conservative and simpler:
- pin the visibility map page in advance;
- set PageAllVisible;
- call visibilitymap_set() with its own XlogRecord, just like in other 
places.

It allows to remove most of the "hacks" and keep code clean.
The patch passes tests added in previous versions.

I haven't tested performance yet, though. Maybe after tests, I'll bring 
some optimizations back.

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Ibrar Ahmed
Date:


On Mon, Aug 3, 2020 at 2:29 PM Anastasia Lubennikova <a.lubennikova@postgrespro.ru> wrote:
On 31.07.2020 23:28, Robert Haas wrote:
> On Tue, Jul 14, 2020 at 1:51 PM Anastasia Lubennikova
> <a.lubennikova@postgrespro.ru> wrote:
>> Questions from the first review pass:
>>
>> 1) Do we need XLH_INSERT_ALL_VISIBLE_SET ? IIUC, in the patch it is always
>> implied by XLH_INSERT_ALL_FROZEN_SET.
> I agree that it looks unnecessary to have two separate bits.
>
>> 2) What does this comment mean? Does XXX refers to the lsn comparison?
>> Since it
>> is definitely necessary to update the VM.
>>
>> +             * XXX: This seems entirely unnecessary?
>> +             *
>> +             * FIXME: Theoretically we should only do this after we've
>> +             * modified the heap - but it's safe to do it here I think,
>> +             * because this means that the page previously was empty.
>> +             */
>> +            if (lsn > PageGetLSN(vmpage))
>> +                visibilitymap_set(reln, blkno, InvalidBuffer, lsn,
>> vmbuffer,
>> +                                  InvalidTransactionId, vmbits);
> I wondered about that too. The comment which precedes it was, I
> believe, originally written by me, and copied here from
> heap_xlog_visible(). But it's not clear very good practice to just
> copy the comment like this. If the same logic applies, the code should
> say that we're doing the same thing here as in heap_xlog_visible() for
> consistency, or some such thing; after all, that's the primary place
> where that happens. But it looks like the XXX might have been added by
> a second person who thought that we didn't need this logic at all, and
> the FIXME by a third person who thought it was in the wrong place, so
> the whole thing is really confusing at this point.
>
> I'm pretty worried about this, too:
>
> +             * FIXME: setting recptr here is a dirty dirty hack, to prevent
> +             * visibilitymap_set() from WAL logging.
>
> That is indeed a dirty hack, and something needs to be done about it.
>
> I wonder if it was really all that smart to try to make the
> HEAP2_MULTI_INSERT do this instead of just issuing separate WAL
> records to mark it all-visible afterwards, but I don't see any reason
> why this can't be made to work. It needs substantially more polishing
> than it's had, though, I think.
>
New version of the patch is in the attachment.

New design is more conservative and simpler:
- pin the visibility map page in advance;
- set PageAllVisible;
- call visibilitymap_set() with its own XlogRecord, just like in other
places.

It allows to remove most of the "hacks" and keep code clean.
The patch passes tests added in previous versions.

I haven't tested performance yet, though. Maybe after tests, I'll bring
some optimizations back.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Here are some performance results with a patched and unpatched master branch. 
The table used for the test contains three columns (integer, text, varchar). 
The total number of rows is 10000000 in total. 

Unpatched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
COPY: 9069.432 ms vacuum; 2567.961ms
COPY: 9004.533 ms vacuum: 2553.075ms
COPY: 8832.422 ms vacuum: 2540.742ms

Patched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
COPY: 10031.723 ms vacuum: 127.524 ms
COPY: 9985.109  ms vacuum: 39.953 ms
COPY: 9283.373  ms vacuum: 37.137 ms

Time to take the copy slightly increased but the vacuum time significantly decrease.  

--
Ibrar Ahmed

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Hamid Akhtar
Date:
Unfortunately the latest patch doesn't apply cleanly on the master branch. Can you please share an updated one.

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Ibrar Ahmed
Date:


On Mon, Aug 17, 2020 at 2:19 PM Hamid Akhtar <hamid.akhtar@gmail.com> wrote:
Unfortunately the latest patch doesn't apply cleanly on the master branch. Can you please share an updated one.

Please see the attached patch rebased with master (a28d731a1187e8d9d8c2b6319375fcbf0a8debd5)
--
Ibrar Ahmed
Attachment

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Alvaro Herrera
Date:
On 2020-Aug-14, Ibrar Ahmed wrote:

> The table used for the test contains three columns (integer, text,
> varchar).
> The total number of rows is 10000000 in total.
> 
> Unpatched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
> COPY: 9069.432 ms vacuum; 2567.961ms
> COPY: 9004.533 ms vacuum: 2553.075ms
> COPY: 8832.422 ms vacuum: 2540.742ms
> 
> Patched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
> COPY: 10031.723 ms vacuum: 127.524 ms
> COPY: 9985.109  ms vacuum: 39.953 ms
> COPY: 9283.373  ms vacuum: 37.137 ms
> 
> Time to take the copy slightly increased but the vacuum time significantly
> decrease.

"Slightly"?  It seems quite a large performance drop to me -- more than
10%.  Where is that time being spent?  Andres said in [1] that he
thought the performance shouldn't be affected noticeably, but this
doesn't seem to hold true.  As I understand, the idea was that there
would be little or no additional WAL records .. only flags in the
existing record.  So what is happening?

[1] https://postgr.es/m/20190408010427.4l63qr7h2fjcyp77@alap3.anarazel.de

Also, when Andres posted this patch first, he said this was only for
heap_multi_insert because it was a prototype.  But I think we expect
that the table_insert path (CIM_SINGLE mode in copy) should also receive
that treatment.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Anastasia Lubennikova
Date:
On 18.08.2020 02:54, Alvaro Herrera wrote:
> On 2020-Aug-14, Ibrar Ahmed wrote:
>
>> The table used for the test contains three columns (integer, text,
>> varchar).
>> The total number of rows is 10000000 in total.
>>
>> Unpatched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
>> COPY: 9069.432 ms vacuum; 2567.961ms
>> COPY: 9004.533 ms vacuum: 2553.075ms
>> COPY: 8832.422 ms vacuum: 2540.742ms
>>
>> Patched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
>> COPY: 10031.723 ms vacuum: 127.524 ms
>> COPY: 9985.109  ms vacuum: 39.953 ms
>> COPY: 9283.373  ms vacuum: 37.137 ms
>>
>> Time to take the copy slightly increased but the vacuum time significantly
>> decrease.
> "Slightly"?  It seems quite a large performance drop to me -- more than
> 10%.  Where is that time being spent?  Andres said in [1] that he
> thought the performance shouldn't be affected noticeably, but this
> doesn't seem to hold true.  As I understand, the idea was that there
> would be little or no additional WAL records .. only flags in the
> existing record.  So what is happening?
>
> [1] https://postgr.es/m/20190408010427.4l63qr7h2fjcyp77@alap3.anarazel.de

I agree that 10% performance drop is not what we expect with this patch.
Ibrar, can you share more info about your tests? I'd like to reproduce 
this slowdown and fix it, if necessary.

I've run some tests on my laptop and COPY FREEZE shows the same time for 
both versions, while VACUUM is much faster on the patched version. I've 
also checked WAL generation and it shows that the patch works correctly 
as it doesn't add any records for COPY.

Not patched:

Time: 54883,356 ms (00:54,883)
Time: 65373,333 ms (01:05,373)
Time: 64684,592 ms (01:04,685)
VACUUM Time: 60861,670 ms (01:00,862)

COPY      wal_bytes 3765 MB
VACUUM wal_bytes 6015 MB
table size                 5971 MB

Patched:

Time: 53142,947 ms (00:53,143)
Time: 65420,812 ms (01:05,421)
Time: 66600,114 ms (01:06,600)
VACUUM Time: 63,401 ms

COPY      wal_bytes 3765 MB
VACUUM wal_bytes 30 kB
table size                 5971 MB

The test script is attached.

> Also, when Andres posted this patch first, he said this was only for
> heap_multi_insert because it was a prototype.  But I think we expect
> that the table_insert path (CIM_SINGLE mode in copy) should also receive
> that treatment.

I am afraid that extra checks for COPY FREEZE  in heap_insert() will 
slow down normal insertions.

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Ibrar Ahmed
Date:


On Wed, Aug 19, 2020 at 6:15 PM Anastasia Lubennikova <a.lubennikova@postgrespro.ru> wrote:
On 18.08.2020 02:54, Alvaro Herrera wrote:
> On 2020-Aug-14, Ibrar Ahmed wrote:
>
>> The table used for the test contains three columns (integer, text,
>> varchar).
>> The total number of rows is 10000000 in total.
>>
>> Unpatched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
>> COPY: 9069.432 ms vacuum; 2567.961ms
>> COPY: 9004.533 ms vacuum: 2553.075ms
>> COPY: 8832.422 ms vacuum: 2540.742ms
>>
>> Patched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
>> COPY: 10031.723 ms vacuum: 127.524 ms
>> COPY: 9985.109  ms vacuum: 39.953 ms
>> COPY: 9283.373  ms vacuum: 37.137 ms
>>
>> Time to take the copy slightly increased but the vacuum time significantly
>> decrease.
> "Slightly"?  It seems quite a large performance drop to me -- more than
> 10%.  Where is that time being spent?  Andres said in [1] that he
> thought the performance shouldn't be affected noticeably, but this
> doesn't seem to hold true.  As I understand, the idea was that there
> would be little or no additional WAL records .. only flags in the
> existing record.  So what is happening?
>
> [1] https://postgr.es/m/20190408010427.4l63qr7h2fjcyp77@alap3.anarazel.de

I agree that 10% performance drop is not what we expect with this patch.
Ibrar, can you share more info about your tests? I'd like to reproduce
this slowdown and fix it, if necessary.

I've run some tests on my laptop and COPY FREEZE shows the same time for
both versions, while VACUUM is much faster on the patched version. I've
also checked WAL generation and it shows that the patch works correctly
as it doesn't add any records for COPY.

Not patched:

Time: 54883,356 ms (00:54,883)
Time: 65373,333 ms (01:05,373)
Time: 64684,592 ms (01:04,685)
VACUUM Time: 60861,670 ms (01:00,862)

COPY      wal_bytes 3765 MB
VACUUM wal_bytes 6015 MB
table size                 5971 MB

Patched:

Time: 53142,947 ms (00:53,143)
Time: 65420,812 ms (01:05,421)
Time: 66600,114 ms (01:06,600)
VACUUM Time: 63,401 ms

COPY      wal_bytes 3765 MB
VACUUM wal_bytes 30 kB
table size                 5971 MB

The test script is attached.

> Also, when Andres posted this patch first, he said this was only for
> heap_multi_insert because it was a prototype.  But I think we expect
> that the table_insert path (CIM_SINGLE mode in copy) should also receive
> that treatment.

I am afraid that extra checks for COPY FREEZE  in heap_insert() will
slow down normal insertions.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Here is my test;
 

postgres=# BEGIN;

BEGIN


postgres=*# TRUNCATE foo;

TRUNCATE TABLE


postgres=*# COPY foo(id, name, address) FROM '/home/ibrar/bar.csv' DELIMITER ',' FREEZE;   

COPY 10000000


postgres=*# COMMIT;

COMMIT


postgres=# VACUUM;

VACUUM


postgres=# SELECT count(*) FROM foo; 

  count   

----------

 10000000

(1 row)




--
Ibrar Ahmed

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Anastasia Lubennikova
Date:
On 21.08.2020 19:43, Ibrar Ahmed wrote:


On Wed, Aug 19, 2020 at 6:15 PM Anastasia Lubennikova <a.lubennikova@postgrespro.ru> wrote:
On 18.08.2020 02:54, Alvaro Herrera wrote:
> On 2020-Aug-14, Ibrar Ahmed wrote:
>
>> The table used for the test contains three columns (integer, text,
>> varchar).
>> The total number of rows is 10000000 in total.
>>
>> Unpatched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
>> COPY: 9069.432 ms vacuum; 2567.961ms
>> COPY: 9004.533 ms vacuum: 2553.075ms
>> COPY: 8832.422 ms vacuum: 2540.742ms
>>
>> Patched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
>> COPY: 10031.723 ms vacuum: 127.524 ms
>> COPY: 9985.109  ms vacuum: 39.953 ms
>> COPY: 9283.373  ms vacuum: 37.137 ms
>>
>> Time to take the copy slightly increased but the vacuum time significantly
>> decrease.
> "Slightly"?  It seems quite a large performance drop to me -- more than
> 10%.  Where is that time being spent?  Andres said in [1] that he
> thought the performance shouldn't be affected noticeably, but this
> doesn't seem to hold true.  As I understand, the idea was that there
> would be little or no additional WAL records .. only flags in the
> existing record.  So what is happening?
>
> [1] https://postgr.es/m/20190408010427.4l63qr7h2fjcyp77@alap3.anarazel.de

I agree that 10% performance drop is not what we expect with this patch.
Ibrar, can you share more info about your tests? I'd like to reproduce
this slowdown and fix it, if necessary.


Here is my test;
 

postgres=# BEGIN;

BEGIN


postgres=*# TRUNCATE foo;

TRUNCATE TABLE


postgres=*# COPY foo(id, name, address) FROM '/home/ibrar/bar.csv' DELIMITER ',' FREEZE;   

COPY 10000000



--
Ibrar Ahmed


I've repeated the test and didn't notice any slowdown for COPY FREEZE.
Test data is here [1].

The numbers do fluctuate a bit, but there is no dramatic difference between master and patched version. So I assume that the performance drop in your test has something to do with the measurement error. Unless, you have some non-default configuration that could affect it.

patched:

COPY: 12327,090 ms vacuum: 37,555 ms
COPY: 12939,540 ms vacuum: 35,703 ms
COPY: 12245,819 ms vacuum: 36,273 ms

master:
COPY
COPY: 13253,605 ms vacuum: 3592,849 ms
COPY: 12619,428 ms vacuum: 4253,836 ms
COPY: 12512,940 ms vacuum: 4009,847 ms

I also slightly cleaned up comments, so the new version of the patch is attached. As this is just a performance optimization documentation is not needed. It would be great, if other reviewers could run some independent performance tests, as I believe that this patch is ready for committer.

[1] https://drive.google.com/file/d/11r19NX6yyPjvxdDub8Ce-kmApRurp4Nx/view

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Companyt
Attachment

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Ibrar Ahmed
Date:




On Thu, Aug 27, 2020 at 2:14 AM Anastasia Lubennikova <a.lubennikova@postgrespro.ru> wrote:
On 21.08.2020 19:43, Ibrar Ahmed wrote:


On Wed, Aug 19, 2020 at 6:15 PM Anastasia Lubennikova <a.lubennikova@postgrespro.ru> wrote:
On 18.08.2020 02:54, Alvaro Herrera wrote:
> On 2020-Aug-14, Ibrar Ahmed wrote:
>
>> The table used for the test contains three columns (integer, text,
>> varchar).
>> The total number of rows is 10000000 in total.
>>
>> Unpatched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
>> COPY: 9069.432 ms vacuum; 2567.961ms
>> COPY: 9004.533 ms vacuum: 2553.075ms
>> COPY: 8832.422 ms vacuum: 2540.742ms
>>
>> Patched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
>> COPY: 10031.723 ms vacuum: 127.524 ms
>> COPY: 9985.109  ms vacuum: 39.953 ms
>> COPY: 9283.373  ms vacuum: 37.137 ms
>>
>> Time to take the copy slightly increased but the vacuum time significantly
>> decrease.
> "Slightly"?  It seems quite a large performance drop to me -- more than
> 10%.  Where is that time being spent?  Andres said in [1] that he
> thought the performance shouldn't be affected noticeably, but this
> doesn't seem to hold true.  As I understand, the idea was that there
> would be little or no additional WAL records .. only flags in the
> existing record.  So what is happening?
>
> [1] https://postgr.es/m/20190408010427.4l63qr7h2fjcyp77@alap3.anarazel.de

I agree that 10% performance drop is not what we expect with this patch.
Ibrar, can you share more info about your tests? I'd like to reproduce
this slowdown and fix it, if necessary.


Here is my test;
 

postgres=# BEGIN;

BEGIN


postgres=*# TRUNCATE foo;

TRUNCATE TABLE


postgres=*# COPY foo(id, name, address) FROM '/home/ibrar/bar.csv' DELIMITER ',' FREEZE;   

COPY 10000000



--
Ibrar Ahmed


I've repeated the test and didn't notice any slowdown for COPY FREEZE.
Test data is here [1].

The numbers do fluctuate a bit, but there is no dramatic difference between master and patched version. So I assume that the performance drop in your test has something to do with the measurement error. Unless, you have some non-default configuration that could affect it.

patched:

COPY: 12327,090 ms vacuum: 37,555 ms
COPY: 12939,540 ms vacuum: 35,703 ms
COPY: 12245,819 ms vacuum: 36,273 ms

master:
COPY
COPY: 13253,605 ms vacuum: 3592,849 ms
COPY: 12619,428 ms vacuum: 4253,836 ms
COPY: 12512,940 ms vacuum: 4009,847 ms

I also slightly cleaned up comments, so the new version of the patch is attached. As this is just a performance optimization documentation is not needed. It would be great, if other reviewers could run some independent performance tests, as I believe that this patch is ready for committer.

[1] https://drive.google.com/file/d/11r19NX6yyPjvxdDub8Ce-kmApRurp4Nx/view

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Companyt
I gave another try with latest v3 patch on latest master branch (ff60394a8c9a7af8b32de420ccb54a20a0f019c1) with all default settings. 11824.495 is median with master and 11884.089 is median value with patch. 


Note: There are two changes such as (1) used the v3 patch (2) now test is done on latest master (ff60394a8c9a7af8b32de420ccb54a20a0f019c1).


Master (ff60394a8c9a7af8b32de420ccb54a20a0f019c1)

postgres=# \timing

postgres=# BEGIN;

postgres=*# TRUNCATE  foo;

postgres=*# COPY foo(id, name, address) FROM '/Users/ibrar/bar.csv' DELIMITER ',' FREEZE;

Time: 11824.495 ms (00:11.824)

postgres=*# COMMIT;


Restart


postgres=# \timing

postgres=# BEGIN;

postgres=*# TRUNCATE  foo;

postgres=*# COPY foo(id, name, address) FROM '/Users/ibrar/bar.csv' DELIMITER ',' FREEZE;

Time: 14096.987 ms (00:14.097)

postgres=*# commit;


Restart


postgres=# \timing

postgres=# BEGIN;

postgres=*# TRUNCATE  foo;

postgres=*# COPY foo(id, name, address) FROM '/Users/ibrar/bar.csv' DELIMITER ',' FREEZE;

Time: 11108.289 ms (00:11.108)

postgres=*# commit;



Patched (ff60394a8c9a7af8b32de420ccb54a20a0f019c1)

postgres=# \timing

postgres=# BEGIN;

postgres=*# TRUNCATE  foo;

postgres=*# COPY foo(id, name, address) FROM '/Users/ibrar/bar.csv' DELIMITER ',' FREEZE;

Time: 10749.945 ms (00:10.750)

postgres=*# commit;


Restart


postgres=# \timing

postgres=# BEGIN;

postgres=*# TRUNCATE  foo;

postgres=*# COPY foo(id, name, address) FROM '/Users/ibrar/bar.csv' DELIMITER ',' FREEZE;

Time: 14274.361 ms (00:14.274)

postgres=*# commit;


Restart


postgres=# \timing

postgres=# BEGIN;

postgres=*# TRUNCATE  foo;

postgres=*# COPY foo(id, name, address) FROM '/Users/ibrar/bar.csv' DELIMITER ',' FREEZE;

Time: 11884.089 ms (00:11.884)

postgres=*# commit;


--
Ibrar Ahmed

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Anastasia Lubennikova
Date:
Status update for a commitfest entry.

This patch is ReadyForCommitter. It applies and passes the CI. There are no unanswered questions in the discussion. 

The discussion started in 2015 with a patch by Jeff Janes. Later it was revived by Pavan Deolasee.  After it was picked
upby Ibrar Ahmed and finally, it was rewritten by me, so I moved myself from reviewers to authors as well.
 

The latest version was reviewed and tested by Ibrar Ahmed. The patch doesn't affect COPY FREEZE performance and
significantlydecreases the time of the following VACUUM. 

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Tatsuo Ishii
Date:
> Status update for a commitfest entry.
> 
> This patch is ReadyForCommitter. It applies and passes the CI. There are no unanswered questions in the discussion. 
> 
> The discussion started in 2015 with a patch by Jeff Janes. Later it was revived by Pavan Deolasee.  After it was
pickedup by Ibrar Ahmed and finally, it was rewritten by me, so I moved myself from reviewers to authors as well.
 
> 
> The latest version was reviewed and tested by Ibrar Ahmed. The patch doesn't affect COPY FREEZE performance and
significantlydecreases the time of the following VACUUM.
 

I have tested the patch on my laptop (mem 16GB, SSD 512GB) using the
data introduced in up thread and saw that VACCUM after COPY FREEZE is
nearly 60 times faster than current master branch. Quite impressive.

By the way, I noticed following comment:
+            /*
+             * vmbuffer should be already pinned by RelationGetBufferForTuple,
+             * Though, it's fine if is not. all_frozen is just an optimization.
+             */

could be enhanced like below. What do you think?
+            /*
+             * vmbuffer should be already pinned by RelationGetBufferForTuple.
+             * Though, it's fine if it is not. all_frozen is just an optimization.
+             */


Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Tomas Vondra
Date:
Hi,

I might be somewhat late to the party, but I've done a bit of
benchmarking too ;-) I used TPC-H data from a 100GB test, and tried
different combinations of COPY [FREEZE] and VACUUM [FREEZE], both on
current master and with the patch.

The results look like this (the columns say what combination of COPY and
VACUUM was used - e.g. -/FREEZE means plain COPY and VACUUM FREEZE)

master:

             - / -    FREEZE / -     - / FREEZE    FREEZE / FREEZE
   ----------------------------------------------------------------
      COPY    2471          2477           2486               2484
    VACUUM     228           209            453                206

patched:

             - / -    FREEZE / -     - / FREEZE    FREEZE / FREEZE
   ----------------------------------------------------------------
       COPY   2459          2445           2458               2446
     VACUUM    227             0            467                  0

So I don't really observe any measurable slowdowns in the COPY part (in
fact there seems to be a tiny speedup, but it might be just noise). In
the VACUUM part, there's clear speedup when the data was already frozen
by COPY (Yes, those are zeroes, because it took less than 1 second.)

So that looks pretty awesome, I guess.

For the record, these tests were run on a server with NVMe SSD, so
hopefully reliable and representative numbers.


A couple minor comments about the code:

1) Maybe add a comment before the block setting xlrec->flags in
heap_multi_insert. It's not very complex, but it used to be a bit
simpler, and all the other pieces around have comments, so it won't
hurt.


2) I find the "if (all_frozen_set)" block a bit strange. It's a matter
of personal preference, but I'd just use a single level of nesting, i.e.
something like this:

     /* if everything frozen, the whole page has to be visible */
     Assert(!(all_frozen_set && !PageIsAllVisible(page)));

     /*
      * If we've frozen everything on the page, and if we're already
      * holding pin on the vmbuffer, record that in the visibilitymap.
      * If we're not holding the pin, it's OK to skip this - it's just
      * an optimization.
      *
      * It's fine to use InvalidTransactionId here - this is only used
      * when HEAP_INSERT_FROZEN is specified, which intentionally
      * violates visibility rules.
      */
     if (all_frozen_set &&
         visibilitymap_pin_ok(BufferGetBlockNumber(buffer), vmbuffer))
    visibilitymap_set(...);

IMHO it's easier to read, but YMMV. I've also reworded the comment a bit
to say what we're doing etc. And I've moved the comment from inside the
if block into the main comment - that was making it harder to read too.


3) I see RelationGetBufferForTuple does this:

     /*
      * This is for COPY FREEZE needs. If page is empty,
      * pin vmbuffer to set all_frozen bit
      */
     if ((options & HEAP_INSERT_FROZEN) &&
         (PageGetMaxOffsetNumber(BufferGetPage(buffer)) == 0))
         visibilitymap_pin(relation, BufferGetBlockNumber(buffer), vmbuffer);

so is it actually possible to get into the (all_frozen_set) without
holding a pin on the visibilitymap? I haven't investigated this so
maybe there are other ways to get into that code block. But the new
additions to hio.c get the pin too.


4) In heap_xlog_multi_insert we now have this:

     if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
         PageClearAllVisible(page);
     if (xlrec->flags & XLH_INSERT_ALL_FROZEN_SET)
         PageSetAllVisible(page);

IIUC it makes no sense to have both flags at the same time, right? So
what about adding

     Assert(!(XLH_INSERT_ALL_FROZEN_SET && XLH_INSERT_ALL_VISIBLE_CLEARED));

to check that?


5) Not sure we need to explicitly say this is for COPY FREE in all the
blocks added to hio.c. IMO it's sufficient to use HEAP_INSERT_FROZEN in
the condition, at this level of abstraction.


I wonder what to do about the heap_insert - I know there are concerns it
would negatively impact regular insert, but is it really true? I suppose
this optimization would be valuable even for cases where multi-insert is
not possible.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Anastasia Lubennikova
Date:
On 30.10.2020 03:42, Tomas Vondra wrote:
> Hi,
>
> I might be somewhat late to the party, but I've done a bit of
> benchmarking too ;-) I used TPC-H data from a 100GB test, and tried
> different combinations of COPY [FREEZE] and VACUUM [FREEZE], both on
> current master and with the patch.
>
> So I don't really observe any measurable slowdowns in the COPY part (in
> fact there seems to be a tiny speedup, but it might be just noise). In
> the VACUUM part, there's clear speedup when the data was already frozen
> by COPY (Yes, those are zeroes, because it took less than 1 second.)
>
> So that looks pretty awesome, I guess.
>
> For the record, these tests were run on a server with NVMe SSD, so
> hopefully reliable and representative numbers.
>
Thank you for the review.

> A couple minor comments about the code:
>
> 2) I find the "if (all_frozen_set)" block a bit strange. It's a matter
> of personal preference, but I'd just use a single level of nesting, i.e.
> something like this:
>
>     /* if everything frozen, the whole page has to be visible */
>     Assert(!(all_frozen_set && !PageIsAllVisible(page)));
>
>     /*
>      * If we've frozen everything on the page, and if we're already
>      * holding pin on the vmbuffer, record that in the visibilitymap.
>      * If we're not holding the pin, it's OK to skip this - it's just
>      * an optimization.
>      *
>      * It's fine to use InvalidTransactionId here - this is only used
>      * when HEAP_INSERT_FROZEN is specified, which intentionally
>      * violates visibility rules.
>      */
>     if (all_frozen_set &&
>         visibilitymap_pin_ok(BufferGetBlockNumber(buffer), vmbuffer))
>     visibilitymap_set(...);
>
> IMHO it's easier to read, but YMMV. I've also reworded the comment a bit
> to say what we're doing etc. And I've moved the comment from inside the
> if block into the main comment - that was making it harder to read too.
>
I agree that it's a matter of taste. I've updated comments and left 
nesting unchanged to keep assertions simple.

>
> 3) I see RelationGetBufferForTuple does this:
>
>     /*
>      * This is for COPY FREEZE needs. If page is empty,
>      * pin vmbuffer to set all_frozen bit
>      */
>     if ((options & HEAP_INSERT_FROZEN) &&
>         (PageGetMaxOffsetNumber(BufferGetPage(buffer)) == 0))
>         visibilitymap_pin(relation, BufferGetBlockNumber(buffer), 
> vmbuffer);
>
> so is it actually possible to get into the (all_frozen_set) without
> holding a pin on the visibilitymap? I haven't investigated this so
> maybe there are other ways to get into that code block. But the new
> additions to hio.c get the pin too.
>
I was thinking that GetVisibilityMapPins() can somehow unset the pin. I 
gave it a second look. And now I don't think it's possible to get into 
this code block without a pin.  So, I converted this check into an 
assertion.

>
> 4) In heap_xlog_multi_insert we now have this:
>
>     if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
>         PageClearAllVisible(page);
>     if (xlrec->flags & XLH_INSERT_ALL_FROZEN_SET)
>         PageSetAllVisible(page);
>
> IIUC it makes no sense to have both flags at the same time, right? So
> what about adding
>
>     Assert(!(XLH_INSERT_ALL_FROZEN_SET && 
> XLH_INSERT_ALL_VISIBLE_CLEARED));
>
> to check that?
>
Agree.

I placed this assertion to the very beginning of the function. It also 
helped to simplify the code a bit.
I also noticed, that we were not updating visibility map for all_frozen 
from heap_xlog_multi_insert. Fixed.

>
> I wonder what to do about the heap_insert - I know there are concerns it
> would negatively impact regular insert, but is it really true? I suppose
> this optimization would be valuable even for cases where multi-insert is
> not possible.
>
Do we have something like INSERT .. FREEZE? I only see 
TABLE_INSERT_FROZEN set for COPY FREEZE and for matview operations. Can 
you explain, what use-case are we trying to optimize by extending this 
patch to heap_insert()?

The new version is attached.
I've also fixed a typo in the comment by Tatsuo Ishii suggestion.
Also, I tested this patch with replication and found no issues.

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Tomas Vondra
Date:
Hi,

I started looking at this patch again, hoping to get it committed in 
this CF, but I think there's a regression in handling TOAST tables 
(compared to the v3 patch submitted by Pavan in February 2019).

The test I'm running a very simple test (see test.sql):

1) start a transaction
2) create a table with a text column
3) copy freeze data into it
4) use pg_visibility to see how many blocks are all_visible both in the
    main table and it's TOAST table

For v3 patch (applied on top of 278584b526 and s/hi_options/ti_options) 
I get this:

            pages           NOT all_visible
   ------------------------------------------
   main       637                         0
   toast    50001                         3

There was some discussion about relcache invalidations causing a couple 
TOAST pages not be marked as all_visible, etc.

However, for this patch on master I get this

            pages           NOT all_visible
   ------------------------------------------
   main       637                         0
   toast    50001                     50001

So no pages in TOAST are marked as all_visible. I haven't investigated 
what's causing this, but IMO that needs fixing to make ths patch RFC.

Attached is the test script I'm using, and a v5 of the patch - rebased 
on current master, with some minor tweaks to comments etc.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Anastasia Lubennikova
Date:
On 11.01.2021 01:35, Tomas Vondra wrote:
> Hi,
>
> I started looking at this patch again, hoping to get it committed in 
> this CF, but I think there's a regression in handling TOAST tables 
> (compared to the v3 patch submitted by Pavan in February 2019).
>
> The test I'm running a very simple test (see test.sql):
>
> 1) start a transaction
> 2) create a table with a text column
> 3) copy freeze data into it
> 4) use pg_visibility to see how many blocks are all_visible both in the
>    main table and it's TOAST table
>
> For v3 patch (applied on top of 278584b526 and 
> s/hi_options/ti_options) I get this:
>
>            pages           NOT all_visible
>   ------------------------------------------
>   main       637                         0
>   toast    50001                         3
>
> There was some discussion about relcache invalidations causing a 
> couple TOAST pages not be marked as all_visible, etc.
>
> However, for this patch on master I get this
>
>            pages           NOT all_visible
>   ------------------------------------------
>   main       637                         0
>   toast    50001                     50001
>
> So no pages in TOAST are marked as all_visible. I haven't investigated 
> what's causing this, but IMO that needs fixing to make ths patch RFC.
>
> Attached is the test script I'm using, and a v5 of the patch - rebased 
> on current master, with some minor tweaks to comments etc.
>

Thank you for attaching the test script. I reproduced the problem. This 
regression occurs because TOAST internally uses heap_insert().
You have asked upthread about adding this optimization to heap_insert().

I wrote a quick fix, see the attached patch 0002. The TOAST test passes 
now, but I haven't tested performance or any other use-cases yet.
I'm going to test it properly in a couple of days and share results.

With this change a lot of new code is repeated in heap_insert() and 
heap_multi_insert(). I think it's fine, because these functions already 
have a lot in common.

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Tomas Vondra
Date:

On 1/11/21 10:00 PM, Anastasia Lubennikova wrote:
> On 11.01.2021 01:35, Tomas Vondra wrote:
>> Hi,
>>
>> I started looking at this patch again, hoping to get it committed in 
>> this CF, but I think there's a regression in handling TOAST tables 
>> (compared to the v3 patch submitted by Pavan in February 2019).
>>
>> The test I'm running a very simple test (see test.sql):
>>
>> 1) start a transaction
>> 2) create a table with a text column
>> 3) copy freeze data into it
>> 4) use pg_visibility to see how many blocks are all_visible both in the
>>    main table and it's TOAST table
>>
>> For v3 patch (applied on top of 278584b526 and 
>> s/hi_options/ti_options) I get this:
>>
>>            pages           NOT all_visible
>>   ------------------------------------------
>>   main       637                         0
>>   toast    50001                         3
>>
>> There was some discussion about relcache invalidations causing a 
>> couple TOAST pages not be marked as all_visible, etc.
>>
>> However, for this patch on master I get this
>>
>>            pages           NOT all_visible
>>   ------------------------------------------
>>   main       637                         0
>>   toast    50001                     50001
>>
>> So no pages in TOAST are marked as all_visible. I haven't investigated 
>> what's causing this, but IMO that needs fixing to make ths patch RFC.
>>
>> Attached is the test script I'm using, and a v5 of the patch - rebased 
>> on current master, with some minor tweaks to comments etc.
>>
> 
> Thank you for attaching the test script. I reproduced the problem. This 
> regression occurs because TOAST internally uses heap_insert().
> You have asked upthread about adding this optimization to heap_insert().
> 
> I wrote a quick fix, see the attached patch 0002. The TOAST test passes 
> now, but I haven't tested performance or any other use-cases yet.
> I'm going to test it properly in a couple of days and share results.
> 

Thanks. I think it's important to make this work for TOAST tables - it 
often stores most of the data, and it was working in v3 of the patch. I 
haven't looked into the details, but if it's really just due to TOAST 
using heap_insert, I'd say it just confirms the importance of tweaking 
heap_insert too.

> With this change a lot of new code is repeated in heap_insert() and 
> heap_multi_insert(). I think it's fine, because these functions already 
> have a lot in common.
> 

Understood. IMHO a bit of redundancy is not a big issue, but I haven't 
looked at the code yet. Let's get it working first, then we can decide 
if some refactoring is appropriate.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Anastasia Lubennikova
Date:
On 12.01.2021 00:51, Tomas Vondra wrote:
>
>
> On 1/11/21 10:00 PM, Anastasia Lubennikova wrote:
>> On 11.01.2021 01:35, Tomas Vondra wrote:
>>> Hi,
>>>
>>> I started looking at this patch again, hoping to get it committed in 
>>> this CF, but I think there's a regression in handling TOAST tables 
>>> (compared to the v3 patch submitted by Pavan in February 2019).
>>>
>>> The test I'm running a very simple test (see test.sql):
>>>
>>> 1) start a transaction
>>> 2) create a table with a text column
>>> 3) copy freeze data into it
>>> 4) use pg_visibility to see how many blocks are all_visible both in the
>>>    main table and it's TOAST table
>>>
>>> For v3 patch (applied on top of 278584b526 and 
>>> s/hi_options/ti_options) I get this:
>>>
>>>            pages           NOT all_visible
>>>   ------------------------------------------
>>>   main       637                         0
>>>   toast    50001                         3
>>>
>>> There was some discussion about relcache invalidations causing a 
>>> couple TOAST pages not be marked as all_visible, etc.
>>>
>>> However, for this patch on master I get this
>>>
>>>            pages           NOT all_visible
>>>   ------------------------------------------
>>>   main       637                         0
>>>   toast    50001                     50001
>>>
>>> So no pages in TOAST are marked as all_visible. I haven't 
>>> investigated what's causing this, but IMO that needs fixing to make 
>>> ths patch RFC.
>>>
>>> Attached is the test script I'm using, and a v5 of the patch - 
>>> rebased on current master, with some minor tweaks to comments etc.
>>>
>>
>> Thank you for attaching the test script. I reproduced the problem. 
>> This regression occurs because TOAST internally uses heap_insert().
>> You have asked upthread about adding this optimization to heap_insert().
>>
>> I wrote a quick fix, see the attached patch 0002. The TOAST test 
>> passes now, but I haven't tested performance or any other use-cases yet.
>> I'm going to test it properly in a couple of days and share results.
>>
>
> Thanks. I think it's important to make this work for TOAST tables - it 
> often stores most of the data, and it was working in v3 of the patch. 
> I haven't looked into the details, but if it's really just due to 
> TOAST using heap_insert, I'd say it just confirms the importance of 
> tweaking heap_insert too. 


I've tested performance. All tests were run on my laptop, latest master 
with and without patches, all default settings, except disabled 
autovacuum and installed pg_stat_statements extension.

The VACUUM is significantly faster with the patch, as it only checks 
visibility map. COPY speed fluctuates a lot between tests, but I didn't 
notice any trend.
I would expect minor slowdown with the patch, as we need to handle 
visibility map pages during COPY FREEZE. But in some runs, patched 
version was faster than current master, so the impact of the patch is 
insignificant.

I run 3 different tests:

1) Regular table (final size 5972 MB)

patched           |   master

COPY FREEZE data 3 times:

33384,544 ms   31135,037 ms
31666,226 ms   31158,294 ms
32802,783 ms   33599,234 ms

VACUUM
54,185 ms         48445,584 ms


2) Table with TOAST (final size 1207 MB where 1172 MB is in toast table)

patched             |   master

COPY FREEZE data 3 times:

368166,743 ms    383231,077 ms
398695,018 ms    454572,630 ms
410174,682 ms    567847,288 ms

VACUUM
43,159 ms            6648,302 ms


3) Table with a trivial BEFORE INSERT trigger (final size 5972 MB)

patched             |   master

COPY FREEZE data 3 times:

73544,225 ms      64967,802 ms
90960,584 ms      71826,362 ms
81356,025 ms      80023,041 ms

VACUUM
49,626 ms            40100,097 ms

I took another look at the yesterday's patch and it looks fine to me. So 
now I am waiting for your review.

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Tomas Vondra
Date:
Thanks. These patches seem to resolve the TOAST table issue, freezing it 
as expected. I think the code duplication is not an issue, but I wonder 
why heap_insert uses this condition:

     /*
      * ...
      *
      * No need to update the visibilitymap if it had all_frozen bit set
      * before this insertion.
      */
     if (all_frozen_set && ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0))

while heap_multi_insert only does this:

     if (all_frozen_set) { ... }

I haven't looked at the details, but shouldn't both do the same thing?

I've done some benchmarks, comparing master and patched version on a 
bunch of combinations (logged/unlogged, no before-insert trigger, 
trigger filtering everything/nothing). On master, the results are:

     group                         copy       vacuum
     -----------------------------------------------
     logged / no trigger           4672          162
     logged / trigger (all)        4914          162
     logged / trigger (none)       1219           11
     unlogged / no trigger         4132          156
     unlogged / trigger (all)      4292          156
     unlogged / trigger (none)     1275           11

and patched looks like this:

     group                         copy       vacuum
     -----------------------------------------------
     logged / no trigger           4669           12
     logged / trigger (all)        4874           12
     logged / trigger (none)       1233           11
     unlogged / no trigger         4054           11
     unlogged / trigger (all)      4185           12
     unlogged / trigger (none)     1222           10

This looks pretty nice - there are no regressions, just speedups in the 
vacuum step. The SQL script used is attached.

However, I've also repeated the test counting all-frozen pages in both 
the main table and TOAST table, and I get this:


master
======

select count(*) from pg_visibility((select reltoastrelid from pg_class 
where relname = 't'));

  count
--------
  100000
(1 row)


select count(*) from pg_visibility((select reltoastrelid from pg_class 
where relname = 't')) where not all_visible;

  count
--------
  100000
(1 row)


patched
=======

select count(*) from pg_visibility((select reltoastrelid from pg_class 
where relname = 't'));

  count
--------
  100002
(1 row)


select count(*) from pg_visibility((select reltoastrelid from pg_class 
where relname = 't')) where not all_visible;

  count
--------
       0
(1 row)

That is - all TOAST pages are frozen (as expected, which is good). But 
now there are 100002 pages, not just 100000 pages. That is, we're now 
creating 2 extra pages, for some reason. I recall Pavan reported similar 
issue with every 32768-th page not being properly filled, but I'm not 
sure if that's the same issue.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Anastasia Lubennikova
Date:
On 12.01.2021 22:30, Tomas Vondra wrote:
> Thanks. These patches seem to resolve the TOAST table issue, freezing 
> it as expected. I think the code duplication is not an issue, but I 
> wonder why heap_insert uses this condition:
>
>     /*
>      * ...
>      *
>      * No need to update the visibilitymap if it had all_frozen bit set
>      * before this insertion.
>      */
>     if (all_frozen_set && ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0))
>
> while heap_multi_insert only does this:
>
>     if (all_frozen_set) { ... }
>
> I haven't looked at the details, but shouldn't both do the same thing?


I decided to add this check for heap_insert() to avoid unneeded calls of 
visibilitymap_set(). If we insert tuples one by one, we can only call 
this once per page.
In my understanding, heap_multi_insert() inserts tuples in batches, so 
it doesn't need this optimization.

>
>
> However, I've also repeated the test counting all-frozen pages in both 
> the main table and TOAST table, and I get this:
>
> patched
> =======
>
> select count(*) from pg_visibility((select reltoastrelid from pg_class 
> where relname = 't'));
>
>  count
> --------
>  100002
> (1 row)
>
>
> select count(*) from pg_visibility((select reltoastrelid from pg_class 
> where relname = 't')) where not all_visible;
>
>  count
> --------
>       0
> (1 row)
>
> That is - all TOAST pages are frozen (as expected, which is good). But 
> now there are 100002 pages, not just 100000 pages. That is, we're now 
> creating 2 extra pages, for some reason. I recall Pavan reported 
> similar issue with every 32768-th page not being properly filled, but 
> I'm not sure if that's the same issue.
>
>
> regards
>

As Pavan correctly figured it out before the problem is that 
RelationGetBufferForTuple() moves to the next page, losing free space in 
the block:

 > ... I see that a relcache invalidation arrives
 > after 1st and then after every 32672th block is filled. That clears the
 > rel->rd_smgr field and we lose the information about the saved target
 > block. The code then moves to extend the relation again and thus 
skips the
 > previously less-than-half filled block, losing the free space in that 
block.

The reason of this cache invalidation is vm_extend() call, which happens 
every 32762 blocks.

RelationGetBufferForTuple() tries to use the last page, but for some 
reason this code is under 'use_fsm' check. And COPY FROM doesn't use fsm 
(see TABLE_INSERT_SKIP_FSM).


         /*
          * If the FSM knows nothing of the rel, try the last page before we
          * give up and extend.  This avoids one-tuple-per-page syndrome 
during
          * bootstrapping or in a recently-started system.
          */
         if (targetBlock == InvalidBlockNumber)
         {
             BlockNumber nblocks = RelationGetNumberOfBlocks(relation);
             if (nblocks > 0)
                 targetBlock = nblocks - 1;
         }


I think we can use this code without regard to 'use_fsm'. With this 
change, the number of toast rel pages is correct. The patch is attached.

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Tomas Vondra
Date:
On 1/16/21 4:11 PM, Anastasia Lubennikova wrote:
>
 > ...
> 
> As Pavan correctly figured it out before the problem is that 
> RelationGetBufferForTuple() moves to the next page, losing free space in 
> the block:
> 
>  > ... I see that a relcache invalidation arrives
>  > after 1st and then after every 32672th block is filled. That clears the
>  > rel->rd_smgr field and we lose the information about the saved target
>  > block. The code then moves to extend the relation again and thus 
> skips the
>  > previously less-than-half filled block, losing the free space in that 
> block.
> 
> The reason of this cache invalidation is vm_extend() call, which happens 
> every 32762 blocks.
> 
> RelationGetBufferForTuple() tries to use the last page, but for some 
> reason this code is under 'use_fsm' check. And COPY FROM doesn't use fsm 
> (see TABLE_INSERT_SKIP_FSM).
> 
> 
>          /*
>           * If the FSM knows nothing of the rel, try the last page 
> before we
>           * give up and extend.  This avoids one-tuple-per-page syndrome 
> during
>           * bootstrapping or in a recently-started system.
>           */
>          if (targetBlock == InvalidBlockNumber)
>          {
>              BlockNumber nblocks = RelationGetNumberOfBlocks(relation);
>              if (nblocks > 0)
>                  targetBlock = nblocks - 1;
>          }
> 
> 
> I think we can use this code without regard to 'use_fsm'. With this 
> change, the number of toast rel pages is correct. The patch is attached.
> 

Thanks for the updated patch, this version looks OK to me - I've marked 
it as RFC. I'll do a bit more testing, review, and then I'll get it 
committed.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Tomas Vondra
Date:
On 1/16/21 11:18 PM, Tomas Vondra wrote:
> ...
 >
> Thanks for the updated patch, this version looks OK to me - I've marked 
> it as RFC. I'll do a bit more testing, review, and then I'll get it 
> committed.
> 

Pushed. Thanks everyone for the effort put into this patch. The first 
version was sent in 2015, so it took quite a bit of time.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Tatsuo Ishii
Date:
> Pushed. Thanks everyone for the effort put into this patch. The first
> version was sent in 2015, so it took quite a bit of time.

Great news. Thanks everyone who have been working on this.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Pavan Deolasee
Date:


On Mon, Jan 18, 2021 at 3:02 AM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:


Pushed. Thanks everyone for the effort put into this patch. The first
version was sent in 2015, so it took quite a bit of time.


Thanks Tomas, Anastasia and everyone else who worked on the patch and ensured that it gets into the tree.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Tom Lane
Date:
Pavan Deolasee <pavan.deolasee@gmail.com> writes:
> On Mon, Jan 18, 2021 at 3:02 AM Tomas Vondra <tomas.vondra@enterprisedb.com>
> wrote:
>> Pushed. Thanks everyone for the effort put into this patch. The first
>> version was sent in 2015, so it took quite a bit of time.

> Thanks Tomas, Anastasia and everyone else who worked on the patch and
> ensured that it gets into the tree.

Buildfarm results suggest that the test case is unstable under
CLOBBER_CACHE_ALWAYS:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2021-01-19%2020%3A27%3A46

This might mean an actual bug, or just that the test isn't robust.

            regards, tom lane



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Tomas Vondra
Date:

On 1/23/21 1:10 AM, Tom Lane wrote:
> Pavan Deolasee <pavan.deolasee@gmail.com> writes:
>> On Mon, Jan 18, 2021 at 3:02 AM Tomas Vondra <tomas.vondra@enterprisedb.com>
>> wrote:
>>> Pushed. Thanks everyone for the effort put into this patch. The first
>>> version was sent in 2015, so it took quite a bit of time.
> 
>> Thanks Tomas, Anastasia and everyone else who worked on the patch and
>> ensured that it gets into the tree.
> 
> Buildfarm results suggest that the test case is unstable under
> CLOBBER_CACHE_ALWAYS:
> 
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2021-01-19%2020%3A27%3A46
> 
> This might mean an actual bug, or just that the test isn't robust.
> 

Yeah :-( It seems I've committed the v5 patch, not the v7 addressing 
exactly this issue (which I've actually pointed out and asked to be 
fixed). Oh well ... I'll get this fixed tomorrow - I have the fix, and I 
have verified that it passes with CLOBBER_CACHE_ALWAYS, but pushing it 
at 5AM does not seem like a good idea.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From
Tomas Vondra
Date:
Hi,

I've pushed the fix, after a a couple extra rounds of careful testing.

I noticed that the existing pg_visibility regression tests don't check 
if we freeze the TOAST rows too (failing to do that was one of the 
symptoms). It'd be good to add that, because that would fail even 
without CLOBBER_CACHE_ALWAYS, so attached is a test I propose to add.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment