Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits
Date
Msg-id CABOikdM-fmuoC9AFY5bBON+TTdv6YHBpYZRS1MYOCw50Kij6CA@mail.gmail.com
Whole thread Raw
In response to Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits  (Darafei Praliaskouski <me@komzpa.net>)
Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers


On Thu, Mar 14, 2019 at 3:54 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

>
>
> Ok. I will run some tests. But please note that this patch is a bug fix to address the performance issue that is caused by having to rewrite the entire table when all-visible bit is set on the page during first vacuum. So while we may do some more work during COPY FREEZE, we're saving a lot of page writes during next vacuum. Also, since the scan that we are doing in this patch is done on a page that should be in the buffer cache, we will pay a bit in terms of CPU cost, but not anything in terms of IO cost.

Agreed. I had been misunderstanding this patch. The page scan during
COPY FREEZE is necessary and it's very cheaper than doing in the first
vacuum.

Thanks for agreeing to the need of this bug fix. I ran some simple tests anyways and here are the results.

The test consists of a simple table with three columns, two integers and one char(100). I then ran COPY (FREEZE), loading 7M rows, followed by a VACUUM. The total size of the raw data is about 800MB and the table size in Postgres is just under 1GB. The results for 3 runs in milliseconds are:

Master:
 COPY FREEZE: 40243.725   40309.675    40783.836
 VACUUM: 2685.871  2517.445    2508.452 

Patched:
 COPY FREEZE: 40942.410  40495.303   40638.075
 VACUUM: 25.067  35.793   25.390

So there is a slight increase in the time to run COPY FREEZE, but a significant reduction in time to VACUUM the table. The benefits will only go up if the table is vacuumed much  later when most of the pages are already written to the disk and removed from shared buffers and/or kernel cache.

I hope this satisfies your doubts regarding performance implications of the patch.

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Automated way to find actual COMMIT LSN of subxact LSN
Next
From: Darafei Praliaskouski
Date:
Subject: Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits