Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits - Mailing list pgsql-hackers

From Ibrar Ahmed
Subject Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits
Date
Msg-id CALtqXTeEpfYwwAAtbW4RkZLNQoCJKfvhCF6syKGTmc5VEpuh8w@mail.gmail.com
Whole thread Raw
In response to Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits  (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
Responses Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits  (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
List pgsql-hackers


On Wed, Aug 19, 2020 at 6:15 PM Anastasia Lubennikova <a.lubennikova@postgrespro.ru> wrote:
On 18.08.2020 02:54, Alvaro Herrera wrote:
> On 2020-Aug-14, Ibrar Ahmed wrote:
>
>> The table used for the test contains three columns (integer, text,
>> varchar).
>> The total number of rows is 10000000 in total.
>>
>> Unpatched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
>> COPY: 9069.432 ms vacuum; 2567.961ms
>> COPY: 9004.533 ms vacuum: 2553.075ms
>> COPY: 8832.422 ms vacuum: 2540.742ms
>>
>> Patched (Master: 92c12e46d5f1e25fc85608a6d6a19b8f5ea02600)
>> COPY: 10031.723 ms vacuum: 127.524 ms
>> COPY: 9985.109  ms vacuum: 39.953 ms
>> COPY: 9283.373  ms vacuum: 37.137 ms
>>
>> Time to take the copy slightly increased but the vacuum time significantly
>> decrease.
> "Slightly"?  It seems quite a large performance drop to me -- more than
> 10%.  Where is that time being spent?  Andres said in [1] that he
> thought the performance shouldn't be affected noticeably, but this
> doesn't seem to hold true.  As I understand, the idea was that there
> would be little or no additional WAL records .. only flags in the
> existing record.  So what is happening?
>
> [1] https://postgr.es/m/20190408010427.4l63qr7h2fjcyp77@alap3.anarazel.de

I agree that 10% performance drop is not what we expect with this patch.
Ibrar, can you share more info about your tests? I'd like to reproduce
this slowdown and fix it, if necessary.

I've run some tests on my laptop and COPY FREEZE shows the same time for
both versions, while VACUUM is much faster on the patched version. I've
also checked WAL generation and it shows that the patch works correctly
as it doesn't add any records for COPY.

Not patched:

Time: 54883,356 ms (00:54,883)
Time: 65373,333 ms (01:05,373)
Time: 64684,592 ms (01:04,685)
VACUUM Time: 60861,670 ms (01:00,862)

COPY      wal_bytes 3765 MB
VACUUM wal_bytes 6015 MB
table size                 5971 MB

Patched:

Time: 53142,947 ms (00:53,143)
Time: 65420,812 ms (01:05,421)
Time: 66600,114 ms (01:06,600)
VACUUM Time: 63,401 ms

COPY      wal_bytes 3765 MB
VACUUM wal_bytes 30 kB
table size                 5971 MB

The test script is attached.

> Also, when Andres posted this patch first, he said this was only for
> heap_multi_insert because it was a prototype.  But I think we expect
> that the table_insert path (CIM_SINGLE mode in copy) should also receive
> that treatment.

I am afraid that extra checks for COPY FREEZE  in heap_insert() will
slow down normal insertions.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Here is my test;
 

postgres=# BEGIN;

BEGIN


postgres=*# TRUNCATE foo;

TRUNCATE TABLE


postgres=*# COPY foo(id, name, address) FROM '/home/ibrar/bar.csv' DELIMITER ',' FREEZE;   

COPY 10000000


postgres=*# COMMIT;

COMMIT


postgres=# VACUUM;

VACUUM


postgres=# SELECT count(*) FROM foo; 

  count   

----------

 10000000

(1 row)




--
Ibrar Ahmed

pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: [PG13] Planning (time + buffers) data structure in explain plan (format text)
Next
From: Magnus Hagander
Date:
Subject: Re: run pgindent on a regular basis / scripted manner