Re: Yet another fast GiST build (typo) - Mailing list pgsql-hackers

From Andrey M. Borodin
Subject Re: Yet another fast GiST build (typo)
Date
Msg-id 12CC4B83-9A5D-4A59-85A1-215CF9D1AB4B@yandex-team.ru
In response to Re: Yet another fast GiST build (typo)  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers

> On 6 Sep 2020, at 18:26, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> On 05/09/2020 14:53, Andrey M. Borodin wrote:
>> Thanks for ideas, Heikki. Please see v13 with proposed changes.
>
> Thanks, that was quick!
>
>> But I've found out that logging page-by-page slows down GiST build by
>> approximately 15% (when CPU constrained). Though I think that this
>> is IO-wise.
> Hmm, any ideas why that is? log_newpage_range() writes one WAL record for 32 pages, while now you're writing one
> record per page, so you'll have a little bit more overhead from that. But 15% seems like a lot.

I do not know. My guess is that this is some effect of pglz compression during the cold stage: it may be slower and
less compressive than pglz with a cache table? But this is just a wild guess.
Nevertheless, here's a patch identical to v13, but with a third part: log flushed pages in batches of 32.
This brings CPU performance back, to slightly better than it was before page-by-page logging.
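
To illustrate the batching idea only, here is a toy Python sketch, not PostgreSQL code. It groups a stream of flushed page numbers into batches of up to 32 and counts how many log records each scheme would emit. The names `batch_pages` and `count_records` are hypothetical, and the model assumes exactly one record per batch; it says nothing about real WAL record sizes or the CPU cost discussed above.

```python
from typing import Iterable, Iterator, List


def batch_pages(pages: Iterable[int], batch_size: int = 32) -> Iterator[List[int]]:
    """Yield consecutive groups of at most batch_size page numbers."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[int] = []
    for page in pages:
        batch.append(page)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # The last group may be shorter than batch_size.
        yield batch


def count_records(num_pages: int, batch_size: int) -> int:
    """Count records in this toy model: one record per batch (an assumption)."""
    return sum(1 for _ in batch_pages(range(num_pages), batch_size))


if __name__ == "__main__":
    num_pages = 100_000  # arbitrary illustrative page count
    print("one record per page:", count_records(num_pages, 1))
    print("one record per 32 pages:", count_records(num_pages, 32))
```

In this model the number of records drops by roughly a factor of 32, which is the per-record overhead that batching saves.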

Some details about the test:
MacOS, 6-core i7
psql -c '\timing' -c "create table x as select point (random(),random()) from generate_series(1,10000000,1);" -c "create index on x using gist (point);"

With patch v13 this takes 20.567 seconds, with v14 18.149 seconds, and with v12 ~18.3 seconds (so the slowdown is
closer to 10% than 15%, btw; sorry for the miscomputation). This was not statistically significant testing, just a
quick laptop benchmark with 2-3 runs to verify stability.
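
For reference, the relative differences can be recomputed from the timings quoted above. This is a small Python sketch; the helper name `slowdown_percent` is hypothetical, and the v12 figure is only approximate in the original message, so the results are rough.

```python
def slowdown_percent(baseline_s: float, candidate_s: float) -> float:
    """Return how much slower candidate is than baseline, in percent."""
    return (candidate_s / baseline_s - 1.0) * 100.0


# Timings in seconds, taken from the message above; v12 is approximate.
v12 = 18.3
v13 = 20.567
v14 = 18.149

if __name__ == "__main__":
    print(f"v13 vs v12: {slowdown_percent(v12, v13):+.1f}%")
    print(f"v13 vs v14: {slowdown_percent(v14, v13):+.1f}%")
    print(f"v14 vs v12: {slowdown_percent(v12, v14):+.1f}%")
```

With these numbers v13 comes out roughly 12% slower than v12, and v14 is marginally faster than v12.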

Best regards, Andrey Borodin.
