Re: Better tracking of free space during SP-GiST index build - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Better tracking of free space during SP-GiST index build
Msg-id 8f0333cb-541d-cd4e-0014-86918b4259d2@2ndquadrant.com
In response to Re: Better tracking of free space during SP-GiST index build  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Better tracking of free space during SP-GiST index build  (Oleg Bartunov <obartunov@gmail.com>)
List pgsql-hackers
On 09/22/2016 07:37 PM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>
>> ... I've tried increasing the cache size to 768
>> entries, with vast majority of them (~600) allocated to leaf pages.
>> Sadly, this seems to only increase the CREATE INDEX duration a bit,
>> without making the index significantly smaller (still ~120MB).
>
> Yeah, that's in line with my results: not much further gain from a
> larger cache.  Though if you were testing with the same IRRExplorer
> data, it's not surprising that our results would match.  Would be
> good to try some other cases...
>

Agreed, but I don't have any other data sets at hand. One possibility 
would be to generate something randomly (e.g. it's not particularly 
difficult to generate random IP addresses), but I'd much rather use some 
real-world data sets.
>>
>> One thing I'd change is making the SpGistLUPCache dynamic, i.e.
>> storing the size and lastUsedPagesMap on the meta page. That
>> should allow us resizing the cache and tweak lastUsedPagesMap in
>> the future.
>
> Yeah, probably a good idea. I had thought of bumping
> SPGIST_MAGIC_NUMBER again if we want to revisit the cache size; but
> keeping it as a separate field won't add noticeable cost, and it
> might save some trouble.
>

I see you plan to track only the cache size, while I proposed to also
track the map, i.e. the number of pages per category. I think that'd be
useful in case we come up with better values (e.g. more entries for leaf
pages), or even a somewhat adaptive scheme.
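
To make the idea concrete, here is a rough sketch of what keeping both
the size and the map in a header might look like. This is not taken from
the patch: the struct, the three-way category split, the LUP_MAX_ENTRIES
bound and the function names are all made up for illustration, and the
real lastUsedPagesMap may well be organized differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical page categories; the actual SP-GiST split may differ. */
#define LUP_NUM_CATEGORIES 3
enum { LUP_INNER = 0, LUP_LEAF = 1, LUP_NULLS = 2 };

/* Assumed upper bound on the number of cached pages. */
#define LUP_MAX_ENTRIES 1024

/*
 * Hypothetical header stored on the meta page: the total cache size
 * plus the number of cache slots reserved for each page category.
 */
typedef struct LupCacheHeader
{
    uint16_t cacheSize;
    uint16_t pagesMap[LUP_NUM_CATEGORIES];
} LupCacheHeader;

/*
 * Fill in the header from a per-category map. Returns false if the map
 * is empty or exceeds the assumed maximum, leaving *hdr untouched.
 */
bool
lup_header_init(LupCacheHeader *hdr, const uint16_t *map)
{
    uint32_t total = 0;
    int      i;

    for (i = 0; i < LUP_NUM_CATEGORIES; i++)
        total += map[i];

    if (total == 0 || total > LUP_MAX_ENTRIES)
        return false;

    hdr->cacheSize = (uint16_t) total;
    for (i = 0; i < LUP_NUM_CATEGORIES; i++)
        hdr->pagesMap[i] = map[i];

    return true;
}

/*
 * Return the index of the first cache slot belonging to a category and
 * store the number of slots in *nslots; returns -1 for a bad category.
 */
int
lup_category_range(const LupCacheHeader *hdr, int category, int *nslots)
{
    int start = 0;
    int i;

    if (category < 0 || category >= LUP_NUM_CATEGORIES)
        return -1;

    for (i = 0; i < category; i++)
        start += hdr->pagesMap[i];

    *nslots = hdr->pagesMap[category];
    return start;
}
```

With a map of, say, {84, 600, 84} this gives a 768-entry cache with the
leaf slots occupying indexes 84..683. Since readers would derive the slot
ranges from the stored map rather than from compile-time constants,
changing the allocation later would not require another magic number
bump.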

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


