Re: GiST VACUUM - Mailing list pgsql-hackers

From Andrey Borodin
Subject Re: GiST VACUUM
Date
Msg-id 5F278E38-FD10-45EB-88D0-5B1D7108F984@yandex-team.ru
Whole thread Raw
In response to Re: GiST VACUUM  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: GiST VACUUM  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers

> 13 июля 2018 г., в 18:10, Heikki Linnakangas <hlinnaka@iki.fi> написал(а):
>
>>>> But the situation in gistdoinsert(), where you encounter a deleted leaf page, could happen during normal
operation,if vacuum runs concurrently with an insert. Insertion locks only one page at a time, as it descends the tree,
soafter it has released the lock on the parent, but before it has locked the child, vacuum might have deleted the page.
Inthe latest patch, you're checking for that just before swapping the shared lock for an exclusive one, but I think
that'swrong; you need to check for that after swapping the lock, because otherwise vacuum might delete the page while
you'renot holding the lock. 
>>> Looks like a valid concern, I'll move that code again.
>> Done.
>
> Ok, the comment now says:
>
>> +            /*
>> +             * Leaf pages can be left deleted but still referenced in case of
>> +             * crash during VACUUM's gistbulkdelete()
>> +             */
>
> But that's not accurate, right? You should never see deleted pages after a crash, because the parent is updated in
thesame WAL record as the child page, right? 
Fixed the comment.
>
> I'm still a bit scared about using pd_prune_xid to store the XID that prevents recycling the page too early. Can we
usesome field in GISTPageOpaqueData for that, similar to how the B-tree stores it in BTPageOpaqueData? 
There is no room in opaque data, but, technically all page is just a tombstone until reused, so we can pick arbitrary
place.PFA v7 where xid after delete is stored in opaque data, but we can use other places, like line pointer array or
opaque-1

> 13 июля 2018 г., в 18:25, Heikki Linnakangas <hlinnaka@iki.fi> написал(а):
>
> Looking at the second patch, to scan the GiST index in physical order, that seems totally unsafe, if there are any
concurrentpage splits. In the logical scan, pushStackIfSplited() deals with that, by comparing the page's NSN with the
parent'sLSN. But I don't see anything like that in the physical scan code. 

Leaf page can be pointed by internal page and rightlink simultaneously. The purpose of NSN is to visit this page
exactlyonce by following only on of two links in a scan. This is achieved naturally if we read everything from the
beginningto the end. (That is how I understand, I can be wrong) 

> I think we can do something similar in the physical scan: remember the current LSN position at the beginning of the
vacuum,and compare with that. The B-tree code uses the "cycle ID" for similar purposes. 
>
> Do we still need the separate gistvacuumcleanup() pass, if we scan the index in physical order in the bulkdelete pass
already?

We do not need to gather stats there, but we are doing RecordFreeIndexPage() and IndexFreeSpaceMapVacuum(). Is it
correctto move this to first scan? 

Best regards, Andrey Borodin.


Attachment

pgsql-hackers by date:

Previous
From: Don Seiler
Date:
Subject: Re: [PATCH] Include application_name in "connection authorized" log message
Next
From: Heikki Linnakangas
Date:
Subject: Re: pgsql: Fix parallel index and index-only scans to fall back toserial.