Re: GiST VACUUM - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: GiST VACUUM
Date
Msg-id 96ec7ebd-42b9-4df5-18a4-42181c8a5a41@iki.fi
In response to Re: GiST VACUUM  (Andrey Borodin <x4mmm@yandex-team.ru>)
Responses Re: GiST VACUUM
List pgsql-hackers
On 04/01/2019 02:47, Andrey Borodin wrote:
>> On 2 Jan 2019, at 20:35, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>
>> In patch #1, to do the vacuum scan in physical order:
>> ...
>> I think this is ready to be committed, except that I didn't do any testing. We discussed the need for testing
>> earlier. Did you write some test scripts for this, or how have you been testing?
 
> Please see test I used to check left jumps for v18:
> 0001-Physical-GiST-scan-in-VACUUM-v18-with-test-modificat.patch
> 0002-Test-left-jumps-v18.patch
> 
> To trigger FollowRight GiST sometimes forget to clear follow-right marker simulating crash of an insert. This fills
> logs with "fixing incomplete split" messages. Search for "REMOVE THIS" to disable these ill-behavior triggers.
 
> To trigger NSN jump GiST allocate empty page after every real allocation.
> 
> In log output I see
> 2019-01-03 22:27:30.028 +05 [54596] WARNING:  RESCAN TRIGGERED BY NSN
> WARNING:  RESCAN TRIGGERED BY NSN
> 2019-01-03 22:27:30.104 +05 [54596] WARNING:  RESCAN TRIGGERED BY FollowRight
> This means that code paths were really executed (for v18).

Thanks! As I noted at 
https://www.postgresql.org/message-id/2ff57b1f-01b4-eacf-36a2-485a12017f6e%40iki.fi, 
the test patch left the index corrupt. I fixed it so that it leaves 
behind incompletely split pages, without the corruption, see attached. 
It's similar to yours, but let me recap what it does:

* Hacks gistbuild() to create 100 empty pages immediately after the root 
page. They are leaked, so they won't be reused until a VACUUM sees 
them and puts them in the FSM

* Hacks gistinserttuples() to leave the split incomplete with 50% 
probability

* In vacuum, print a line to the log whenever it needs to "jump left"
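
For readers unfamiliar with the term, the "jump left" decision can be modelled roughly as below. This is a toy sketch in Python, not the patch's C code: the `Page` class, the `jump_left_reason` function and its parameters are all hypothetical names, and the condition is a simplified reading of the idea that a physical-order scan must revisit a right sibling that sits at a lower block number when the current page was split after the scan started or its split was left incomplete.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Toy stand-in for a GiST page header; not PostgreSQL's real struct."""
    nsn: int                   # sequence number stamped by the page's last split
    follow_right: bool         # True if a split of this page was left incomplete
    rightlink: Optional[int]   # block number of the right sibling, if any


def jump_left_reason(page: Page, blkno: int, start_nsn: int) -> Optional[str]:
    """Return why a physical-order scan at blkno must revisit the right
    sibling, or None if no jump is needed (assumed, simplified condition)."""
    if page.rightlink is None or page.rightlink >= blkno:
        # No sibling, or the sibling lies ahead of the scan and will be
        # reached in physical order anyway.
        return None
    if page.follow_right:
        return "FollowRight"
    if page.nsn > start_nsn:
        # The page was split after the scan began.
        return "NSN"
    return None
```

For example, `jump_left_reason(Page(nsn=5, follow_right=False, rightlink=3), blkno=10, start_nsn=2)` returns `"NSN"`, which corresponds to the "RESCAN TRIGGERED BY NSN" line in the test patch's log output.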

I used this patch, with the attached test script that's similar to 
yours, but it also tries to verify that the index returns correct 
results. It prints a result set like this:

    sum
---------
  -364450
   364450
(2 rows)

If the two numbers add up to 0, the index seems valid. And you should 
see "RESCAN" lines in the log, to show that jumping left happened. 
Because the behavior is random and racy, you may need to run the script 
many times to see both the "RESCAN TRIGGERED BY NSN" and the "RESCAN 
TRIGGERED BY FollowRight" cases. The "FollowRight" case in particular 
happens less frequently than the NSN case; you might need to run the 
script more than 10 times to see it.
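
When repeating the run many times, the two checks described above can be automated. The following is a minimal sketch in Python, separate from the attached test script: the function names `sums_cancel` and `rescan_reasons` are made up for illustration, and it assumes the psql output and the server log have already been read into strings.

```python
import re


def sums_cancel(psql_output: str) -> bool:
    """True if the result set has exactly two integer rows that add up to 0."""
    values = [
        int(line)
        for line in psql_output.splitlines()
        if re.fullmatch(r"\s*-?\d+\s*", line)  # skips header, rule, row count
    ]
    return len(values) == 2 and sum(values) == 0


def rescan_reasons(log_text: str) -> set:
    """Collect which 'jump left' cases appear in the server log."""
    return set(re.findall(r"RESCAN TRIGGERED BY (NSN|FollowRight)", log_text))


def run_covered_both_cases(psql_output: str, log_text: str) -> bool:
    """True if the index looked valid and both rescan cases were exercised."""
    return sums_cancel(psql_output) and rescan_reasons(log_text) == {
        "NSN",
        "FollowRight",
    }
```

A driver loop could rerun the test script until `run_covered_both_cases` returns true, stopping with an error as soon as `sums_cancel` is false.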

I also tried your amcheck tool with this. It did not report any errors.

Also attached is the latest version of the patch itself. It is the same as 
your latest patch v19, except for some tiny comment kibitzing. I'll mark 
this as Ready for Committer in the commitfest app, and will try to 
commit it in the next couple of days.

- Heikki
