Re: [HACKERS] ginInsertCleanup called from vacuum could still misstuples to be deleted - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: [HACKERS] ginInsertCleanup called from vacuum could still misstuples to be deleted
Date
Msg-id CAMkU=1zeCEVOWY_+vYMfcj8gC0BrL5caMvXO40c0PBttoM5Yzg@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] ginInsertCleanup called from vacuum could still misstuples to be deleted  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, Nov 16, 2017 at 12:29 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Nov 16, 2017 at 7:08 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> Agreed, that's better. Attached updated patch.
> Also I've added this to the next CF so as not to forget.

Committed and back-patched.  While I'm fairly sure this is a correct
fix, post-commit review from someone who knows GIN better than I do
would be a great idea.

I am also clear on what the consequences of this bug are.  It seems
like it could harm insert performance by making us wait when we
shouldn't, but can it cause corruption?  That I'm not sure about.  If
there's already a cleanup of the pending list in progress, how do
things go wrong?


 
It can cause corruption.  Attached is a test case to show it.  It is based on my usual crash-recovery test, but to get corruption I had to turn off the crash-injection (which means the test can be run on an unpatched server) and increase gin_pending_list_limit.

On 8 CPU, it took anywhere from 15 minutes to 8 hours to see corruption, which presents as something like this in the script output:

child abnormal exit update did not update 1 row: key 8117 updated 2 at count.pl line 193.\n  at count.pl line 201.

Your commit has apparently fixed it, but I will run some more extensive tests.

The source of the corruption is this:

A tuple is inserted into the pending list.  It is quickly updated again, or deleted, and then passes the dead-to-all horizon, so is eligible to vacuum.

A user back end takes the lock to start cleaning the pending list.

A vacuum back end attempts the lock, but due to the bug, it does so conditionally and then decides it is done when that fails, and starts the index vacuum proper.  

The user back end reaches the dead tuple in the pending list, and inserts into the main part of the index, in a part that the vacuum has already passed over.

The vacuum goes back to the table and marks the tid slot as re-usable, even though there is still an index tuple pointing to it.

Cheers,

Jeff

Attachment

pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: [HACKERS] Custom compression methods
Next
From: Amit Kapila
Date:
Subject: Re: explain analyze output with parallel workers - question aboutmeaning of information for explain.depesz.com