Re: [GENERAL] gin index trouble - Mailing list pgsql-general

From Rob Sargent
Subject Re: [GENERAL] gin index trouble
Date
Msg-id 07d66523-c7ce-48e0-b436-4a027a1619cd@gmail.com
Whole thread Raw
In response to Re: [GENERAL] gin index trouble  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-general



On 10/30/2017 10:56 AM, Peter Geoghegan wrote:
On Mon, Oct 30, 2017 at 9:45 AM, Rob Sargent <robjsargent@gmail.com> wrote:
Peter, you beat me to the punch.  I was just about to say "Having read the
referenced message I thought I would add that we never delete from this
table."  In this particular case it was written to record by record, in a
previous execution and at the time of the error it was only being read.  (In
case you've been following, the failed execution would have added ~1M
"segments", each which references an entry in the gin'd table "probandsets"
- but like a rookie I'm looking up each probandset(2^16) individually.
Re-working that NOW.)
It's not surprising that only a SELECT statement could see this
problem. I guess that it's possible that only page deletions used for
the pending list are involved here.

I'm not sure how reliably you can recreate the problem, but if it
doesn't take too long then it would be worth seeing what effect
turning off the FASTUPDATE storage parameter for the GIN index has.
That could prevent the problem from recurring, and would support my
theory about what's up here. (It wouldn't fix the corruption, though.)

Of course, what I'd much prefer is a self-contained test case. But if
you can't manage that, or if reproducing the issue takes hours, then
this simpler experiment might be worthwhile.

My test database machine is:
Not virtual
Intel(R) Xeon(R) CPU E3-1241 v3 @ 3.50GHz (quad core, hyperthreaded)
MemTotal:       16272548 kB
default postgres.conf from yum install postgresql-10*
I've loaded thrice the number of records (190K) into the problem table, but no sign yet of the problem. But unlike the production lookup-notfind-insert (anti)pattern, these were all loaded in a single transaction. I think the following query has to read the gin'd column of every record:
select array_length(probands,1) as heads,
       count(*) as occurs
from tld.probandset
where probands @> '{65fe3b60-1c86-4b14-a85d-21abdf68f9e2,f0963403-3f3c-426d-a828-b5bfff914bb4}' group by array_length(probands,1)
order by array_length(probands,1);

 heads | occurs
-------+--------
     2 |      1
     3 |     14
     4 |     91
     5 |    364
     6 |   1001
     7 |   2002
     8 |   3003
     9 |   3432
    10 |   3003
    11 |   2002
    12 |   1001
    13 |    364
    14 |     91
    15 |     14
    16 |      1
(15 rows)

Time: 17.125 ms
Happy as a clam.

I'll try a run of the antipattern.  I have NOT diddled
FASTUPDATE at all.



pgsql-general by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: [GENERAL] EXPLAIN command just hangs...
Next
From: Rhhh Lin
Date:
Subject: Re: [GENERAL] EXPLAIN command just hangs...