Re: GIN improvements part 3: ordering in index - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: GIN improvements part 3: ordering in index
Msg-id 51D84D93.3040709@fuzzy.cz
In response to Re: GIN improvements part 3: ordering in index  (Alexander Korotkov <aekorotkov@gmail.com>)
List pgsql-hackers
Hi,

this is a follow-up to the message I posted to the thread about
additional info in GIN.

I've applied all three patches (ginaddinfo7.patch, gin_fast_scan.4.patch
and gin_ordering.4.patch) onto commit b8fd1a09. I ended up with two
definitions of 'cmpEntries' in ginget.c, but I suppose this is due to
the split of the patch into multiple pieces. The definitions are exactly
the same, so I've commented out the second one.

After applying the fast scan patch, queries fail with 'buffer is not
owned by resource owner Portal' errors, and the ordering patch causes
segmentation faults when loading the data.

Loading the data is basically a bunch of INSERT statements into the
"messages" table, with a GIN index on the message body. The table and
index are defined like this:

CREATE TABLE messages (

    id                SERIAL PRIMARY KEY,
    parent_id         INT REFERENCES messages(id),
    thread_id         INT,
    level             INT,
    hash_id           VARCHAR(32) NOT NULL UNIQUE,

    list              VARCHAR(32) NOT NULL REFERENCES lists(id),
    message_id        VARCHAR(200),
    in_reply_to       TEXT[],
    refs              TEXT[],
    sent              TIMESTAMP,

    subject           TEXT,
    author            TEXT,

    body_plain        TEXT,

    body_tsvector     tsvector,
    subject_tsvector  tsvector,

    headers           HSTORE,
    raw_message       TEXT
);

CREATE INDEX message_body_idx on messages using gin(body_tsvector);
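
To illustrate the kind of statements involved, here is a minimal sketch
of one load INSERT and one full-text query that can use the index. All
values are made up, the 'english' text search configuration is an
assumption, and a matching row is assumed to already exist in "lists";
the real load script is not shown here.

```sql
-- Illustrative only: values are invented, and a row 'pgsql-hackers'
-- is assumed to exist in the "lists" table already.
INSERT INTO messages (hash_id, list, subject, body_plain, body_tsvector)
VALUES (
    md5('example-message-1'),           -- 32 hex characters, fits VARCHAR(32)
    'pgsql-hackers',
    'GIN test',
    'testing the gin index ordering patch',
    to_tsvector('english', 'testing the gin index ordering patch')
);

-- A full-text search on the indexed column; the planner may choose
-- message_body_idx for the @@ operator.
SELECT id, subject
  FROM messages
 WHERE body_tsvector @@ to_tsquery('english', 'gin & index');
```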

I've observed three failure scenarios:

1) autovacuum runs VACUUM on the 'messages' table and fails, killing
   all the connections, with this message in the server log:

    LOG:  server process (PID 16611) was terminated by signal
          11: Segmentation fault
    DETAIL:  Failed process was running: autovacuum: ANALYZE
             public.messages


2) a manual run of VACUUM on the table, with about the same result and
   this output on the console (and the same segfault in the server log):

    archie=# vacuum messages;
    WARNING:  relation "messages" page 6226 is uninitialized --- fixing
    WARNING:  relation "messages" page 6227 is uninitialized --- fixing
    WARNING:  relation "messages" page 6228 is uninitialized --- fixing
    WARNING:  relation "messages" page 6229 is uninitialized --- fixing
    WARNING:  relation "messages" page 6230 is uninitialized --- fixing
    WARNING:  relation "messages" page 6231 is uninitialized --- fixing
    WARNING:  relation "messages" page 6232 is uninitialized --- fixing
    WARNING:  relation "messages" page 6233 is uninitialized --- fixing
    The connection to the server was lost. Attempting reset: Failed.


3) with autovacuum disabled, the load itself fails (always at exactly
   the same place). I have collected a backtrace from gdb (after
   recompiling with optimization disabled), see the attachment.

All three scenarios might actually be caused by the same bug, as I've
checked the backtrace for the VACUUM and it fails at exactly the same
place as the third case.

regards
Tomas
