Re: Index corruption / planner issue with one table in my pg 11.6 instance - Mailing list pgsql-hackers

From	Jeremy Finzel
Subject	Re: Index corruption / planner issue with one table in my pg 11.6 instance
Date	December 10, 2019 17:25:18
Msg-id	CAMa1XUjHNMMtMF=ZtvSb4uF1y=SXu+V9AmEWT+FtFfFs3SLrTg@mail.gmail.com
In response to	Re: Index corruption / planner issue with one table in my pg 11.6 instance (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Index corruption / planner issue with one table in my pg 11.6 instance (Jeremy Finzel <finzelj@gmail.com>)
List	pgsql-hackers

On Tue, Dec 10, 2019 at 12:09 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah. The reported behavior can mostly be explained if we assume
that there's some HOT chain in the table that involves an update
of this particular column, so that if we build an index on that
column we see a broken HOT chain, but building an index on some
other column doesn't have a problem.

As far as I can tell, the problem occurs when indexing *any column* of *this particular table*. I tried the same experiment on another table in the same replication stream and could not reproduce it.

I am building the index **non-concurrently** every time.

The thing this doesn't easily explain is that the behavior persists
across repeated index rebuilds. A broken HOT chain is only broken
as long as the older entry is still visible-to-somebody, so that
such situations ought to be self-healing as time passes. If it
fails repeatedly, this theory requires assuming that either

1. You've got some extremely old open transactions (maybe forgotten
prepared transactions?), or

There are no prepared transactions (pg_prepared_xacts is empty) and no transactions older than a few hours. Transactions lasting several hours are common in this reporting system. I have not yet checked whether the index starts showing up in plans after several hours.
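
For reference, a minimal sketch of how these two conditions can be checked, assuming the standard pg_prepared_xacts and pg_stat_activity views. The exact queries that were run are not given in this thread, so treat these as illustrative:

```sql
-- Forgotten prepared transactions, oldest first.
SELECT gid, prepared, owner, database
FROM pg_prepared_xacts
ORDER BY prepared;

-- Sessions holding the oldest transactions or snapshots.
-- backend_xmin is the oldest xid the session's snapshot can still see.
SELECT pid, state, xact_start, backend_xid, backend_xmin,
       age(backend_xmin) AS xmin_age
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
   OR backend_xid IS NOT NULL
ORDER BY xact_start NULLS LAST;
```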

2. Your workload is constantly generating new broken HOT chains of
the same sort, so that there's usually a live one when you try
to build an index.

The fact that you even notice the indcheckxmin restriction indicates
that you do tend to have long-running transactions in the system,
else the index would come free for use fairly quickly. So #1 isn't
as implausible as I might otherwise think. But #2 seems probably
more likely on the whole. OTOH, neither point is exactly within
the offered evidence.
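
A sketch of how to see whether the indcheckxmin restriction is in effect for the new index. As I understand it, when indcheckxmin is true the planner ignores the index until the xmin of its pg_index row is older than the xmin of the planning transaction's snapshot. The table name my_table is a placeholder, not a name from this thread:

```sql
-- my_table is a hypothetical name for the affected table.
SELECT c.relname      AS index_name,
       i.indcheckxmin,            -- true if a broken HOT chain was seen at build time
       i.xmin         AS pg_index_row_xmin,
       age(i.xmin)    AS pg_index_row_xmin_age
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE i.indrelid = 'my_table'::regclass;
```

Comparing pg_index_row_xmin against the oldest backend_xmin in pg_stat_activity gives a rough idea of which session is still holding the index back.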

Is there a way for me to test this theory? I tried the following with no change in behavior:

1. Disable write load to the table
2. VACUUM ANALYZE the table (not VACUUM FULL)
3. CREATE INDEX
4. EXPLAIN

Still did not pick up the index.
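
The sequence above can be sketched roughly as follows. All object names (my_table, some_col, my_table_some_col_idx) and the literal in the EXPLAIN query are placeholders, since the real table and query are not shown in this thread:

```sql
-- Step 1 (disabling the write load) happens outside SQL,
-- e.g. by pausing whatever replicates into the table.

-- Step 2: plain VACUUM with ANALYZE, not VACUUM FULL.
VACUUM ANALYZE my_table;

-- Step 3: non-concurrent index build on a hypothetical column.
CREATE INDEX my_table_some_col_idx ON my_table (some_col);

-- Step 4: check whether the planner considers the new index.
-- Assumes some_col is an integer column; adjust the predicate as needed.
EXPLAIN SELECT * FROM my_table WHERE some_col = 42;
```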

Thanks,

Jeremy