Thread: Suggest note in index documentation about long running transactions

Suggest note in index documentation about long running transactions

From

Chris Travers

Date:

15 February 2016, 22:22:16

Hi;

Today I ran into a question from a client as to why an index was not used. The index had been freshly created and was on a relatively small table (16k live rows, but 300k dead tuples). The resulting sequential scan was taking half a second.

I found that even when setting enable_seqscan to off it was still refusing to use the index. After reading carefully through the index documentation yet again, it was not clear why it was not used.

After much research I came across an email by Tom Lane about how the HOT enhancements in 8.3 meant that indexes might not be usable until after the longest running transaction committed. This turned out to be the culpret (we had a transaction that took about 15 hours to complete and when it committed the index was used).

It might help if there is a note that indexes in some cases cannot be used until the min xid advances to the point where the index was created.

Best Wishes,

Chris Travers

Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in.

http://www.efficito.com/learn_more

Re: Suggest note in index documentation about long running transactions

From

Chris Travers

Date:

16 February 2016, 11:30:33

Unless there is a sense that this is a bad idea I will submit a doc patch.

On Mon, Feb 15, 2016 at 8:22 PM, Chris Travers <chris.travers@gmail.com> wrote:

Hi;

Today I ran into a question from a client as to why an index was not used. The index had been freshly created and was on a relatively small table (16k live rows, but 300k dead tuples). The resulting sequential scan was taking half a second.

I found that even when setting enable_seqscan to off it was still refusing to use the index. After reading carefully through the index documentation yet again, it was not clear why it was not used.

After much research I came across an email by Tom Lane about how the HOT enhancements in 8.3 meant that indexes might not be usable until after the longest running transaction committed. This turned out to be the culpret (we had a transaction that took about 15 hours to complete and when it committed the index was used).

It might help if there is a note that indexes in some cases cannot be used until the min xid advances to the point where the index was created.

--
Best Wishes,
Chris Travers

Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in.
http://www.efficito.com/learn_more

Best Wishes,

Chris Travers

Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in.

http://www.efficito.com/learn_more

Re: Suggest note in index documentation about long running transactions

From

Tom Lane

Date:

16 February 2016, 18:54:48

Chris Travers <chris.travers@gmail.com> writes:
> Unless there is a sense that this is a bad idea I will submit a doc patch.

I was already working on it ... I think what we want is something along
this line in the "Building Indexes Concurrently" section of the CREATE
INDEX ref page:

*** create_index.sgml.orig     Tue Feb  2 10:56:09 2016
--- create_index.sgml       Tue Feb 16 10:48:38 2016
*************** CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ]
*** 443,448 ****
--- 443,452 ----
      scan starts.   Concurrent index creation serially waits for each old
      transaction to complete using the method outlined in section <xref
      linkend="view-pg-locks">.
+     In addition, once the <command>CREATE INDEX CONCURRENTLY</> command does
+     complete, the index is still not immediately ready for use: it cannot
+     be used until all transactions that were active when index creation
+     started are gone.
     </para>

     <para>

However, now that I look at it, the existing text in this para is kind of
a mess too.  That cross-reference to pg_locks, for example, seems
gratuitous and confusing (because what it's pointing you at is a single
sentence deeply buried in a rather long page).  It's also not very clear
what is meant by "Any transaction active when the second table scan starts
can block concurrent index creation until it completes"; I think we need
to be a little clearer about when that happens or doesn't happen.

            regards, tom lane

Re: Suggest note in index documentation about long running transactions

From

Tom Lane

Date:

16 February 2016, 20:04:32

I wrote:
> Chris Travers <chris.travers@gmail.com> writes:
>> Unless there is a sense that this is a bad idea I will submit a doc patch.

> I was already working on it ... I think what we want is something along
> this line in the "Building Indexes Concurrently" section of the CREATE
> INDEX ref page:

After further perusal of the code I propose replacing that para with this
wording:

   <para>
    In a concurrent index build, the index is actually entered into
    the system catalogs in one transaction, then two table scans occur in
    two more transactions.  Before each table scan, the index build must
    wait for existing transactions that have modified the table to terminate.
    After the second scan, the index build must wait for any transactions
    that have a snapshot (see <xref linkend="mvcc">) predating the second
    scan to terminate.  Then finally the index can be marked ready for use,
    and the <command>CREATE INDEX</> command terminates.
    Even then, however, the index may not be immediately usable for queries:
    in the worst case, it cannot be used as long as transactions exist that
    predate the start of the index build.
   </para>

This is a good deal clearer, IMO, about the conditions under which
transactions block CREATE INDEX CONCURRENTLY; we need that because
otherwise it's not going to make much sense to talk about old
transactions still existing after the command completes.

The reason for the "worst case" weasel-wording is that the problem you
saw doesn't actually occur unless the index build detected some broken
HOT chains.  I do not want to get into explaining what those are here,
so it seemed best to just be vague about whether there's a delay in
index usability or not.

(BTW, I wondered whether this wasn't just a bug and we should make things
less confusing by having CREATE INDEX CONCURRENTLY not complete until
the index is fully usable.  However, it appears the reason we don't do
that is it would create a risk of two CREATE INDEX CONCURRENTLY commands
deadlocking, ie they'd each think they have to wait for the other one.)

            regards, tom lane

Re: Suggest note in index documentation about long running transactions

From

Chris Travers

Date:

16 February 2016, 20:24:23

On Tue, Feb 16, 2016 at 6:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:
> Chris Travers <chris.travers@gmail.com> writes:
>> Unless there is a sense that this is a bad idea I will submit a doc patch.

> I was already working on it ... I think what we want is something along
> this line in the "Building Indexes Concurrently" section of the CREATE
> INDEX ref page:

After further perusal of the code I propose replacing that para with this
wording:

<para>
In a concurrent index build, the index is actually entered into
the system catalogs in one transaction, then two table scans occur in
two more transactions. Before each table scan, the index build must
wait for existing transactions that have modified the table to terminate.
After the second scan, the index build must wait for any transactions
that have a snapshot (see <xref linkend="mvcc">) predating the second
scan to terminate. Then finally the index can be marked ready for use,
and the <command>CREATE INDEX</> command terminates.
Even then, however, the index may not be immediately usable for queries:
in the worst case, it cannot be used as long as transactions exist that
predate the start of the index build.
</para>

This is a good deal clearer, IMO, about the conditions under which
transactions block CREATE INDEX CONCURRENTLY; we need that because
otherwise it's not going to make much sense to talk about old
transactions still existing after the command completes.

Agreed.

The reason for the "worst case" weasel-wording is that the problem you
saw doesn't actually occur unless the index build detected some broken
HOT chains. I do not want to get into explaining what those are here,
so it seemed best to just be vague about whether there's a delay in
index usability or not.

I think that is also a good optimization, documentation wise.

(BTW, I wondered whether this wasn't just a bug and we should make things
less confusing by having CREATE INDEX CONCURRENTLY not complete until
the index is fully usable. However, it appears the reason we don't do
that is it would create a risk of two CREATE INDEX CONCURRENTLY commands
deadlocking, ie they'd each think they have to wait for the other one.)

I think even without the deadlocking that would be far worse than the current behavior. The current behavior is a bit opaque when it happens (and in this case I could certainly see HOT chains being a problem case on this db as it is large, but also that this specific table has tremendous turnover, and a few transactions which read from the table can be extremely long running -- have shortened the longest running case from about 4 days to about 18 hours -- don't ask).

If you do anything, raising a NOTICE that the index is deferred for usability might be a good thing, but the problems with delaying exit go well beyond deadlocks.

regards, tom lane

Best Wishes,

Chris Travers

Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in.

http://www.efficito.com/learn_more