Re: hung backends stuck in spinlock heavy endless loop - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: hung backends stuck in spinlock heavy endless loop
Date
Msg-id CAHyXU0yA4K82XGW5FZ6N5T8f-RfCixRQaUVCGMbUVKX6zf8ORg@mail.gmail.com
In response to Re: hung backends stuck in spinlock heavy endless loop  (Merlin Moncure <mmoncure@gmail.com>)
Responses Re: hung backends stuck in spinlock heavy endless loop
List pgsql-hackers
On Wed, Jan 14, 2015 at 6:26 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Wed, Jan 14, 2015 at 5:39 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> On Wed, Jan 14, 2015 at 3:38 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>>> (gdb) print  BufferGetBlockNumber(buf)
>>> $15 = 9
>>>
>>> ..and it stays 9, continuing several times having set breakpoint.
>>
>>
>> And the index involved? I'm pretty sure that this is an internal page, no?
>
> The index is the oid index on pg_class.  Some more info:
>
> *) temp table churn is fairly high.  Several dozen get spawned and
> destroyed at the start of a replication run, all at once, due to some
> dodgy coding via dblink.  During the replication run, the temp table
> churn rate drops.
>
> *)  running btreecheck, I see:
> cds2=# select bt_index_verify('pg_class_oid_index');
> NOTICE:  page 7 of index "pg_class_oid_index" is deleted
> NOTICE:  page 10 of index "pg_class_oid_index" is deleted
> NOTICE:  page 12 of index "pg_class_oid_index" is deleted
>  bt_index_verify
> ─────────────────
>
>
> cds2=# select bt_leftright_verify('pg_class_oid_index');
> WARNING:  left link/right link pair don't comport at level 0, block 9,
> last: 2, current left: 4
> WARNING:  left link/right link pair don't comport at level 0, block 9,
> last: 9, current left: 4
> WARNING:  left link/right link pair don't comport at level 0, block 9,
> last: 9, current left: 4
> WARNING:  left link/right link pair don't comport at level 0, block 9,
> last: 9, current left: 4
> WARNING:  left link/right link pair don't comport at level 0, block 9,
> last: 9, current left: 4
> [repeat infinity until cancel]
>
> which looks like the index is corrupted?  ISTM _bt_moveright is
> hanging because it's trying to move from block 9 to block 9 and so
> loops forever.

Per Peter, the following might be useful:

cds2=# select * from bt_metap('pg_class_oid_index');
 magic  │ version │ root │ level │ fastroot │ fastlevel
────────┼─────────┼──────┼───────┼──────────┼───────────
 340322 │       2 │    3 │     1 │        3 │         1

cds2=# select (bt_page_stats('pg_class_oid_index', s)).* from
generate_series(1,12) s;
 blkno │ type │ live_items │ dead_items │ avg_item_size │ page_size │ free_size │ btpo_prev │ btpo_next │ btpo  │ btpo_flags
───────┼──────┼────────────┼────────────┼───────────────┼───────────┼───────────┼───────────┼───────────┼───────┼────────────
     1 │ l    │        119 │          0 │            16 │      8192 │      5768 │         0 │         4 │     0 │          1
     2 │ l    │         25 │          0 │            16 │      8192 │      7648 │         4 │         9 │     0 │          1
     3 │ r    │          8 │          0 │            15 │      8192 │      7996 │         0 │         0 │     1 │          2
     4 │ l    │        178 │          0 │            16 │      8192 │      4588 │         1 │         2 │     0 │          1
     5 │ l    │          7 │          0 │            16 │      8192 │      8008 │         9 │        11 │     0 │          1
     6 │ l    │          5 │          0 │            16 │      8192 │      8048 │        11 │         8 │     0 │          1
     7 │ d    │          0 │          0 │             0 │      8192 │         0 │        -1 │        -1 │ 12366 │          0
     8 │ l    │        187 │          0 │            16 │      8192 │      4408 │         6 │         0 │     0 │          1
     9 │ l    │         25 │          0 │            16 │      8192 │      7648 │         4 │         9 │     0 │          1
    10 │ d    │          0 │          0 │             0 │      8192 │         0 │        -1 │        -1 │ 12366 │          0
    11 │ l    │          6 │          0 │            16 │      8192 │      8028 │         5 │         6 │     0 │          1
    12 │ d    │          0 │          0 │             0 │      8192 │         0 │        -1 │        -1 │ 10731 │          0
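
To make the suspected self-loop easier to see, here is a rough Python sketch that replays the leaf sibling links from the bt_page_stats output. It is only an illustration, not how _bt_moveright or btreecheck actually work: the LINKS dictionary is hand-copied from the btpo_prev/btpo_next columns, and the helper names walk_right and mismatched_pairs are made up for this example.

```python
# Leaf-page sibling links hand-copied from the bt_page_stats output:
# blkno -> (btpo_prev, btpo_next). The root (3) and the deleted pages
# (7, 10, 12) are left out. A link of 0 means "no sibling".
LINKS = {
    1: (0, 4),
    2: (4, 9),
    4: (1, 2),
    5: (9, 11),
    6: (11, 8),
    8: (6, 0),
    9: (4, 9),
    11: (5, 6),
}


def walk_right(links, start):
    """Follow btpo_next from `start`.

    Returns (path, looped_at): `looped_at` is the block that was reached
    a second time, or None if the chain ended normally at link 0.
    """
    seen = set()
    path = []
    blk = start
    while blk != 0:
        if blk in seen:
            return path, blk
        seen.add(blk)
        path.append(blk)
        blk = links[blk][1]
    return path, None


def mismatched_pairs(links):
    """List (blk, right, right_prev) where right's left link is not blk."""
    bad = []
    for blk, (_, right) in sorted(links.items()):
        if right != 0 and links[right][0] != blk:
            bad.append((blk, right, links[right][0]))
    return bad


if __name__ == "__main__":
    path, looped_at = walk_right(LINKS, 1)
    print("right-link walk from block 1:", path, "loops at:", looped_at)
    for blk, right, right_prev in mismatched_pairs(LINKS):
        print(f"block {blk} -> {right}, but {right}'s left link is {right_prev}")
```

With these links, the walk from block 1 goes 1, 4, 2, 9 and then arrives at 9 again, and the two mismatched pairs (2 to 9, and 9 to 9, both with left link 4) line up with the "last: 2" and "last: 9" warnings above.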


merlin
