Re: hung backends stuck in spinlock heavy endless loop - Mailing list pgsql-hackers
From | Merlin Moncure |
---|---|
Subject | Re: hung backends stuck in spinlock heavy endless loop |
Date | |
Msg-id | CAHyXU0yA4K82XGW5FZ6N5T8f-RfCixRQaUVCGMbUVKX6zf8ORg@mail.gmail.com |
In response to | Re: hung backends stuck in spinlock heavy endless loop (Merlin Moncure <mmoncure@gmail.com>) |
Responses |
Re: hung backends stuck in spinlock heavy endless loop
|
List | pgsql-hackers |
On Wed, Jan 14, 2015 at 6:26 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Wed, Jan 14, 2015 at 5:39 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> On Wed, Jan 14, 2015 at 3:38 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>>> (gdb) print BufferGetBlockNumber(buf)
>>> $15 = 9
>>>
>>> ..and it stays 9, continuing several times having set breakpoint.
>>
>>
>> And the index involved? I'm pretty sure that this is an internal page, no?
>
> The index is the oid index on pg_class. Some more info:
>
> *) temp table churn is fairly high. Several dozen get spawned and
> destroyed at the start of a replication run, all at once, due to some
> dodgy coding via dblink. During the replication run, the temp table
> churn rate drops.
>
> *) running btreecheck, I see:
> cds2=# select bt_index_verify('pg_class_oid_index');
> NOTICE: page 7 of index "pg_class_oid_index" is deleted
> NOTICE: page 10 of index "pg_class_oid_index" is deleted
> NOTICE: page 12 of index "pg_class_oid_index" is deleted
>  bt_index_verify
> ─────────────────
>
>
> cds2=# select bt_leftright_verify('pg_class_oid_index');
> WARNING: left link/right link pair don't comport at level 0, block 9,
> last: 2, current left: 4
> WARNING: left link/right link pair don't comport at level 0, block 9,
> last: 9, current left: 4
> WARNING: left link/right link pair don't comport at level 0, block 9,
> last: 9, current left: 4
> WARNING: left link/right link pair don't comport at level 0, block 9,
> last: 9, current left: 4
> WARNING: left link/right link pair don't comport at level 0, block 9,
> last: 9, current left: 4
> [repeat infinity until cancel]
>
> which looks like the index is corrupted? ISTM _bt_moveright is
> hanging because it's trying to move from block 9 to block 9 and so
> loops forever.
per Peter the following might be useful:

cds2=# select * from bt_metap('pg_class_oid_index');
 magic  │ version │ root │ level │ fastroot │ fastlevel
────────┼─────────┼──────┼───────┼──────────┼───────────
 340322 │       2 │    3 │     1 │        3 │         1

cds2=# select (bt_page_stats('pg_class_oid_index', s)).* from generate_series(1,12) s;
 blkno │ type │ live_items │ dead_items │ avg_item_size │ page_size │ free_size │ btpo_prev │ btpo_next │ btpo  │ btpo_flags
───────┼──────┼────────────┼────────────┼───────────────┼───────────┼───────────┼───────────┼───────────┼───────┼────────────
     1 │ l    │        119 │          0 │            16 │      8192 │      5768 │         0 │         4 │     0 │          1
     2 │ l    │         25 │          0 │            16 │      8192 │      7648 │         4 │         9 │     0 │          1
     3 │ r    │          8 │          0 │            15 │      8192 │      7996 │         0 │         0 │     1 │          2
     4 │ l    │        178 │          0 │            16 │      8192 │      4588 │         1 │         2 │     0 │          1
     5 │ l    │          7 │          0 │            16 │      8192 │      8008 │         9 │        11 │     0 │          1
     6 │ l    │          5 │          0 │            16 │      8192 │      8048 │        11 │         8 │     0 │          1
     7 │ d    │          0 │          0 │             0 │      8192 │         0 │        -1 │        -1 │ 12366 │          0
     8 │ l    │        187 │          0 │            16 │      8192 │      4408 │         6 │         0 │     0 │          1
     9 │ l    │         25 │          0 │            16 │      8192 │      7648 │         4 │         9 │     0 │          1
    10 │ d    │          0 │          0 │             0 │      8192 │         0 │        -1 │        -1 │ 12366 │          0
    11 │ l    │          6 │          0 │            16 │      8192 │      8028 │         5 │         6 │     0 │          1
    12 │ d    │          0 │          0 │             0 │      8192 │         0 │        -1 │        -1 │ 10731 │          0

merlin
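For readers following along, the sibling links in the bt_page_stats output can be walked by hand. The sketch below is only an illustration in Python, not how btreecheck or _bt_moveright is implemented: the names LEAF_LINKS and walk_right are hypothetical, the link values are copied from the live leaf rows above, and it assumes 0 means "no sibling". It follows btpo_next from the leftmost leaf, checks that each page's btpo_prev points back at the page just visited, and stops if a block repeats instead of spinning forever.

```python
# Live leaf pages from the bt_page_stats output: blkno -> (btpo_prev, btpo_next).
# Deleted pages (7, 10, 12) and the root (3) are left out.
# Assumption: 0 means "no sibling" on that side.
LEAF_LINKS = {
    1: (0, 4),
    2: (4, 9),
    4: (1, 2),
    5: (9, 11),
    6: (11, 8),
    8: (6, 0),
    9: (4, 9),
    11: (5, 6),
}


def walk_right(links, start):
    """Follow right-links from `start`, reporting mismatched left-links and cycles."""
    problems = []
    seen = set()
    last = 0          # block visited just before the current one
    current = start
    while current != 0:
        if current in seen:
            # Without this guard the walk would never terminate.
            problems.append(f"cycle: block {current} reached twice")
            break
        seen.add(current)
        prev, nxt = links[current]
        if prev != last:
            problems.append(f"block {current}: last {last}, current left {prev}")
        last, current = current, nxt
    return problems, seen


problems, seen = walk_right(LEAF_LINKS, start=1)
for line in problems:
    print(line)
print("live leaf pages never reached:", sorted(set(LEAF_LINKS) - seen))
```

With these inputs the walk goes 1, 4, 2, 9, flags block 9 because its left link is 4 while the previous block was 2, and then stops because block 9's right link points back to itself. Blocks 5, 6, 8 and 11 are never reached from the leftmost leaf.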