Thread: GIN logging GIN_SEGMENT_UNMODIFIED actions?
Hi, trying to debug something I saw the following in pg_xlogdump output: rmgr: Gin len (rec/tot): 0/ 274, tx: 0, lsn: 1C/DF28AEB0, prev 1C/DF289858, desc: VACUUM_DATA_LEAF_PAGE 3 segments: 5 unknown action 0 ???, blkref #0: rel 1663/16384/16435 blk 310982 note the "segments: 5 unknown action 0 ???" bit. That doesn't seem right, given: #define GIN_SEGMENT_UNMODIFIED 0 /* no action (not used in WAL records) */ Greetings, Andres Freund
trying to debug something I saw the following in pg_xlogdump output:
rmgr: Gin len (rec/tot): 0/ 274, tx: 0, lsn: 1C/DF28AEB0, prev 1C/DF289858, desc: VACUUM_DATA_LEAF_PAGE 3 segments: 5 unknown action 0 ???, blkref #0: rel 1663/16384/16435 blk 310982
note the "segments: 5 unknown action 0 ???" bit. That doesn't seem
right, given:
#define GIN_SEGMENT_UNMODIFIED 0 /* no action (not used in WAL records) */
I've checked GIN code. Have no idea of how such wal record could be generated...
The only idea I have is to add check that we're inserting valid WAL record immediately before XLogInsert().
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Tue, May 10, 2016 at 9:57 PM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote: > Hi! > > On Mon, May 9, 2016 at 10:46 PM, Andres Freund <andres@anarazel.de> wrote: >> >> trying to debug something I saw the following in pg_xlogdump output: >> >> rmgr: Gin len (rec/tot): 0/ 274, tx: 0, lsn: >> 1C/DF28AEB0, prev 1C/DF289858, desc: VACUUM_DATA_LEAF_PAGE 3 segments: 5 >> unknown action 0 ???, blkref #0: rel 1663/16384/16435 blk 310982 >> >> note the "segments: 5 unknown action 0 ???" bit. That doesn't seem >> right, given: >> #define GIN_SEGMENT_UNMODIFIED 0 /* no action (not used in >> WAL records) */ > > > I've checked GIN code. Have no idea of how such wal record could be > generated... I encountered the same issue when executing the following queries and running pg_xlogdump. CREATE EXTENSION pg_trgm; CREATE TABLE test (col1 TEXT); CREATE INDEX testidx ON test USING gin (col1 gin_trgm_ops) WITH (fastupdate = off); INSERT INTO test SELECT 'ABCDE' FROM generate_series(1,10000); DELETE FROM test; VACUUM test; $ pg_xlogdump data/pg_xlog/000000010000000000000004 | grep Gin | grep action rmgr: Gin len (rec/tot): 0/ 88, tx: 0, lsn: 0/04A4B468, prev 0/04A4B438, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 11 rmgr: Gin len (rec/tot): 0/ 78, tx: 0, lsn: 0/04A4B4C0, prev 0/04A4B468, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 10 rmgr: Gin len (rec/tot): 0/ 88, tx: 0, lsn: 0/04A4B510, prev 0/04A4B4C0, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 13 rmgr: Gin len (rec/tot): 0/ 78, tx: 0, lsn: 0/04A4B568, prev 0/04A4B510, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 12 rmgr: Gin len (rec/tot): 0/ 88, tx: 0, lsn: 0/04A4B5B8, prev 0/04A4B568, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 15 rmgr: Gin len (rec/tot): 0/ 78, tx: 0, lsn: 0/04A4B610, prev 0/04A4B5B8, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 14 rmgr: Gin len (rec/tot): 0/ 88, tx: 0, lsn: 0/04A4B660, prev 0/04A4B610, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 17 rmgr: Gin len (rec/tot): 0/ 78, tx: 0, lsn: 0/04A4B6B8, prev 0/04A4B660, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 16 rmgr: Gin len (rec/tot): 0/ 88, tx: 0, lsn: 0/04A4B708, prev 0/04A4B6B8, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 19 rmgr: Gin len (rec/tot): 0/ 78, tx: 0, lsn: 0/04A4B760, prev 0/04A4B708, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 18 rmgr: Gin len (rec/tot): 0/ 88, tx: 0, lsn: 0/04A4B7B0, prev 0/04A4B760, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 21 rmgr: Gin len (rec/tot): 0/ 78, tx: 0, lsn: 0/04A4B808, prev 0/04A4B7B0, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 20 Regards, -- Fujii Masao
On Fri, Aug 26, 2016 at 11:35 PM, Fujii Masao <masao.fujii@gmail.com> wrote: > On Tue, May 10, 2016 at 9:57 PM, Alexander Korotkov > <a.korotkov@postgrespro.ru> wrote: >> Hi! >> >> On Mon, May 9, 2016 at 10:46 PM, Andres Freund <andres@anarazel.de> wrote: >>> >>> trying to debug something I saw the following in pg_xlogdump output: >>> >>> rmgr: Gin len (rec/tot): 0/ 274, tx: 0, lsn: >>> 1C/DF28AEB0, prev 1C/DF289858, desc: VACUUM_DATA_LEAF_PAGE 3 segments: 5 >>> unknown action 0 ???, blkref #0: rel 1663/16384/16435 blk 310982 >>> >>> note the "segments: 5 unknown action 0 ???" bit. That doesn't seem >>> right, given: >>> #define GIN_SEGMENT_UNMODIFIED 0 /* no action (not used in >>> WAL records) */ >> >> >> I've checked GIN code. Have no idea of how such wal record could be >> generated... > > I encountered the same issue when executing the following queries and > running pg_xlogdump. ISTM that the cause of this issue is that gin_desc() uses XLogRecGetData() to extract ginxlogVacuumDataLeafPage data from XLOG_GIN_VACUUM_DATA_LEAF_PAGE record. Since it's registered by XLogRegisterBufData() in ginVacuumPostingTreeLeaf(), XLogRecGetBlockData() should be used, instead. Patch attached. Thought? BTW, in REDO side, ginRedoVacuumDataLeafPage() uses XLogRecGetBlockData() to extract ginxlogVacuumDataLeafPage data. Regards, -- Fujii Masao
Attachment
Fujii Masao <masao.fujii@gmail.com> writes: > ISTM that the cause of this issue is that gin_desc() uses XLogRecGetData() to > extract ginxlogVacuumDataLeafPage data from XLOG_GIN_VACUUM_DATA_LEAF_PAGE > record. Since it's registered by XLogRegisterBufData() in > ginVacuumPostingTreeLeaf(), > XLogRecGetBlockData() should be used, instead. Patch attached. Thought? I think we probably have more issues than that. See for example https://www.postgresql.org/message-id/flat/20160826072658.15676.7628%40wrigleys.postgresql.org which clearly shows that the replay logic is seeing something wrong too: 2016-08-26 06:01:50 UTC FATAL: unexpected GIN leaf action: 0 2016-08-26 06:01:50 UTC CONTEXT: xlog redo Insert item, node: 1663/16387/33108 blkno: 6622 isdata: T isleaf: T 3 segments: 2 (add 0 items) 0 unknown action 0 ??? If it were just a matter of gin_desc() being wrong, we'd not have gotten such a failure. (Which is not to say that gin_desc() isn't wrong; it may well be.) regards, tom lane
On Tue, Aug 30, 2016 at 3:39 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Fujii Masao <masao.fujii@gmail.com> writes: >> ISTM that the cause of this issue is that gin_desc() uses XLogRecGetData() to >> extract ginxlogVacuumDataLeafPage data from XLOG_GIN_VACUUM_DATA_LEAF_PAGE >> record. Since it's registered by XLogRegisterBufData() in >> ginVacuumPostingTreeLeaf(), >> XLogRecGetBlockData() should be used, instead. Patch attached. Thought? > > I think we probably have more issues than that. See for example > https://www.postgresql.org/message-id/flat/20160826072658.15676.7628%40wrigleys.postgresql.org > > which clearly shows that the replay logic is seeing something wrong too: > > 2016-08-26 06:01:50 UTC FATAL: unexpected GIN leaf action: 0 > 2016-08-26 06:01:50 UTC CONTEXT: xlog redo Insert item, node: > 1663/16387/33108 blkno: 6622 isdata: T isleaf: T 3 segments: 2 (add 0 items) > 0 unknown action 0 ??? > > If it were just a matter of gin_desc() being wrong, we'd not have > gotten such a failure. (Which is not to say that gin_desc() isn't > wrong; it may well be.) Yeah, I got the pg_xlogdump issue while I was trying to reproduce the above reported problem. Fixing pg_xlogdump would be helpful for the analysis of that problem. Regards, -- Fujii Masao
On Tue, Aug 30, 2016 at 3:13 PM, Fujii Masao <masao.fujii@gmail.com> wrote: > On Tue, Aug 30, 2016 at 3:39 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Fujii Masao <masao.fujii@gmail.com> writes: >>> ISTM that the cause of this issue is that gin_desc() uses XLogRecGetData() to >>> extract ginxlogVacuumDataLeafPage data from XLOG_GIN_VACUUM_DATA_LEAF_PAGE >>> record. Since it's registered by XLogRegisterBufData() in >>> ginVacuumPostingTreeLeaf(), >>> XLogRecGetBlockData() should be used, instead. Patch attached. Thought? >> >> I think we probably have more issues than that. See for example >> https://www.postgresql.org/message-id/flat/20160826072658.15676.7628%40wrigleys.postgresql.org >> >> which clearly shows that the replay logic is seeing something wrong too: >> >> 2016-08-26 06:01:50 UTC FATAL: unexpected GIN leaf action: 0 >> 2016-08-26 06:01:50 UTC CONTEXT: xlog redo Insert item, node: >> 1663/16387/33108 blkno: 6622 isdata: T isleaf: T 3 segments: 2 (add 0 items) >> 0 unknown action 0 ??? >> >> If it were just a matter of gin_desc() being wrong, we'd not have >> gotten such a failure. (Which is not to say that gin_desc() isn't >> wrong; it may well be.) > > Yeah, I got the pg_xlogdump issue while I was trying to reproduce > the above reported problem. Fixing pg_xlogdump would be helpful for > the analysis of that problem. Attached is the updated version of the patch. I found that pg_xlogdump code for XLOG_GIN_INSERT record with GIN_INSERT_ISLEAF flag has the same issue, i.e., "unknown action 0" error is thrown for that record. The latest patch fixes this. Regards, -- Fujii Masao
Attachment
Fujii Masao <masao.fujii@gmail.com> writes: > I found that pg_xlogdump code for XLOG_GIN_INSERT record with > GIN_INSERT_ISLEAF flag has the same issue, i.e., > "unknown action 0" error is thrown for that record. > The latest patch fixes this. Hmm, comparing gin_desc() to ginRedoInsert() makes me think there are more problems there than that one. Aren't the references to "payload" wrong in all three branches of that "if" construct, not just the middle one? I'm inclined to suspect we should restructure this more like { ginxlogInsert *xlrec = (ginxlogInsert *) rec; - char *payload = rec + sizeof(ginxlogInsert); appendStringInfo(buf, "isdata: %c isleaf: %c", (xlrec->flags & GIN_INSERT_ISDATA)? 'T' : 'F', (xlrec->flags & GIN_INSERT_ISLEAF) ? 'T' : 'F'); if (!(xlrec->flags & GIN_INSERT_ISLEAF)) { + char *payload = rec + sizeof(ginxlogInsert); BlockNumber leftChildBlkno; BlockNumber rightChildBlkno; leftChildBlkno = BlockIdGetBlockNumber((BlockId) payload); payload += sizeof(BlockIdData); rightChildBlkno = BlockIdGetBlockNumber((BlockId) payload); payload+= sizeof(BlockNumber); appendStringInfo(buf, " children: %u/%u", leftChildBlkno, rightChildBlkno); } + if (XLogRecHasBlockImage(record, 0)) + appendStringInfoString(buf, " (full page image)"); + else + { + char *payload = XLogRecGetBlockData(record, 0, NULL); + if (!(xlrec->flags & GIN_INSERT_ISDATA)) appendStringInfo(buf, " isdelete: %c", (((ginxlogInsertEntry *) payload)->isDelete) ? 'T' : 'F'); ... etc etc ... regards, tom lane
On Wed, Aug 31, 2016 at 12:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Fujii Masao <masao.fujii@gmail.com> writes: >> I found that pg_xlogdump code for XLOG_GIN_INSERT record with >> GIN_INSERT_ISLEAF flag has the same issue, i.e., >> "unknown action 0" error is thrown for that record. >> The latest patch fixes this. > > Hmm, comparing gin_desc() to ginRedoInsert() makes me think there are more > problems there than that one. Aren't the references to "payload" wrong > in all three branches of that "if" construct, not just the middle one? > I'm inclined to suspect we should restructure this more like > > { > ginxlogInsert *xlrec = (ginxlogInsert *) rec; > - char *payload = rec + sizeof(ginxlogInsert); > > appendStringInfo(buf, "isdata: %c isleaf: %c", > (xlrec->flags & GIN_INSERT_ISDATA) ? 'T' : 'F', > (xlrec->flags & GIN_INSERT_ISLEAF) ? 'T' : 'F'); > if (!(xlrec->flags & GIN_INSERT_ISLEAF)) > { > + char *payload = rec + sizeof(ginxlogInsert); > BlockNumber leftChildBlkno; > BlockNumber rightChildBlkno; > > leftChildBlkno = BlockIdGetBlockNumber((BlockId) payload); > payload += sizeof(BlockIdData); > rightChildBlkno = BlockIdGetBlockNumber((BlockId) payload); > payload += sizeof(BlockNumber); > appendStringInfo(buf, " children: %u/%u", > leftChildBlkno, rightChildBlkno); > } > + if (XLogRecHasBlockImage(record, 0)) > + appendStringInfoString(buf, " (full page image)"); > + else > + { > + char *payload = XLogRecGetBlockData(record, 0, NULL); > + > if (!(xlrec->flags & GIN_INSERT_ISDATA)) > appendStringInfo(buf, " isdelete: %c", > (((ginxlogInsertEntry *) payload)->isDelete) ? 'T' : 'F'); > ... etc etc ... If we do this, the extra information like ginxlogInsertEntry->isDelete will never be reported when the record has FPW. This is OK in terms of REDO because REDO logic just restores FPW and doesn't see isDelete in that case. However if gin_desc() was initially implemented to dump any information contained in WAL record as much as possible even when it had FPW, we should not do such restructure. Not sure the initial intention for that. For the purpose of debugging WAL generation code, I think that it's better to dump even information that REDO logic doesn't use. BTW, whether the record has FPW or not is reported in other places like XLogDumpDisplayRecord() and xlog_outrec(). So we can check whether FPW is contained or not, from the output of pg_xlogdump, even if gin_desc doesn't report "(full page image)". Regards, -- Fujii Masao
Fujii Masao <masao.fujii@gmail.com> writes: > On Wed, Aug 31, 2016 at 12:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Hmm, comparing gin_desc() to ginRedoInsert() makes me think there are more >> problems there than that one. Aren't the references to "payload" wrong >> in all three branches of that "if" construct, not just the middle one? > If we do this, the extra information like ginxlogInsertEntry->isDelete will > never be reported when the record has FPW. I'm happy to have it print whatever is there, but are you sure that the information is even there? I thought that this chunk of the WAL record would get optimized away if we write an FPW image instead. regards, tom lane
On Wed, Aug 31, 2016 at 8:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Fujii Masao <masao.fujii@gmail.com> writes: >> On Wed, Aug 31, 2016 at 12:10 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Hmm, comparing gin_desc() to ginRedoInsert() makes me think there are more >>> problems there than that one. Aren't the references to "payload" wrong >>> in all three branches of that "if" construct, not just the middle one? > >> If we do this, the extra information like ginxlogInsertEntry->isDelete will >> never be reported when the record has FPW. > > I'm happy to have it print whatever is there, but are you sure that the > information is even there? I thought that this chunk of the WAL record > would get optimized away if we write an FPW image instead. I was thinking that optimization logic was changed for logical decoding. This understanding is right for heap, but not right for GIN, i.e., you're right, such data chunk for GIN would be removed from WAL record if FPW is taken. I applied your suggested changes into the patch. Patch attached. Regards, -- Fujii Masao
Attachment
Fujii Masao <masao.fujii@gmail.com> writes: > I applied your suggested changes into the patch. Patch attached. That looks pretty sane to me (but I just eyeballed it, didn't test). One further minor improvement would be to rearrange the XLOG_GIN_VACUUM_DATA_LEAF_PAGE case so that we don't bother calling XLogRecGetBlockData() if there's a full-page image. regards, tom lane
On Thu, Sep 1, 2016 at 10:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Fujii Masao <masao.fujii@gmail.com> writes: >> I applied your suggested changes into the patch. Patch attached. > > That looks pretty sane to me (but I just eyeballed it, didn't test). > > One further minor improvement would be to rearrange the > XLOG_GIN_VACUUM_DATA_LEAF_PAGE case so that we don't bother calling > XLogRecGetBlockData() if there's a full-page image. Pushed the patch with this improvement. Thanks! Regards, -- Fujii Masao