On 3/27/15 5:15 AM, Vladimir Borodin wrote:
> Hi all.
>
> I have described [0] a problem with delaying replicas after vacuuming a
> relation with big btree index. It stucks in replaying WAL record of
> type XLOG_BTREE_VACUUM like that (with lastBlockVacuumed 0):
>
> rmgr: Btree len (rec/tot): 20/ 52, tx: 0, lsn:
> 4115/56126DC0, prev 4115/56126D90, bkp: 0000, desc: vacuum: rel
> 1663/16420/16796; blk 31222118, lastBlockVacuumed 0
>
> Master writes this record to xlog in btvacuumscan [1] function after
> vacuuming of all index pages. And in case of no pages with deleted items
> xlog record would contain lastBlockVacuumed 0.
>
> In btree_xlog_vacuum [2] replica reads all blocks from lastBlockVacuumed
> to last block of the index while applying this record because there is
> no api in the buffer manager to understand if the page is unpinned.
>
> So if the index is quite big (200+ GB in described case) it takes much
> time to do it. So the questions are:
>
> 1. Aren’t there still any api in buffer manager to understand that the
> page is not in shared_buffers without reading it?
I don't know offhand, but since XLogReadBufferExtended already has a
mode argument it wouldn't be too hard to add it there.
> 2. Is it possible not to write to xlog record with lastBlockVacuumed 0
> in some cases? For example, in case of not deleting any pages.
Possibly, but that's much higher risk. Without studying it, if we wanted
to mess around with that it might actually make more sense to XLOG a set
of blkno's that got vacuumed, but I suspect that wouldn't be a win.
> Or maybe there are some better ways of improving this situation?
>
> [0]
> http://www.postgresql.org/message-id/FE82A9A7-0D52-41B5-A9ED-967F6927CB8A@simply.name
> [1]
>
http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/nbtree/nbtree.c;hb=refs/heads/REL9_4_STABLE#l813
> [2]
>
http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/nbtree/nbtxlog.c;hb=refs/heads/REL9_4_STABLE#l482
>
> --
> May the force be with you…
> https://simply.name
>
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com