Hi all,
PostgreSQL currently maintains several data structures in the SLRU cache. The current SLRU pages do not have any header, so it is impossible to checksum a page and verify its integrity. It is very difficult to debug issues caused by corrupted SLRU pages. Also, without a page header, page LSN is tracked in an ad-hoc fashion using LSN groups, which requires additional data structure in the shared memory. At eBay, we are building on the patch shared by Rishu Bagga in [1], which adds the standard PageHeaderData to each SLRU page. We believe that adding the standard page header to each SLRU page is the correct approach for the long run. It adds a checksum to each SLRU page, tracks page LSN as if it is a standard page and eases future page enhancements.
The enclosed patch changes the address calculation logic for all 7 SLRUs in the following 6 files:
src/backend/access/transam/clog.c
src/backend/access/transam/commit_ts.c
src/backend/access/transam/multixact.c
src/backend/access/transam/subtrans.c
src/backend/commands/async.c
src/backend/storage/lmgr/predicate.c
The patch enables page checksum with changes to the following 2 files:
src/backend/access/transam/slru.c
src/bin/pg_checksums/pg_checksums.c
The patch removes the group LSNs defined for each SLRU cache. See changes to:
src/include/access/slru.h
The patch adds a few helper macros in the following files:
src/backend/storage/page/bufpage.c
src/include/storage/bufpage.h
The patch updates some test cases:
src/bin/pg_resetwal/t/001_basic.pl
src/test/modules/test_slru/test_slru.c
I am still working on patching the pg_upgrade. Just love to hear your thoughts on the idea and the current patch.
Discussed with: Anton Shyrabokau and Shawn Debnath
[1] https://www.postgresql.org/message-id/flat/EFAAC0BE-27E9-4186-B925-79B7C696D5AC%40amazon.com
Regards,
Yong