Re: FSM Corruption (was: Could not read block at end of the relation) - Mailing list pgsql-bugs
| From | Ronan Dunklau |
|---|---|
| Subject | Re: FSM Corruption (was: Could not read block at end of the relation) |
| Date | |
| Msg-id | 5784272.DvuYhMxLoT@aivenlaptop |
| In response to | Re: FSM Corruption (was: Could not read block at end of the relation) (Noah Misch <noah@leadboat.com>) |
| Responses | Re: FSM Corruption (was: Could not read block at end of the relation) |
| List | pgsql-bugs |
On Wednesday, March 13, 2024, at 17:55:23 CET, Noah Misch wrote:
> On Wed, Mar 13, 2024 at 10:37:21AM +0100, Ronan Dunklau wrote:
> > On Tuesday, March 12, 2024, at 23:08:07 CET, Noah Misch wrote:
> > > > I would argue this is fine, as corruptions don't happen often, and
> > > > FSM is not an exact thing anyway. If we record a block as full, it
> > > > will just be unused until the next vacuum adds it back to the FSM
> > > > with a correct value.
> > >
> > > If this happened only under FSM corruption, it would be fine. I was
> > > thinking about normal extension, with an event sequence like this:
> > >
> > > - RelationExtensionLockWaiterCount() returns 10.
> > > - Backend B1 extends by 10 pages.
> > > - B1 records 9 pages in the FSM.
> > > - B2, B3, ... B11 wake up and fetch from the FSM. In each of those
> > >   backends, fsm_does_block_exist() returns false, because the cached
> > >   relation size is stale. Those pages remain unused until VACUUM
> > >   re-records them in the FSM.
> > >
> > > PostgreSQL added the multiple-block extension logic (commit 719c84c)
> > > from hard-won experience bottlenecking there. If fixing $SUBJECT will
> > > lose 1% of that change's benefit, that's probably okay. If it loses
> > > 20%, we should work to do better. While we could rerun the 2016
> > > benchmark to see how the patch regresses it, there might be a simple
> > > fix. fsm_does_block_exist() could skip the cache when
> > > blknumber >= cached, like this:
> > >
> > >     return ((BlockNumberIsValid(smgr->smgr_cached_nblocks[MAIN_FORKNUM]) &&
> > >              blknumber < smgr->smgr_cached_nblocks[MAIN_FORKNUM]) ||
> > >             blknumber < RelationGetNumberOfBlocks(rel));
> > >
> > > How do you see it? That still leaves us with an avoidable lseek in
> > > B2, B3, ... B11, but that feels tolerable. I don't know how to do
> > > better without switching to the ReadBufferMode approach.
> >
> > That makes sense. I'm a bit worried about triggering additional lseeks
> > when we in fact have other free pages in the map that would pass the
> > test. I'm not sure how relevant it is given the way we search the FSM
> > with fp_next_slot, though...
>
> That's a reasonable thing to worry about. We could do wrong by trying
> too hard to use an FSM slot, and we could do wrong by not trying hard
> enough.
>
> > To address that, I've given a bit of thought to enabling / disabling
> > the auto-repair behaviour with a flag in GetPageWithFreeSpace, to
> > distinguish the cases where we know we have a somewhat up-to-date
> > value from the cases where we don't (as in, for heap: try without
> > repair, then get an up-to-date value to try the last block, and if we
> > need to look at the FSM again, ask for it to be repaired), but it
> > brings way too much complexity and would need careful thought for
> > each AM.
> >
> > So please find attached a patch with the change you propose.
> >
> > Do you have a link to the benchmark you mention, though, so I can
> > evaluate the patch against it?
>
> Here is one from the thread that created commit 719c84c:
> https://www.postgresql.org/message-id/flat/CA%2BTgmob7xED4AhoqLspSOF0wCMYEomgHfuVdzNJnwWVoE_c60g%40mail.gmail.com
>
> There may be other benchmarks earlier in that thread.

Thank you. I tried to run that benchmark, with 4 clients all copying 10M
pgbench_accounts rows concurrently. With and without the patch, I ran the
benchmark 5 times and took the average.

master: 15.05s
patched: 15.24s (+1.25%)

If I remove the best and worst run for each of those, the difference falls
to +0.7%.