Re: FSM Corruption (was: Could not read block at end of the relation) - Mailing list pgsql-bugs

From Ronan Dunklau
Subject Re: FSM Corruption (was: Could not read block at end of the relation)
Msg-id 1958756.PYKUYFuaPT@aivenlaptop
In response to Re: FSM Corruption (was: Could not read block at end of the relation)  (Ronan Dunklau <ronan.dunklau@aiven.io>)
Responses Re: FSM Corruption (was: Could not read block at end of the relation)  (Patrick Stählin <patrick.staehlin@aiven.io>)
Re: FSM Corruption (was: Could not read block at end of the relation)  (Noah Misch <noah@leadboat.com>)
List pgsql-bugs
On Tuesday, 5 March 2024, 00:05:03 CET, Noah Misch wrote:
> I would guess this one is more risky from a performance perspective, since
> we'd be adding to a hotter path under RelationGetBufferForTuple().  Still,
> it's likely fine.

I ended up implementing this in the attached patch. The idea is to detect when
the FSM returns a page past the end of the relation, and to ignore it.
In that case we fall back to the extension mechanism.

For the corrupted-FSM case this is not great performance-wise, as we will extend
the relation in small steps every time we find a nonexistent block in the FSM,
until the actual relation size matches what is recorded in the FSM. But since
such corruption seldom happens, I figured it was better to keep the code really
simple for a bugfix.
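
To make the behaviour described above concrete, here is a toy model in C. It is
not the attached patch and does not use PostgreSQL's real FSM or buffer manager
APIs: ToyRelation, toy_check_fsm_block, toy_extend and toy_count_stale_hits are
hypothetical names, and the model assumes the corrupted FSM keeps advertising
the last block it has on record.

```c
#include <stdint.h>

typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Toy relation: only tracks how many blocks physically exist. */
typedef struct ToyRelation
{
    BlockNumber nblocks;
} ToyRelation;

/*
 * Hypothetical check: if the FSM hands back a block at or past the end of
 * the relation, discard it so the caller falls back to extension.
 */
static BlockNumber
toy_check_fsm_block(const ToyRelation *rel, BlockNumber fsm_block)
{
    if (fsm_block != InvalidBlockNumber && fsm_block >= rel->nblocks)
        return InvalidBlockNumber;
    return fsm_block;
}

/* Hypothetical extension path: add one block and return its number. */
static BlockNumber
toy_extend(ToyRelation *rel)
{
    return rel->nblocks++;
}

/*
 * Count how many FSM answers are ignored before the real relation size
 * catches up with the size the (assumed corrupted) FSM has on record.
 */
static int
toy_count_stale_hits(ToyRelation *rel, BlockNumber fsm_recorded_nblocks)
{
    int stale_hits = 0;

    while (rel->nblocks < fsm_recorded_nblocks)
    {
        /* Assumption: the FSM keeps pointing at its last recorded block. */
        BlockNumber candidate = fsm_recorded_nblocks - 1;

        if (toy_check_fsm_block(rel, candidate) == InvalidBlockNumber)
        {
            stale_hits++;
            toy_extend(rel);    /* small step: one block per stale answer */
        }
    }
    return stale_hits;
}
```

In this model, a relation of 4 blocks with an FSM that believes there are 10
ignores 6 FSM answers, one per single-block extension, before the sizes agree.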

I wanted to test the performance impact, so I looked for the worst possible
case for a pgbench run doing insertions into a table. With the attached patch,
the worst case I could come up with is an insertion where we:
  - remember which page we last inserted into
  - notice we don't have enough space
  - ask the FSM for a block
  - now have to compare that to the actual relation size

So I came up with the following initialization steps:

 - create a table with vacuum_truncate = off, with a tuple size big enough that
   it's impossible to fit two tuples on the same page
 - insert lots of tuples into it until it reaches a decent size
 - delete them all
 - vacuum
 - make sure all of this fits in shared_buffers

As in:

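-- One 5000-character column, so two rows never fit on a default 8 kB page.
CREATE TABLE test_perf (c1 char(5000));
-- PLAIN storage prevents compression and out-of-line (TOAST) storage.
ALTER TABLE test_perf ALTER c1 SET STORAGE PLAIN;
-- Keep the empty pages at the end of the relation after VACUUM.
ALTER TABLE test_perf SET (VACUUM_TRUNCATE = off);
INSERT INTO test_perf (c1) SELECT 'c' FROM generate_series(1, 1000000);
DELETE FROM test_perf;
-- Record the now-empty pages in the FSM.
VACUUM test_perf;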
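
As a rough sanity check on this setup, the arithmetic can be sketched in C. This
is only an estimate under stated assumptions: the default 8192-byte block size,
a 24-byte page header, a 4-byte line pointer, a 23-byte heap tuple header
aligned to 24 bytes, a 4-byte varlena header, 8-byte maximum alignment, and a
single-byte encoding so that char(5000) holds 5000 bytes. All TOY_ and toy_
names are hypothetical.

```c
#include <stddef.h>

#define TOY_BLCKSZ          8192    /* assumed default block size */
#define TOY_PAGE_HEADER     24      /* assumed page header size */
#define TOY_LINE_POINTER    4       /* assumed per-tuple line pointer */
#define TOY_TUPLE_HEADER    24      /* 23-byte heap tuple header, aligned */
#define TOY_VARLENA_HEADER  4       /* assumed long varlena header */
#define TOY_MAXALIGN(x)     (((x) + 7) & ~(size_t) 7)

/* Estimated on-page size of a tuple with one column of `payload` bytes. */
static size_t
toy_tuple_size(size_t payload)
{
    return TOY_MAXALIGN(TOY_TUPLE_HEADER + TOY_VARLENA_HEADER + payload);
}

/* Estimated number of such tuples that fit on one page. */
static size_t
toy_tuples_per_page(size_t payload)
{
    size_t usable = TOY_BLCKSZ - TOY_PAGE_HEADER;

    return usable / (toy_tuple_size(payload) + TOY_LINE_POINTER);
}

/* Size in bytes of a table made of `pages` blocks. */
static unsigned long long
toy_table_bytes(unsigned long long pages)
{
    return pages * TOY_BLCKSZ;
}
```

Under these assumptions each tuple takes about 5032 bytes, so only one fits per
page, and 1000000 rows occupy 1000000 pages, or 8192000000 bytes (about 7.6 GiB),
which is what shared_buffers would have to hold.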

Then I ran pgbench with a single client, with a script that only inserts the
same value over and over again, for 1000000 transactions (the initial table size).

I noticed no difference running with or without the patch, but maybe someone
else can try to run that or find another adversarial case?

Best regards,

--
Ronan Dunklau
