Thread: Drop database command will raise "wrong tuple length" if pg_database tuple contains toast attribute.
Drop database command will raise "wrong tuple length" if pg_database tuple contains toast attribute.
From
Ayush Tiwari
Date:
Hi hackers,
We encountered an issue lately, that if the database grants too many roles `datacl` is toasted, following which, the drop database command will fail with error "wrong tuple length".
To reproduce the issue, please follow below steps:
CREATE DATABASE test;
-- create helper function
CREATE OR REPLACE FUNCTION data_tuple() returns text as $body$
declare
mycounter int;
begin
for mycounter in select i from generate_series(1,2000) i loop
execute 'CREATE ROLE aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbb ' || mycounter;
execute 'GRANT ALL ON DATABASE test to aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbb ' || mycounter;
end loop;
return 'ok';
end;
$body$ language plpgsql volatile strict;
-- create roles and grant on the database.
SELECT data_tuple();
-- drop database command, this will result in "wrong tuple length" error.
DROP DATABASE test;
The root cause of this behaviour is that the HeapTuple in dropdb function fetches a copy of pg_database tuple from system cache.
But the system cache flattens any toast attributes, which cause the length check to fail in heap_inplace_update.
A patch for this issue is attached to the mail, the solution is to change the logic to fetch the tuple by directly scanning pg_database rather than using the catcache.
Regards,
Ayush
We encountered an issue lately, that if the database grants too many roles `datacl` is toasted, following which, the drop database command will fail with error "wrong tuple length".
To reproduce the issue, please follow below steps:
CREATE DATABASE test;
-- create helper function
CREATE OR REPLACE FUNCTION data_tuple() returns text as $body$
declare
mycounter int;
begin
for mycounter in select i from generate_series(1,2000) i loop
execute 'CREATE ROLE aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbb ' || mycounter;
execute 'GRANT ALL ON DATABASE test to aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbb ' || mycounter;
end loop;
return 'ok';
end;
$body$ language plpgsql volatile strict;
-- create roles and grant on the database.
SELECT data_tuple();
-- drop database command, this will result in "wrong tuple length" error.
DROP DATABASE test;
The root cause of this behaviour is that the HeapTuple in dropdb function fetches a copy of pg_database tuple from system cache.
But the system cache flattens any toast attributes, which cause the length check to fail in heap_inplace_update.
A patch for this issue is attached to the mail, the solution is to change the logic to fetch the tuple by directly scanning pg_database rather than using the catcache.
Regards,
Ayush
Attachment
Re: Drop database command will raise "wrong tuple length" if pg_database tuple contains toast attribute.
From
Tomas Vondra
Date:
Hi Ayush, On 8/13/24 07:37, Ayush Tiwari wrote: > Hi hackers, > > We encountered an issue lately, that if the database grants too many > roles `datacl` is toasted, following which, the drop database command > will fail with error "wrong tuple length". > > To reproduce the issue, please follow below steps: > > CREATE DATABASE test; > > -- create helper function > CREATE OR REPLACE FUNCTION data_tuple() returns text as $body$ > declare > mycounter int; > begin > for mycounter in select i from generate_series(1,2000) i loop > execute 'CREATE > ROLE aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbb ' || mycounter; > execute 'GRANT ALL ON DATABASE test to > aaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbb ' || mycounter; > end loop; > return 'ok'; > end; > $body$ language plpgsql volatile strict; > > -- create roles and grant on the database. > SELECT data_tuple(); > > -- drop database command, this will result in "wrong tuple length" error. > DROP DATABASE test; > > The root cause of this behaviour is that the HeapTuple in dropdb > function fetches a copy of pg_database tuple from system cache. > But the system cache flattens any toast attributes, which cause the > length check to fail in heap_inplace_update. > > A patch for this issue is attached to the mail, the solution is to > change the logic to fetch the tuple by directly scanning pg_database > rather than using the catcache. > Thanks for the report. I can reproduce the issue following your instructions, and the fix seems reasonable ... But there's also one thing I don't quite understand. I did look for other places that might have a similar issue, that is places that 1) lookup tuple using SearchSysCacheCopy1 2) call on the tuple heap_inplace_update And I found about four places doing that: - index_update_stats (src/backend/catalog/index.c) - create_toast_table (src/backend/catalog/toasting.c) - vac_update_relstats / vac_update_datfrozenxid (commands/vacuum.c) But I haven't managed to trigger the same kind of failure for any of those places, despite trying. AFAIK that's because those places update pg_class, and that doesn't have TOAST, so the tuple length can't change. So this fix seems reasonable. -- Tomas Vondra
Re: Drop database command will raise "wrong tuple length" if pg_database tuple contains toast attribute.
From
Tomas Vondra
Date:
On 8/16/24 13:26, Tomas Vondra wrote: > Hi Ayush, > > ... > > So this fix seems reasonable. > I've pushed this to all affected branches, except for 11 which is EOL. I thought about adding a test, but I couldn't think of a TAP test where this would really fit, and it didn't seem very practical to have a test creating hundreds of roles. So I abandoned the idea. Thanks for the report and the fix! -- Tomas Vondra
Re: Drop database command will raise "wrong tuple length" if pg_database tuple contains toast attribute.
From
Tomas Vondra
Date:
On 8/19/24 11:01, Yugo Nagata wrote: > On Mon, 19 Aug 2024 00:35:39 +0200 > Tomas Vondra <tomas@vondra.me> wrote: > >> On 8/16/24 13:26, Tomas Vondra wrote: >>> Hi Ayush, >>> >>> ... >>> >>> So this fix seems reasonable. >>> >> >> I've pushed this to all affected branches, except for 11 which is EOL. >> >> I thought about adding a test, but I couldn't think of a TAP test where >> this would really fit, and it didn't seem very practical to have a test >> creating hundreds of roles. So I abandoned the idea. > > I tried to add Assert in heap_inplace_update to prevent possible similar > failures, but I gave up because I could not find a good way to determine if > a tuple is detoasted of not. > Right, not sure there's a good way to check for that. > By the way, I found a comment in vac_update_datfrozenxid() and EventTriggerOnLogin() > that explains why we could not use tuples from the syscache for heap_inplace_update. > I think it is better ad d the same comment in dropdb(). I attached a trivial patch for it. > Agreed. That seems like a nice improvement to the comment. regards -- Tomas Vondra
Re: Drop database command will raise "wrong tuple length" if pg_database tuple contains toast attribute.
From
Tomas Vondra
Date:
On 8/19/24 12:16, Tomas Vondra wrote: > On 8/19/24 11:01, Yugo Nagata wrote: > > ... > >> By the way, I found a comment in vac_update_datfrozenxid() and EventTriggerOnLogin() >> that explains why we could not use tuples from the syscache for heap_inplace_update. >> I think it is better ad d the same comment in dropdb(). I attached a trivial patch for it. >> > > Agreed. That seems like a nice improvement to the comment. > Done, thanks for the suggestion / patch. regards -- Tomas Vondra