Hi hackers,
Recently, we encounter an issue that if the database grant too many roles that `datacl` toasted, vacuum freeze will fail with error "wrong tuple length".
To reproduce the issue, please follow below steps:
CREATE DATABASE vacuum_freeze_test;
-- create helper function
create or replace function toast_pg_database_datacl() returns text as $body$
declare
mycounter int;
begin
for mycounter in select i from generate_series(1, 2800) i loop
execute 'create role aaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb' || mycounter;
execute 'grant ALL on database vacuum_freeze_test to aaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb' || mycounter;
end loop;
return 'ok';
end;
$body$ language plpgsql volatile strict;
-- create roles and grant on the database
select toast_pg_database_datacl();
-- connect to the database
\c vacuum_freeze_test
-- chech the size of column datacl
select datname, pg_column_size(datacl) as datacl_size, age(datfrozenxid) from pg_database where datname='vacuum_freeze_test';
-- execute vacuum freeze and it should raise "wrong tuple length"
vacuum freeze;
The root cause is that vac_update_datfrozenxid fetch a copy of pg_database tuple from system cache.
But the system cache flatten any toast attributes, which cause the length chech failed in heap_inplace_update.
A path is attached co auther by Ashwin Agrawal, the solution is to fetch the pg_database tuple from disk instead of system cache if needed.