On Mon, Mar 2, 2020 at 3:19 PM Victor Yegorov <vyegorov@gmail.com> wrote:
> This made me look at the stats, as this is a test copy of the DB upgraded to 12.2 recently. Per pg_stat_user_tables,
> table had never been vacuumed since upgrade, only analyzed.
> I've run vacuum manually and this made the issue go away.
>
> It is beyond my skills to try to find the real place for this issue now. I think we might be hitting some
uninitializedvalue somewhere, perhaps?..
The issue is that bt_metap() was declared using the wrong data types
-- simple as that. An int4 cannot represent 2180413846 -- only a
uint32 (or a TransactionId) can. Technically this is not a recent
issue, since we got btm_root wrong right from the start. However, it
seems more likely that the relatively recently added oldest_xact field
will cause problems in practice. The fact that a manual VACUUM made
that go away for you (at least temporarily) is not surprising.
The declaration itself is wrong, so it is the declaration itself that
we must fix. I can't really see a way of fixing it without introducing
a new version of contrib/pageinspect. :-(
I propose the attached fix for the master branch only. This is a draft
patch. The patch changes the data types to more appropriate, wider
integer types. It follows the convention of using int8/bigint where
the natural data type to use at the C code level is actually uint32 --
pg_stat_statements does this with queryId (actually, we switched over
to 64-bit hashes some time later, but it worked that way before commit
cff440d3686).
The patch takes another idea from pg_stat_statements: Using a
tupledesc's natts field to determine the extension version in use, to
maintain compatibility when there are breaking changes to a function's
definition. I simply error out when bt_metap() is called using the
definition from an older version of the extension, unlike
pg_stat_statements. It isn't worth going to the trouble of preserving
a set of behaviors that are more or less broken anyway. Better to
error out in an obvious way. Especially given that contrib/pageinspect
is fundamentally a superuser-only extension, with a relatively small
user base.
Theoretically, I should also change bt_page_items() (and several
functions) to make its blkno In parameter of type int8/bigint (instead
of int4/integer). I haven't bothered with that, though.
--
Peter Geoghegan