Hi,
On 2020-10-28 19:09:14 -0700, Andres Freund wrote:
> On 2020-10-28 18:13:44 -0700, Andres Freund wrote:
> > Just pushed this. Let's see what the BF says...
>
> It says that apparently something is unstable about my new test. It
> first passed on a few animals, but then failed a lot in a row. Looking.
The differentiating factor is force_parallel_mode=regress.
Ugh, this is nasty: The problem is that we can end up computing the
horizons the first time before MyDatabaseId is even set. Which leads us
to compute a too aggressive horizon for plain tables, because we skip
over every backend that is connected to a database, as MyDatabaseId is
still InvalidOid:
    /*
     * Normally queries in other databases are ignored for anything but
     * the shared horizon. But in recovery we cannot compute an accurate
     * per-database horizon as all xids are managed via the
     * KnownAssignedXids machinery.
     */
    if (in_recovery ||
        proc->databaseId == MyDatabaseId ||
        proc->databaseId == 0)    /* always include WalSender */
        h->data_oldest_nonremovable =
            TransactionIdOlder(h->data_oldest_nonremovable, xmin);
That then subsequently leads us to consider a row fully dead in
heap_hot_search_buffer(), which triggers the killtuples logic and
causes the test to fail.
With force_parallel_mode=regress we constantly start parallel workers,
which makes it much more likely that this case is hit.
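To make the effect concrete, here is a minimal standalone sketch of the
filter above. It is not the procarray.c code: SketchProc,
sketch_xid_older() and sketch_data_horizon() are made-up names, the
types are simplified stand-ins, and xid wraparound is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the PostgreSQL types; not the real definitions. */
typedef uint32_t Oid;
typedef uint32_t TransactionId;
#define InvalidOid ((Oid) 0)

/* Hypothetical, stripped-down view of one backend's state. */
typedef struct SketchProc
{
    Oid           databaseId;   /* 0 for e.g. a walsender */
    TransactionId xmin;         /* oldest xid this backend still needs */
} SketchProc;

/* Plain minimum; unlike the real TransactionIdOlder() this ignores wraparound. */
TransactionId
sketch_xid_older(TransactionId a, TransactionId b)
{
    return (a < b) ? a : b;
}

/*
 * Applies the same condition as the quoted snippet: only backends in "our"
 * database (or with databaseId == 0) hold back the data horizon, unless we
 * are in recovery. "start" is the value the horizon has before any backend
 * is considered.
 */
TransactionId
sketch_data_horizon(const SketchProc *procs, size_t nprocs,
                    Oid my_database_id, bool in_recovery,
                    TransactionId start)
{
    TransactionId horizon = start;

    assert(procs != NULL || nprocs == 0);

    for (size_t i = 0; i < nprocs; i++)
    {
        if (in_recovery ||
            procs[i].databaseId == my_database_id ||
            procs[i].databaseId == 0)
            horizon = sketch_xid_older(horizon, procs[i].xmin);
    }

    return horizon;
}
```

With one backend in database 16384 holding xmin 100 (the OID and xids are
arbitrary example values), passing my_database_id = 16384 pulls the
horizon back to 100. Passing InvalidOid instead skips that backend, so
the horizon stays at the starting value and rows that backend still
needs look removable.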
It's trivial to fix luckily...
Greetings,
Andres Freund