Dear Michael, Amit,
>
> Amit, this has been applied as of 861f86beea1c, and I got pinged about
> the fact this triggers inconsistencies because we always set the LSN
> of the write buffer (wbuf in _hash_freeovflpage) but
> XLogRegisterBuffer() would *not* be called when the two following
> conditions happen:
> - When xlrec.ntups <= 0.
> - When !xlrec.is_prim_bucket_same_wrt && !xlrec.is_prev_bucket_same_wrt
>
> And it seems to me that there is still a bug here: there should be no
> point in setting the LSN on the write buffer if we don't register it
> in WAL at all, no?
Thanks for pointing this out; I agree with you. Please see the attached patch
for diagnosing the issue.
This patch avoids the inconsistency caused by setting the LSN and outputs a debug
LOG when we hit such a case. I ran hash_index.sql and confirmed that the log was
output [1]. This means that the current test already has a workload that meets
the conditions below:
- the overflow page has no tuples (xlrec.ntups is 0),
- the page to be written (wbuf) is not the primary bucket page
  (xlrec.is_prim_bucket_same_wrt is false), and
- the page to be written is not the page previous to the overflow page
  (xlrec.is_prev_bucket_same_wrt is false)
So I think my patch (after removing the elog(...) part) can fix the issue. Thoughts?
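For illustration only (this is not the patch itself), the three conditions above
reduce to a single predicate. The sketch below is standalone C; SqueezeFlags and
wbuf_is_registered() are hypothetical names I made up for this mail, and only the
three field names come from xlrec.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the three xlrec fields discussed above.
 * This is not the real PostgreSQL struct, just enough to show the logic.
 */
typedef struct SqueezeFlags
{
    int  ntups;                    /* tuples moved to the write page */
    bool is_prim_bucket_same_wrt;  /* primary bucket page == write page */
    bool is_prev_bucket_same_wrt;  /* previous page == write page */
} SqueezeFlags;

/*
 * Returns true when wbuf would be registered in the WAL record, i.e. when
 * at least one of the three conditions listed above does not hold.
 */
static bool
wbuf_is_registered(const SqueezeFlags *xlrec)
{
    return xlrec->ntups > 0 ||
           xlrec->is_prim_bucket_same_wrt ||
           xlrec->is_prev_bucket_same_wrt;
}
```

With something like this, the LSN of wbuf would be set only when the predicate
returns true, assuming the registration logic is exactly as described upthread.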
[1]:
```
LOG: XXX: is_wbuf_registered: false
CONTEXT: while vacuuming index "hash_cleanup_index" of relation "public.hash_cleanup_heap"
STATEMENT: VACUUM hash_cleanup_heap;
```
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/