Re: [HACKERS] hash index on unlogged tables doesn't behave as expected - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: [HACKERS] hash index on unlogged tables doesn't behave as expected
Date
Msg-id CAB7nPqR-2GMDeEEVa7Sh5xNsOZ_SjnsWkxYLfwqBVhqnLO6v=A@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] hash index on unlogged tables doesn't behave as expected  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [HACKERS] hash index on unlogged tables doesn't behave as expected
List pgsql-hackers
On Mon, Jul 3, 2017 at 7:40 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> There is already such a test, see create_index.sql.
> CREATE UNLOGGED TABLE unlogged_hash_table (id int4);
> CREATE INDEX unlogged_hash_index ON unlogged_hash_table USING hash (id
> int4_ops);
>
> Do you have something else in mind?
>
> I think the problem mentioned in this thread can be caught only if we
> promote the standby and restart it.  We might be able to write it
> using TAP framework, but I think such a test should exist for other
> index types as well or in general for unlogged tables.  I am not
> opposed to writing such a test if you and or others feel so.

There have been many requests for something like that. This overlaps a
lot with the existing make check though, so a buildfarm module using
wal_consistency_checking may be a better idea than something in core.

>> Apart from that i have tested the patch and the patch seems to be working
>> fine.
>
> Thanks.

It seems to me that this qualifies as an open item for 10. WAL-logging
is new for hash tables. Amit, could you add one?

+   use_wal = RelationNeedsWAL(rel) ||
+             (rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
+              forkNum == INIT_FORKNUM);
In access AMs, empty() routines are always only called for unlogged
relations, so you could relax that to check for INIT_FORKNUM only.

Looking at the patch, I think that you are right to do the logging
directly in _hash_init() and not separately in hashbuildempty().
Compared to other relations the init data is more dynamic.

+       /*
+        * Force the on-disk state of init forks to always be in sync with the
+        * state in shared buffers.  See XLogReadBufferForRedoExtended.
+        */
+       XLogRecGetBlockTag(record, 0, NULL, &forknum, NULL);
+       if (forknum == INIT_FORKNUM)
+               FlushOneBuffer(metabuf);
I find those forced syncs a bit surprising. The buffer is already
marked as dirty, so even if there is a concurrent checkpoint they
would be flushed. If recovery comes again here, then they would just
be replayed. I am unclear why XLogReadBufferForRedoExtended is not
enough to replay a FPW of that.
-- 
Michael



pgsql-hackers by date:

Previous
From: Dean Rasheed
Date:
Subject: Re: [HACKERS] Multi column range partition table
Next
From: Etsuro Fujita
Date:
Subject: Re: [HACKERS] UPDATE of partition key