Re: Error with index on unlogged table - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Error with index on unlogged table
Date
Msg-id CAB7nPqT=7R8F1W-ssAoYdkJkdL-QxGSTL7P3Sf4WVKOENyUBvw@mail.gmail.com
Whole thread Raw
In response to Re: Error with index on unlogged table  (Andres Freund <andres@anarazel.de>)
Responses Re: Error with index on unlogged table
Re: Error with index on unlogged table
List pgsql-hackers
On Fri, Dec 11, 2015 at 4:27 AM, Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> On 2015-12-10 18:36:32 +0100, Andres Freund wrote:
>> On 2015-12-10 12:19:12 -0500, Robert Haas wrote:
>> > > The real problem there imo isn't that the copy_relation_data() doesn't
>> > > deal with 0 block tables, but that ATExecSetTableSpace() doesn't have a
>> > > unlogged table specific codepath like heap_create_with_catalog() has.
>> >
>> > It looks to me like somewhere we need to do log_smgrcreate(...,
>> > INIT_FORKNUM) in the unlogged table case.
>>
>> Yes.
>>
>> > RelationCreateStorage()
>> > skips this for the main forknum of an unlogged table, which seems OK,
>> > but there's nothing that even attempts it for the init fork, which
>> > does not seem OK.
>>
>> We unfortunately can't trivially delegate that work to
>> RelationCreateStorage(). E.g. heap_create() documents that only the main
>> fork is created :(
>>
>> > I guess that logic should happen in
>> > ATExecSetTableSpace just after smgrcreate(dstrel, forkNum, false).
>>
>> Looks like it's the easiest place.
>
>> > > A second problem is that the smgrimmedsync() in copy_relation_data()
>> > > isn't called for the init fork of unlogged relations, even if it needs
>> > > to.
>
> Here's a patch doing that. It's not yet fully polished, but I wanted to
> get it out, because I noticed one thing:
>
> In ATExecSetTableSpace(), for !main forks, we currently call
> smgrcreate(), but not log_smgrcreate(). Even for PERSISTENT
> relations. That seems a bit odd to me. It currently seems to be without
> further consequence because, if there's actual data in the fork, we'll
> just create the relation in _mdfd_getseg(); or we can cope with the
> relation not being there.  But to me that feels wrong.
>
> It seems better to do the log_smgrcreate() for RELPERSISTENCE_PERMANENT,
> not just INIT_FORKNUM. What do you guys think?

This fixes the problem in my environment.

+                       if (rel->rd_rel->relpersistence ==
RELPERSISTENCE_PERMANENT ||
+                               (rel->rd_rel->relpersistence ==
RELPERSISTENCE_UNLOGGED &&
+                                forkNum == INIT_FORKNUM))
+                               log_smgrcreate(&newrnode, forkNum);
There should be a XLogIsNeeded() check as well. Removing the check on
RELPERSISTENCE_UNLOGGED is fine as well... Not mandatory though :)

+        * The init fork for an unlogged relation in many respects has to be
+        * treated the same as normal relation, changes need to be WAL
logged and
+        * it needs to be synced to disk.
+        */
+       copying_initfork = relpersistence == RELPERSISTENCE_UNLOGGED &&
+               forkNum == INIT_FORKNUM;
Here as well just a check on INIT_FORKNUM would be fine.

>> It sounds worthwhile to check that other locations rewriting tables,
>> e.g. cluster/vacuum full/reindex are safe.
>
> Seems to be ok, on a first glance.

Yeah. REINDEX relies on index_build to recreate what it should... The
others are looking fine as well. I have tested it in case and the
files produced are consistent on standby and its master.
-- 
Michael



pgsql-hackers by date:

Previous
From: Dean Rasheed
Date:
Subject: Re: Inaccurate results from numeric ln(), log(), exp() and pow()
Next
From: Michael Paquier
Date:
Subject: Re: Making tab-complete.c easier to maintain