Re: Fix mdsync never-ending loop problem - Mailing list pgsql-patches

From Heikki Linnakangas
Subject Re: Fix mdsync never-ending loop problem
Date
Msg-id 46152A60.1070404@enterprisedb.com
Whole thread Raw
In response to Re: Fix mdsync never-ending loop problem  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Fix mdsync never-ending loop problem
List pgsql-patches
Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>> Tom Lane wrote:
>>> But I dislike copying the table entries anyway, see comment on -hackers.
>
>> Frankly the cycle id idea sounds more ugly and fragile to me. You'll
>> need to do multiple scans of the hash table that way, starting from top
>> every time you call AbsorbFsyncRequests (like we do know).
>
> How so?  You just ignore entries whose cycleid is too large.  You'd have
> to be careful about wraparound in the comparisons, but that's not hard
> to deal with.  Also, AFAICS you still have the retry problem (and an
> even bigger memory leak problem) with this coding --- the "to-do list"
> doesn't eliminate the issue of correct handling of a failure.

You have to start the hash_seq_search from scratch after each call to
AbsorbFsyncRequests because it can remove entries, including the one the
scan is stopped on.

I think the failure handling is correct in the "to-do list" approach,
when an entry is read from the list, it's checked that the entry hasn't
been removed from the hash table. Actually there was a bug in the
original LDC patch in the failure handling: it replaced the per-entry
failures-counter with a local retry_counter variable, but it wasn't
cleared after a successful write which would lead to bogus ERRORs when
multiple relations are dropped during the mdsync. I kept the original
per-entry counter, though the local variable approach could be made to work.

The memory leak obviously needs to be fixed, but that's just a matter of
adding a pfree.

>> Ok, will do that. Or would you like to just take over from here?
>
> No, I'm up to my ears in varlena.  You're the one in a position to test
> this, anyway.

Ok.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

pgsql-patches by date:

Previous
From: Tom Lane
Date:
Subject: Re: Fix mdsync never-ending loop problem
Next
From: Tom Lane
Date:
Subject: Re: Fix mdsync never-ending loop problem