Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>> My first thought is that the cycle_ctr just adds extra complexity. The
>> canceled-flag really is the key in Takahiro-san's patch, so we don't
>> need the cycle_ctr anymore.
>
> We don't have to have it in the sense of the code not working without it,
> but it probably pays for itself by eliminating useless fsyncs. The
> overhead for it in my proposed implementation is darn near zero in the
> non-error case. Also, Takahiro-san mentioned at one point that he was
> concerned to avoid useless fsyncs because of some property of the LDC
> patch --- I wasn't too clear on what, but maybe he can explain.
Ok. Perhaps we should not use the canceled-flag but just remove the
entry from pendingOpsTable like we used to when mdsync_in_progress isn't
set. We might otherwise accumulate a lot of canceled entries in the hash
table if checkpoint interval is long and relations are created and
dropped as part of normal operation.
I think there's one little bug in the patch:
1. AbsorbFsyncRequests is called. A FORGET message is received, and an
entry in the hash table is marked as canceled
2. Another relation with the same relfilenode is created. This can
happen after OID wrap-around
3. RememberFsyncRequest is called for the new relation. The old entry is
still in the hash table, marked with the canceled-flag, so it's not touched.
The fsync request for the new relation is masked by the old canceled
entry. The trivial fix is to always clear the flag on step 3:
--- md.c 2007-04-11 08:18:08.000000000 +0100
+++ md.c.new 2007-04-12 09:21:00.000000000 +0100
@@ -1161,9 +1161,9 @@
&found);
if (!found) /* new entry,
so initialize it */
{
- entry->canceled = false;
entry->cycle_ctr = mdsync_cycle_ctr;
}
+ entry->canceled = false;
/*
* NB: it's intentional that we don't change cycle_ctr
if the entry
* already exists. The fsync request must be treated
as old, even
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com