Re: [PERFORM] DELETE vs TRUNCATE explanation - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [PERFORM] DELETE vs TRUNCATE explanation
Date
Msg-id 20408.1342706966@sss.pgh.pa.us
In response to Re: [PERFORM] DELETE vs TRUNCATE explanation  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [PERFORM] DELETE vs TRUNCATE explanation  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> Seems a bit complex, but it might be worth it.  Keep in mind that I
> eventually want to be able to make an unlogged table logged or vice
> versa, which will probably entail unlinking just the init fork (for
> the logged -> unlogged direction).

Well, as far as that goes, I don't see a reason why you couldn't unlink
the init fork immediately on commit.  The checkpointer should not have
to be involved at all --- there's no reason to send it a FORGET FSYNC
request either, because there shouldn't be any outstanding writes
against an init fork, no?

But having said that, this does serve as an example that we might
someday want the flexibility to kill individual forks.  I was
intending to kill smgrdounlinkfork altogether, but I'll refrain.

> I think this is just over-engineered.  The originally complained-of
> problem was all about the inefficiency of manipulating the
> checkpointer's backend-private data structures, right?  I don't see
> any particular need to mess with the shared memory data structures at
> all.  If you wanted to add some de-duping logic to retail fsync
> requests, you could probably accomplish that more cheaply by having
> each such request look at the last half-dozen or so items in the queue
> and skip inserting the new request if any of them match the new
> request.  But I think that'd probably be a net loss, because it would
> mean holding the lock for longer.

What about checking just the immediately previous entry?  This would
at least fix the problem for bulk-load situations, and the cost ought
to be about negligible compared to acquiring the LWLock.
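
To make the "check just the immediately previous entry" idea concrete, here is a minimal sketch in C. It is not the actual checkpointer code: the FsyncRequest and RequestQueue types, the QUEUE_CAPACITY constant, and the enqueue_fsync_request function are all hypothetical names invented for illustration, and the caller is assumed to already hold whatever lock protects the queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request identity: which fork segment needs an fsync. */
typedef struct FsyncRequest {
    unsigned int rel_id;    /* stand-in for the relation's file identity */
    int          fork_num;  /* main fork, init fork, ... */
    unsigned int segno;     /* segment number within the fork */
} FsyncRequest;

#define QUEUE_CAPACITY 1024 /* assumed size, for illustration only */

typedef struct RequestQueue {
    size_t       count;
    FsyncRequest items[QUEUE_CAPACITY];
} RequestQueue;

static bool same_request(const FsyncRequest *a, const FsyncRequest *b)
{
    return a->rel_id == b->rel_id &&
           a->fork_num == b->fork_num &&
           a->segno == b->segno;
}

/*
 * Append req unless it duplicates the entry queued immediately before it.
 * Returns true if the request is covered (newly queued or already the last
 * entry), false if the queue is full and the caller needs a fallback.
 * Assumes the caller holds the lock that protects the queue.
 */
static bool enqueue_fsync_request(RequestQueue *q, FsyncRequest req)
{
    assert(q != NULL);

    /* One comparison against the tail; cheap next to taking the lock. */
    if (q->count > 0 && same_request(&q->items[q->count - 1], &req))
        return true;

    if (q->count >= QUEUE_CAPACITY)
        return false;

    q->items[q->count++] = req;
    return true;
}
```

A bulk load that keeps dirtying the same segment collapses to one queue entry under this scheme, while requests that alternate between two segments are not de-duplicated at all; that is the trade-off of looking only at the tail.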

I have also been wondering about de-duping on the backend side, but
the problem is that if a backend remembers its last few requests,
it doesn't know when that cache has to be cleared because of a new
checkpoint cycle starting.  We could advertise the current cycle
number in shared memory, but you'd still need to take a lock to
read it.  (If we had memory fence primitives it could be a bit
cheaper, but I dunno how much.)
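
As a rough illustration of the backend-side variant, the sketch below uses C11 atomics as a stand-in for the memory fence primitives mentioned above. Everything in it is an assumption: the checkpoint_cycle counter, the BackendCache type, RECENT_SLOTS, and both functions are hypothetical, and the counter is an ordinary static variable here, not real shared memory. It also ignores the race where a new cycle starts between reading the counter and the checkpointer absorbing the request, which a real design would have to handle.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request identity, as a backend might remember it. */
typedef struct FsyncTag {
    unsigned int rel_id;
    int          fork_num;
    unsigned int segno;
} FsyncTag;

/* Stand-in for a cycle number advertised in shared memory. */
static _Atomic unsigned int checkpoint_cycle;

#define RECENT_SLOTS 4  /* "last few requests"; size is an assumption */

typedef struct BackendCache {
    unsigned int seen_cycle;           /* cycle the cache contents belong to */
    size_t       used;                 /* number of valid slots */
    size_t       next;                 /* slot to overwrite next */
    FsyncTag     recent[RECENT_SLOTS];
} BackendCache;

/* Checkpointer side: announce that a new checkpoint cycle has started. */
static void checkpointer_begin_cycle(void)
{
    atomic_fetch_add_explicit(&checkpoint_cycle, 1, memory_order_release);
}

/*
 * Backend side: returns true if the request still has to be forwarded,
 * false if an identical request was already sent during this cycle.
 * The cycle number is read with an atomic load, not under a lock.
 */
static bool backend_should_forward(BackendCache *c, FsyncTag tag)
{
    assert(c != NULL);

    unsigned int cycle =
        atomic_load_explicit(&checkpoint_cycle, memory_order_acquire);

    /* A new cycle invalidates everything the backend remembered. */
    if (cycle != c->seen_cycle) {
        c->seen_cycle = cycle;
        c->used = 0;
        c->next = 0;
    }

    for (size_t i = 0; i < c->used; i++) {
        if (c->recent[i].rel_id == tag.rel_id &&
            c->recent[i].fork_num == tag.fork_num &&
            c->recent[i].segno == tag.segno)
            return false;
    }

    /* Remember the request, overwriting the oldest slot when full. */
    c->recent[c->next] = tag;
    c->next = (c->next + 1) % RECENT_SLOTS;
    if (c->used < RECENT_SLOTS)
        c->used++;
    return true;
}
```

Whether the atomic load is meaningfully cheaper than taking a lock depends on the platform and on contention, so the sketch does not settle the cost question raised above.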
        regards, tom lane

