Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps
Date
Msg-id 54805306.4030100@vmware.com
Whole thread Raw
In response to Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps  (Petr Jelinek <petr@2ndquadrant.com>)
Responses Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On 12/04/2014 01:47 PM, Petr Jelinek wrote:
> On 04/12/14 12:26, Heikki Linnakangas wrote:
>> On 12/04/2014 01:16 PM, Petr Jelinek wrote:
>>> On 04/12/14 10:42, Heikki Linnakangas wrote:
>>>> On 12/03/2014 04:54 PM, Alvaro Herrera wrote:
>>>>> ir commit timestamp directly as they commit,
>>>>> or an external transaction c
>>>>
>>>> Sorry, I'm late to the party, but here's some random comments on this
>>>> after a quick review:
>>>>
>>>> * The whole concept of a node ID seems to be undocumented, and unused.
>>>> No-one calls CommitTsSetDefaultNodeId(). The whole SET_TS record type is
>>>> dead code too, when a non-default node_id is not set. Please remove, or
>>>> explain how all that's supposed to work.
>>>
>>> That's API meant to be used by extensions, maybe it would be preferable
>>> if there was SQL API exposed also?
>>>
>>> (It might also make more sense once replication identifiers are
>>> submitted.)
>>
>> Maybe. Hard to tell without any documentation or comments on how it's
>> supposed to work. In unacceptable in its current state.
>>
>> I would prefer to remove it now, and add it back later together with the
>> code that actually uses it. I don't like having unused internal
>> functions and APIs sitting the source tree in the hope that someone will
>> find them useful in the future. It's more likely that it will bitrot, or
>> not actually work as intended, when someone later tries to use it.
>
> The thing is I already have extension for 9.4 where I would use this API
> for conflict detection if it was available so it's not about hoping for
> somebody to find it useful. There was discussion about this during the
> review process because the objection was raised already then.

Yeah, it was raised. I don't think it was really addressed. There was 
lengthy discussion on whether to include LSN, node id, and/or some other 
information, but there was no discussion on:

* What is a node ID?
* How is it used?
* Who assigns it?

etc.

None of those things are explained in the documentation nor code comments.

The node ID sounds suspiciously like the replication identifiers that 
have been proposed a couple of times. I don't remember if I like 
replication identifiers or not, but I'd rather discuss that issue 
explicitly and separately. I don't want the concept of a 
replication/node ID to fly under the radar like this.

>> What would the SQL API look like, and what would it be used for?
>
> It would probably mirror the C one, from my POV it's not needed but
> maybe some replication solution would prefer to use this without having
> to write C components...

I can't comment on that, because without any documentation, I don't even 
know what the C API is.

BTW, why is it OK that the node-ID of a commit is logged as a separate 
WAL record, after the commit record? That means that it's possible that 
a transaction commits, but you crash just before writing the SETTS 
record, so that after replay the transaction appears committed but with 
the default node ID.

(Maybe that's OK given the intended use case, but it's hard to tell 
without any documentation. See a pattern here? ;-) )

- Heikki



pgsql-hackers by date:

Previous
From: Petr Jelinek
Date:
Subject: Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps
Next
From: Heikki Linnakangas
Date:
Subject: Re: SSL regression test suite