Thread: Representation of ResourceOwnerIds (transient XIDs) in system views (lazy xid assignment)

Hi

Since generating transient XIDs (named ResourceOwnerIds in my patch, since
their lifetime is coupled to the lifetime of a transaction's toplevel
resource owner) seems to be the way to go for lazy xid assignment, I need
to find a way to represent them in the pg_locks view.

ResourceOwnerIds are a structure composed of two uint32s: a processId
(could be the PID of the backend, but to make sure that it isn't reused
too quickly, it's actually a synthetic ID generated at backend start),
and a localTransactionId, which is just incremented whenever a new
transaction is started in a backend. This design was the result of my
discussion with Tom - its main advantage is that it needs no lock to
generate a new ResourceOwnerId.
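
To illustrate the idea, here is a minimal sketch of such an ID and its
lock-free generation. It is not code from the patch: the function names
rid_backend_init and rid_next, the variable currentRid, and the way the
synthetic processId is obtained are all assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout; the field names follow the description above. */
typedef struct ResourceOwnerId
{
    uint32_t    processId;          /* synthetic per-backend ID, not the OS PID */
    uint32_t    localTransactionId; /* per-backend transaction counter */
} ResourceOwnerId;

/*
 * Backend-local state. Only the owning backend ever modifies it, which is
 * why generating a new ID needs no lock.
 */
static ResourceOwnerId currentRid;

/* Called once at backend start; how the synthetic ID is chosen is assumed. */
void
rid_backend_init(uint32_t syntheticProcessId)
{
    /* Two uint32 fields are expected to occupy exactly 8 bytes. */
    assert(sizeof(ResourceOwnerId) == 8);

    currentRid.processId = syntheticProcessId;
    currentRid.localTransactionId = 0;
}

/* Called whenever a new transaction starts in this backend. */
ResourceOwnerId
rid_next(void)
{
    /* Unsigned overflow is well defined: the counter wraps at 2^32. */
    currentRid.localTransactionId++;
    return currentRid;
}
```

In this sketch the first transaction of a backend gets localTransactionId 1,
the second gets 2, and so on, while processId stays fixed for the lifetime
of the backend.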

I see 4 possibilities to represent this in system views:

A) Make ResourceOwnerId a full-blown type, with in and out methods,
   very similar to tids. "processId/localTransactionId" would be a
   natural string representation.

B) Just convert the ResourceOwnerId into a string in pg_lock_status.
   Looks quite similar to (A) from a user's point of view, but the
   implementation is much shorter.

C) Combine the two uint32 fields of ResourceOwnerId into an int8.
   Might be more efficient than (B). The main disadvantage is that
   some ResourceOwnerIds will be represented by *negative* integers,
   which is pretty ugly.

D) Just make them two int4 fields. This has the same "negativity"
   issue that (C) has, and might cause confusion if users don't
   read the docs carefully.

I'm leaning towards (A), but it adds a lot of new code (although most of
it would be copied nearly 1-to-1 from tid.c) for maybe too little gain.

If (A) is deemed not appropriate, doing (C) and restricting processIds
to < 0x80000000 might be an option.
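
The "negativity" issue of (C) can be seen in a small sketch. It assumes the
processId goes into the high 32 bits and the localTransactionId into the low
32 bits; the function names rid_to_int8 and rid_to_int8_nonnegative are made
up for this example and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Option (C): pack both uint32 fields into one signed 64-bit value. */
int64_t
rid_to_int8(uint32_t processId, uint32_t localTransactionId)
{
    uint64_t    packed = ((uint64_t) processId << 32) | localTransactionId;
    int64_t     result;

    /*
     * Reinterpret the bits as signed. The top bit of processId becomes the
     * sign bit, so any processId of 0x80000000 or above comes out negative.
     */
    memcpy(&result, &packed, sizeof result);
    return result;
}

/* Variant that refuses processIds whose top bit is set. */
int64_t
rid_to_int8_nonnegative(uint32_t processId, uint32_t localTransactionId)
{
    assert(processId < UINT32_C(0x80000000));
    return rid_to_int8(processId, localTransactionId);
}
```

With this layout, processId 1 and localTransactionId 2 give 4294967298,
while processId 0x80000000 with localTransactionId 0 gives the most negative
int8 value.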

greetings, Florian Pflug



"Florian G. Pflug" <fgp@phlo.org> writes:
> Since generating transient XIDs (named ResourceOwnerIds in my patch, since
> their lifetime is coupled to the lifetime of a transaction's toplevel
> resource owner) seems to be the way to go for lazy xid assignment, I need
> to find a way to represent them in the pg_locks view.

This is going very far towards gilding the lily.  Try to avoid loading
the patch down with a new datatype.

I'm inclined to think that it'd be sufficient to show the high half of
the ID (that is, the session number) in pg_locks, because there will
never be cases where there are concurrently existing locks on different
localTransactionIds.  This could probably be displayed in the
transactionID columns, which would mean we're abusing the user-visible
xid datatype, but I don't see much harm in it.
        regards, tom lane


Tom Lane wrote:
> "Florian G. Pflug" <fgp@phlo.org> writes:
>> Since generating transient XIDs (named ResourceOwnerIds in my patch, since
>> their lifetime is coupled to the lifetime of a transaction's toplevel
>> resource owner) seems to be the way to go for lazy xid assignment, I need
>> to find a way to represent them in the pg_locks view.
> 
> This is going very far towards gilding the lily.  Try to avoid loading
> the patch down with a new datatype.
> 
> I'm inclined to think that it'd be sufficient to show the high half of
> the ID (that is, the session number) in pg_locks, because there will
> never be cases where there are concurrently existing locks on different
> localTransactionIds.

Hm.. I'm not too happy with that. If you, for example, join pg_locks to
pg_stat_activity (which would need to show the RID, i.e. the
ResourceOwnerId, too), then you *might* get a bogus result if a
transaction ends and a new one starts on the same backend between the
time pg_lock_status is called and the time the proc array is read.

> This could probably be displayed in the
> transactionID columns, which would mean we're abusing the user-visible
> xid datatype, but I don't see much harm in it.

I'm even more unhappy with that, because the session id of a RID might
coincide with a currently in-use XID.

What about the following:

.) Remove the right-hand side XID from pg_locks (The one holder or
   waiter of the lock). It seems to make more sense to store a RID
   here, and let the user fetch the XID via a join to pg_stat_activity.
   We could also show both the XID (if set) and the RID, but that might
   lead people to believe that their old views or scripts on top of
   pg_locks still work correctly when they actually do not.

.) On the left-hand side (The locked object), add a RID column of type
   int8, containing (2^32)*sessionID + localTransactionId.

.) To prevent the int8 from being negative, we limit the sessionID to
   31 bits - which is still more than enough.

greetings, Florian Pflug



"Florian G. Pflug" <fgp@phlo.org> writes:
> What about the following.
> .) Remove the right-hand side XID from pg_locks (The one holder or waiter
>     of the lock). It seems to make more sense to store a RID here,

Yeah, we have to do that since there might not *be* an XID holding the
lock.  But I still think the session ID would be sufficient here.
(Perhaps we don't need the PID either, although then we'd need to change
pg_stat_activity to provide session id as a join key...)

> .) On the left-hand side (The locked object), add a RID column of type int8,
>     containing (2^32)*sessionID + localTransactionId.

I'm a bit uncomfortable with that since it renders the view completely
useless if you don't have a working int8 type.

> .) To prevent the int8 from being negative, we limit the sessionID to 31 bits -
>     which is still more than enough.

Hmm ... actually, that just begs the question of how many bits we need
at all.  Could we display, say, 24 bits of sessionID and 8 bits of
localXID merged into a column of nominal XID type?  There's a
theoretical risk of false join matches but it seems pretty theoretical,
and a chance match would not break any system functionality anyway since
all internal operations would be working with full-width counters.
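
A minimal sketch of such a merged display value follows. It assumes the low
24 bits of the sessionID and the low 8 bits of the local ID are kept, with
the session bits placed above the local bits; the exact choice of bits is
not settled in this thread, and the names rid_display_value and
rid_display_demo are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Merge 24 bits of session ID and 8 bits of local ID into 32 bits. */
uint32_t
rid_display_value(uint32_t sessionId, uint32_t localTransactionId)
{
    uint32_t    session24 = sessionId & UINT32_C(0x00FFFFFF);
    uint32_t    local8 = localTransactionId & UINT32_C(0xFF);

    return (session24 << 8) | local8;
}

/* Shows the two ways a false join match can arise in this scheme. */
void
rid_display_demo(void)
{
    /* Same session, local IDs 256 apart: same displayed value. */
    assert(rid_display_value(1, 2) == rid_display_value(1, 258));

    /* Sessions 2^24 apart, same local ID: same displayed value. */
    assert(rid_display_value(1, 2) ==
           rid_display_value(UINT32_C(0x01000001), 2));
}
```

Only the displayed value is truncated here; the full-width sessionID and
local ID would still be used for all internal operations.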
        regards, tom lane


Tom Lane wrote:
> "Florian G. Pflug" <fgp@phlo.org> writes:
>> What about the following.
>> .) Remove the right-hand side XID from pg_locks (The one holder or waiter
>>     of the lock). It seems to make more sense to store a RID here,
> 
> Yeah, we have to do that since there might not *be* an XID holding the
> lock.  But I still think the session ID would be sufficient here.
> (Perhaps we don't need the PID either, although then we'd need to change
> pg_stat_activity to provide session id as a join key...)

Yeah, the PID seems to be redundant if we add the RID. But OTOH it does no
harm to leave it there - unlike the xid, which gives a false sense
of security. Don't know what our policy for system-catalog
backwards-compatibility is, though...

>> .) On the left-hand side (The locked object), add a RID column of type int8,
>>     containing (2^32)*sessionID + localTransactionId.
> 
> I'm a bit uncomfortable with that since it renders the view completely
> useless if you don't have a working int8 type.

Yeah, I only now realized that int8 really *is* busted if INT64_IS_BUSTED is
defined. I always thought that there is some kind of emulation code in place,
but apparently there isn't. :-( So there goes this idea....

>> .) To prevent the int8 from being negative, we limit the sessionID to 31 bits -
>>     which is still more than enough.
> 
> Hmm ... actually, that just begs the question of how many bits we need
> at all.  Could we display, say, 24 bits of sessionID and 8 bits of
> localXID merged into a column of nominal XID type?  There's a
> theoretical risk of false join matches but it seems pretty theoretical,
> and a chance match would not break any system functionality anyway since
> all internal operations would be working with full-width counters.

Hm.. If we go down that route, we could just calculate some hash value
from sessionID and localTransactionId that fits into 31 bits, and use
an int4. Or 32 bits, and use xid.

I am, however, a bit reluctant to do this. I'd really hate to spend a few hours
tracking down some locking problem, only to find out that I'd been looking in
the wrong place because of some id aliasing... I know it's only a 1-in-4-billion
chance, but still.... it gives me an uneasy feeling.

What about a string representation? Something like sessionId/localTransactionId?
Should we ever decide that indeed this *should* get its own datatype, a string
representation would allow for a very painless transition...
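
A rough sketch of what producing and reading such a string could look like,
using only standard C library calls. The names rid_to_string,
rid_from_string and RID_STR_MAXLEN are invented for this example; an actual
datatype would presumably have in and out functions modelled on tid.c
instead.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* 10 digits + '/' + 10 digits + terminating NUL */
#define RID_STR_MAXLEN 22

/* Format as "sessionId/localTransactionId". */
void
rid_to_string(uint32_t sessionId, uint32_t localTransactionId,
              char *buf, size_t buflen)
{
    assert(buflen >= RID_STR_MAXLEN);
    snprintf(buf, buflen, "%" PRIu32 "/%" PRIu32,
             sessionId, localTransactionId);
}

/*
 * Parse "sessionId/localTransactionId". Returns 1 on success, 0 otherwise.
 * This is deliberately simple: sscanf tolerates leading whitespace and
 * does not check for out-of-range numbers, which a real input function
 * would have to handle.
 */
int
rid_from_string(const char *str, uint32_t *sessionId,
                uint32_t *localTransactionId)
{
    int         consumed = 0;

    if (sscanf(str, "%" SCNu32 "/%" SCNu32 "%n",
               sessionId, localTransactionId, &consumed) != 2)
        return 0;

    /* Reject trailing garbage after the second number. */
    return str[consumed] == '\0';
}
```

For example, session 42 with local transaction 7 is rendered as "42/7", and
parsing that string gives back the same two numbers.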

greetings, Florian Pflug



"Florian G. Pflug" <fgp@phlo.org> writes:
> What about a string representation? Something like
> sessionId/localTransactionId?  Should we ever decide that indeed this
> *should* get its own datatype, a string representation would allow
> for a very painless transition...

Yeah, that's probably the best way.
        regards, tom lane