Thread: Comparing txid and xmin (under BDR)

Comparing txid and xmin (under BDR)

From
Peter Mogensen
Date:
Hi,

I would really like to be able to externally to Postgres at some point
in time later, be able to compare the txid of 2 queries.

Namely:
The INSERT transaction for a certain row in a table,  and
The SELECT transaction reading some other data.

With the one caveat that this has to work with BDR.

So ... I can easily get the current txid of the SELECT transaction by
calling txid_current().

However, - I can't in general compare it to the xmin of the table row.
BDR does ensure (it seems) that xmin is meaningfull within the local
node xid context.

But after a xid wrap-around all txids will be larger than any xmin. And
I can't get the epoch of the xmin value.

BUT ... will this work: ?

Comparing
txid_current() of the SELECT transaction,
to
txid-current()-age(xmin) of the table row?


/Peter



Re: Comparing txid and xmin (under BDR)

From
Craig Ringer
Date:


On 11 May 2015 at 21:10, Peter Mogensen <apm@one.com> wrote:
Hi,

I would really like to be able to externally to Postgres at some point in time later, be able to compare the txid of 2 queries.

Namely:
The INSERT transaction for a certain row in a table,  and
The SELECT transaction reading some other data.

With the one caveat that this has to work with BDR.

So ... I can easily get the current txid of the SELECT transaction by calling txid_current().

Note that by doing so, you force txid allocation for a read-only query that might otherwise not need one, which increases your txid burn rate and decreases time until you need to do wraparound-protection vacuuming.

However, - I can't in general compare it to the xmin of the table row.
BDR does ensure (it seems) that xmin is meaningfull within the local node xid context.

Or, rather, it doesn't do anything different to what PostgreSQL its self does.

I'm still not at all convinced that your desire to use transaction IDs for the cache invalidation stuff you're doing makes sense or will work sensibly even in a single-node environment. It's certainly not going to when comparing between nodes, especially in an async system.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Comparing txid and xmin (under BDR)

From
Peter Mogensen
Date:

On 2015-05-12 06:06, Craig Ringer wrote:
> On 11 May 2015 at 21:10, Peter Mogensen <apm@one.com> wrote:
>
>> So ... I can easily get the current txid of the SELECT transaction by
>> calling txid_current().
>>
> Note that by doing so, you force txid allocation for a read-only query that
> might otherwise not need one, which increases your txid burn rate and
> decreases time until you need to do wraparound-protection vacuuming.

Good point.

> Or, rather, it doesn't do anything different to what PostgreSQL its
> self does. I'm still not at all convinced that your desire to use
> transaction IDs for the cache invalidation stuff you're doing makes
> sense or will work sensibly even in a single-node environment. It's
> certainly not going to when comparing between nodes, especially in an
> async system.

You could be right.

Let's recap....

So - for an external cache in a system with high read-to-write ratio we
can get very high cache hit ratios by having no TTL, but doing cache
invalidation.
(about 98% i practice, so xid burn rate is probably not as bad as is may
sound)

Invalidation events have 1 problem. (apart from how they are generated).
There's a race condition.

We need to ensure that this sequence of events does not happen:
1) cache miss
2) DB read
3) DB invalidation event
4) cache invalidation applied
5) caching of the value read in 2)

  In a standard PostgreSQl setup, we can stop 5) from happening, by
attaching txid_current() to the invalidation event in 3) and
txid_snapshot_xmin() to the cache read and put a tombstone in the cache
when doing invalidations.
(with a relative long TTL)

So ... when the value in 5) is about to get cached and if hits a
tombstone in the cache it is only cached if the tombstone txid is older
than the txid_snapshot_xmin of the new read - ie. if were sure that the
invalidation took place before the value we're about to cache.

This scheme should work with standard Postgres. Also read-only slaves.
But it won't work with BDR, since each node got its' own txid namespace.
Attaching txid_current() to an invalidation event has no meaning on
other nodes.

On the other hand. ... it the invalidation event is stored in a BDR
replicated table, then the xmin of the event row will have a meaning to
the local node. Which was the only way I found to get the invalidation
event placed in the local sequence of transactions.

/Peter






Re: Comparing txid and xmin (under BDR)

From
Peter Mogensen
Date:

On 2015-05-12 06:06, Craig Ringer wrote:
> On 11 May 2015 at 21:10, Peter Mogensen <apm@one.com> wrote:
>
>>
>> So ... I can easily get the current txid of the SELECT transaction by
>> calling txid_current().
>>
> Note that by doing so, you force txid allocation for a read-only query that
> might otherwise not need one, which increases your txid burn rate and
> decreases time until you need to do wraparound-protection vacuuming.
>


The same is not true for txid_snapshot_xmin() is it?

I mean ... I really don't need the actual txid of a SELECT statement.
Only to ensure that it's never than any invalidation event.
So it's enough to just use txid_snapshot_xmin() ... at the cost of
possibly not caching new values in a small window after invalidation.

/Peter