Re: Make unlogged table resets detectable - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Make unlogged table resets detectable
Date
Msg-id ce599380b68de663b7e6cd8531f3a82331ec13ac.camel@j-davis.com
Whole thread Raw
In response to Re: Make unlogged table resets detectable  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, 2021-06-08 at 16:08 -0400, Tom Lane wrote:
> Since we don't put LSNs into unlogged tables, nor would the different
> shards be likely to have equivalent LSNs, I'm not seeing that LSN is
> remarkably better for this than a timestamp.

It requires some other bookkeeping on the part of the sharding
solution. This is ugly (alternative suggestions welcome), but I think
it would work:

1. The sharding code would create on each node:
  CREATE UNLOGGED TABLE unlogged_table_status(
    shard_name regclass,
    last_truncate pg_lsn);

2. When you create an unlogged table, each node would do:
  INSERT INTO unlogged_table_status
  VALUES('my_unlogged_shard', pg_current_wal_flush_lsn())

3. When you TRUNCATE an unlogged table, each node would do:
  UPDATE unlogged_table_status
  SET last_truncate=pg_current_wal_flush_lsn()
  WHERE shard_name='my_unlogged_shard'

4. When connecting to a node and accessing a shard of an unlogged table
for the first time, test whether the shard has been lost with:
  SELECT
  last_truncate <= (pg_control_recovery()).last_recovery_lsn
    AS shard_was_lost
  FROM unlogged_table_status
  WHERE shard_name='my_unlogged_shard'

5. If the shard was lost, truncate all shards for that table on all
nodes (and update the unlogged_table_status on all nodes as in #3).

Not exactly straightforward, but better than the current situation. And
I think it can be made more robust than a timestamp.

> I'd be kind of inclined to say no, short of pg_resetwal, and maybe
> not then.

Agreed, at least until we find some use case that says otherwise.

> > 2. It would be helpful to also know the last time a promotion
> > happened,
> 
> I'm not following this either.  How do you unpromote a node?

What I meant by "node" here is actually a primary+standby pair. Let's
say each primary+standby pair holds one shard of an unlogged table.

In this case, a crash followed by restart is equivalent to a primary
failing over to a promoted standby -- in either case, the shard is
gone, but other shards of the same table may be populated on other
primaries. We need to detect that the shard is gone and then wipe out
all the other shards on the healthy primaries.

You could reasonably say that it's the job of the sharding solution to
keep track of these crashes and handle unlogged tables at the time. But
it's inconvenient to insert more tasks into a sensitive process like
failover/recovery. It's preferable to be able to detect the unlogged
table problem after the fact and handle it when the systems are all up
and stable.

Regards,
    Jeff Davis





pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic
Next
From: Peter Geoghegan
Date:
Subject: Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic