Re: Catalog/Metadata consistency during changeset extraction from wal - Mailing list pgsql-hackers
From: Andres Freund
Subject: Re: Catalog/Metadata consistency during changeset extraction from wal
Date:
Msg-id: 201206251543.40142.andres@2ndquadrant.com
In response to: Re: Catalog/Metadata consistency during changeset extraction from wal (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Catalog/Metadata consistency during changeset extraction from wal
List: pgsql-hackers
On Monday, June 25, 2012 03:08:51 AM Robert Haas wrote:
> On Sun, Jun 24, 2012 at 5:11 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > There are some interesting problems related to locking and snapshots
> > here. Not sure if they are resolvable:
> >
> > We need to restrict SnapshotNow to represent the view it had back when
> > the wal record we're currently decoding was written. Otherwise we would
> > possibly get wrong column types and similar. As we're working in the
> > past, locking doesn't protect us against much here. I have that (mostly
> > and inefficiently).
> >
> > One interesting problem is table rewrites (TRUNCATE, CLUSTER, some
> > ALTER TABLEs) and dropped tables. Because we nudge SnapshotNow to the
> > past view it had back when the wal record was created, we get the old
> > relfilenode, which might have been dropped as part of the transaction
> > cleanup...
> > With most types that's not a problem. Even things like records and
> > arrays aren't problematic. More interesting cases include VACUUM FULL
> > $systable (e.g. pg_enum) and VACUUM FULLing a table which is used in
> > the *_out function of a type (like a user-level pg_enum implementation).
> >
> > The only theoretical way I see around that problem would be to postpone
> > all relation unlinks until everything that could possibly read them has
> > finished. That doesn't seem too alluring, although it would be needed if
> > we ever move more things off SnapshotNow.
> >
> > Input/Ideas/Opinions?
> Yeah, this is slightly nasty.  I'm not sure whether or not there's a
> way to make it work.
Postponing all non-rollback unlinks to the next "logical checkpoint" is the
only thing I can think of...

> I had another idea.  Suppose decoding happens directly on the primary,
> because I'm still hoping there's a way to swing that.
> Suppose further that we handle DDL by insisting that (1) any backend
> which wants to add columns or change the types of existing columns must
> first wait for logical replication to catch up, and (2) if a backend
> which has added columns or changed the types of existing columns then
> writes to the modified table, decoding of those writes will be postponed
> until transaction commit.  I think that's enough to guarantee that the
> decoding process can just use the catalogs as they stand, with plain
> old SnapshotNow.
I don't think it's that easy. If you e.g. have multiple ALTERs in the same
transaction interspersed with inserted rows, they will all have different
TupleDescs. I don't see how that's resolvable without either replicating
DDL to the target system or changing what SnapshotNow does...

> The downside of this approach is that it makes certain kinds of DDL
> suck worse if logical replication is in use and behind.  But I don't
> necessarily see that as prohibitive because (1) logical replication
> being behind is likely to suck for a lot of other reasons too and (2)
> adding or retyping columns isn't a terribly frequent operation and
> people already expect a hit when they do it.  Also, I suspect that we
> could find ways to loosen those restrictions at least in common cases
> in some future version; meanwhile, less work now.
Agreed.

Andres
--
Andres Freund                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services