Visibility Groups - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Visibility Groups |
Date | |
Msg-id | 1218105518.4549.365.camel@ebony.2ndQuadrant Whole thread Raw |
Responses |
Re: Visibility Groups
Re: Visibility Groups Re: Visibility Groups Re: Visibility Groups Re: Visibility Groups |
List | pgsql-hackers |
Currently, we calculate a single OldestXmin across all snapshots on the assumption that any transaction might access any table. I propose creating "Visibility Groups" that *explicitly* limit the ability of a transaction to access data outside its visibility group(s). By default, visibility_groups would be NULL, implying potential access to all tables. Once set, any attempt to lock an object outside of a transactions defined visibility_groups will result in an error: ERROR attempt to lock table outside of visibility group(s): foo HINT youneed to set a different value for visibility_groups A transaction can only ever reduce or restrict its visibility_groups, it cannot reset or add visibility groups. This would give us the ability to explicitly prevent long running statements from interfering with updates of critical tables, when those tables are not accessed. The visibility_groups list would be held on each PGPROC, so that a snapshot will be able to calculate an xmin for each visibility group. When checking visibility of rows using HeapTupleSatisfiesVacuum() we would use the oldest xmin for the visibility group of the table, rather than the single global value. If no visibility groups are in use then everything happens just as it does now, with only a single "if" test in GetSnapshotData() and HeapTupleSatisfiesVacuum(). Patch would require providing info through to HeapTupleSatisfiesVacuum() to allow it to resolve the appropriate xmin to use for visibility checks, since it will have more than one to choose from. Various ways of doing that. We might like the visibility group to be set automatically but that seems like a harder problem. I do not propose to solve that here. This general idea has been proposed before, but we always get hung up on our inability to automatically determine the visibility group. Let's just do this explicitly, so we can get benefit in the cases where we know very clearly which tables we'll access and more importantly, which we won't. How do we specify visibility groups? Well various ways, various syntax, so that is up for discussion and debate. This might be the place the concept falls down, maybe where it starts having wings. * SET visibility_groups = table, schema.* * define visibility groups using a function: create_visibility_group('name', 'table/schema list') * specify them via ALTER TABLE etc This idea is connected somewhat to replication, so floating it now to see how viable a concept people think this is. I'm floating the idea in a fairly neutral way in the hope that it leads others to even better/more workable proposals, possibly now, possibly over the next few years. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
pgsql-hackers by date: