Thread: Visibility Groups
Currently, we calculate a single OldestXmin across all snapshots on the assumption that any transaction might access any table.

I propose creating "Visibility Groups" that *explicitly* limit the ability of a transaction to access data outside its visibility group(s). By default, visibility_groups would be NULL, implying potential access to all tables.

Once set, any attempt to lock an object outside of a transaction's defined visibility_groups will result in an error:

  ERROR: attempt to lock table outside of visibility group(s): foo
  HINT: you need to set a different value for visibility_groups

A transaction can only ever reduce or restrict its visibility_groups; it cannot reset or add visibility groups.

This would give us the ability to explicitly prevent long running statements from interfering with updates of critical tables, when those tables are not accessed.

The visibility_groups list would be held on each PGPROC, so that a snapshot will be able to calculate an xmin for each visibility group. When checking visibility of rows using HeapTupleSatisfiesVacuum() we would use the oldest xmin for the visibility group of the table, rather than the single global value. If no visibility groups are in use then everything happens just as it does now, with only a single "if" test in GetSnapshotData() and HeapTupleSatisfiesVacuum().

The patch would require providing info through to HeapTupleSatisfiesVacuum() to allow it to resolve the appropriate xmin to use for visibility checks, since it will have more than one to choose from. There are various ways of doing that.

We might like the visibility group to be set automatically, but that seems like a harder problem. I do not propose to solve that here. This general idea has been proposed before, but we always get hung up on our inability to automatically determine the visibility group. Let's just do this explicitly, so we can get benefit in the cases where we know very clearly which tables we'll access and, more importantly, which we won't.

How do we specify visibility groups? Various ways, various syntaxes, so that is up for discussion and debate. This might be the place the concept falls down, or maybe where it starts having wings.

* SET visibility_groups = table, schema.*
* define visibility groups using a function: create_visibility_group('name', 'table/schema list')
* specify them via ALTER TABLE etc.

This idea is connected somewhat to replication, so I'm floating it now to see how viable a concept people think this is. I'm floating the idea in a fairly neutral way in the hope that it leads others to even better/more workable proposals, possibly now, possibly over the next few years.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
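[A minimal usage sketch to make the proposal concrete. The GUC name and the error/hint text come from the proposal above; the SET LOCAL form and the table names are hypothetical.]

    BEGIN;
    -- promise to touch nothing but event_log for the rest of this
    -- transaction (the catalogs are implicitly included)
    SET LOCAL visibility_groups = 'event_log';

    SELECT count(*) FROM event_log;    -- fine: inside the group

    SELECT count(*) FROM user_emails;  -- outside the group: refused
    ERROR: attempt to lock table outside of visibility group(s): user_emails
    HINT: you need to set a different value for visibility_groups

    COMMIT;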
On Thu, Aug 7, 2008 at 12:38 PM, Simon Riggs wrote:
> I propose creating "Visibility Groups" that *explicitly* limit the
> ability of a transaction to access data outside its visibility group(s).

Doesn't every transaction need to access data from the catalogs? Wouldn't the inclusion of a catalogs visibility group in every transaction negate any potential benefits?

> Once set, any attempt to lock an object outside of a transaction's
> defined visibility_groups will result in an error:

Or is this about locking and not about data access?

Jochem
On Thu, 2008-08-07 at 12:55 +0200, Jochem van Dieten wrote:
> On Thu, Aug 7, 2008 at 12:38 PM, Simon Riggs wrote:
> > I propose creating "Visibility Groups" that *explicitly* limit the
> > ability of a transaction to access data outside its visibility group(s).
>
> Doesn't every transaction need to access data from the catalogs?
> Wouldn't the inclusion of a catalogs visibility group in every
> transaction negate any potential benefits?

True, but I don't see the catalogs as frequently updated data. The objective is to isolate frequently updated tables from long running statements that don't need to access them.

Tables can be in multiple visibility groups, perhaps that wasn't clear. When we seek to vacuum a table, we take the lowest xmin of any group it was in when we took the snapshot.

e.g. a long running statement accesses table L, so it must access table L and catalog tables only. We set this explicitly. Frequently updated table F is accessed by general transactions that have set no visibility group, i.e. the implicit group is "all tables".

So the catalogs and table L would be in two groups, while F is in only one. As a result, the xmin used for table F will be later than the one used for table L, because the long running statement's transaction is not included in the calculation of the xmin for table F. The transaction accessing L has explicitly defined the limit of its access, so removing rows from F is possible without breaking MVCC.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
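[A sketch of the xmin arithmetic in that example, with made-up xids and table names; the visibility_groups syntax is the hypothetical one from the proposal.]

    -- Session A: the long running statement, xid 100
    SET visibility_groups = 'l_table';
    SELECT count(*) FROM l_table;      -- runs for hours

    -- Session B: ordinary traffic, xid 200, no group set
    UPDATE f_table SET val = val + 1;

    -- Per-group xmins as a later snapshot would compute them:
    --   group containing l_table (and catalogs): min(100, 200) = 100
    --   implicit "all tables" group, used for f_table: 200
    -- So VACUUM may remove rows in f_table that died before xid 200,
    -- even though xid 100 is still running.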
"Simon Riggs" <simon@2ndquadrant.com> writes: > Currently, we calculate a single OldestXmin across all snapshots on the > assumption that any transaction might access any table. > > I propose creating "Visibility Groups" that *explicitly* limit the > ability of a transaction to access data outside its visibility group(s). > By default, visibility_groups would be NULL, implying potential access > to all tables. > > Once set, any attempt to lock an object outside of a transactions > defined visibility_groups will result in an error: > ERROR attempt to lock table outside of visibility group(s): foo > HINT you need to set a different value for visibility_groups > A transaction can only ever reduce or restrict its visibility_groups, it > cannot reset or add visibility groups. Hm, so backing up a bit from the specific proposed interface, the key here is being able to explicitly mark which tables your transaction will need in the future? Is it always just a handful of heavily updated tables that you want to protect? In that case we could have a lock type which means "I'll never need to lock this object". Then a session could issue "LOCK TABLE foo IN INACCESSIBLE MODE" or something like that. That requires people to hack up their pg_dump or replication script though which might be awkward. Perhaps the way to do that would be to preemptively take locks on all the objects that you'll need, then have a command to indicate you won't need any further objects beyond those. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
Simon Riggs wrote:
> On Thu, 2008-08-07 at 12:55 +0200, Jochem van Dieten wrote:
>> On Thu, Aug 7, 2008 at 12:38 PM, Simon Riggs wrote:
>>> I propose creating "Visibility Groups" that *explicitly* limit the
>>> ability of a transaction to access data outside its visibility group(s).
>> Doesn't every transaction need to access data from the catalogs?
>> Wouldn't the inclusion of a catalogs visibility group in every
>> transaction negate any potential benefits?
>
> True, but I don't see the catalogs as frequently updated data. The
> objective is to isolate frequently updated tables from long running
> statements that don't need to access them.
>
> Tables can be in multiple visibility groups, perhaps that wasn't clear.
> When we seek to vacuum a table, we take the lowest xmin of any group it
> was in when we took the snapshot.

I'm not sure if "visibility group" is the best name for this - I had to go away and think through what you meant about that last bit. Have I got this right?

So - a "visibility group" is attached to a transaction.

My long-running transaction T0 can restrict itself to <catalogues> and table "event_log".

Various other transactions T1..Tn make no promises about what they are going to access. They all share the "null visibility group".

A table "user_emails" is in the "null visibility group" and can be vacuumed based on whatever the lowest xid of T1..Tn is.

Table "event_log" is in both groups and can only be vacuumed based on T0..Tn (presumably T0 is the oldest, since that's the point of the exercise).

An attempt to write to user_emails by T0 will fail with an error.

An attempt to read from user_emails by T0 will be allowed?

What happens if I'm in ISOLATION LEVEL SERIALIZABLE? Presumably the read is disallowed then too?

--
Richard Huxton
Archonet Ltd
On Thu, 2008-08-07 at 13:30 +0100, Gregory Stark wrote:

> Hm, so backing up a bit from the specific proposed interface, the key here is
> being able to explicitly mark which tables your transaction will need in the
> future?

Think of it as a promise to touch nothing except a specific set of tables. Whether you lock/access the full set of tables in the visibility group is up to you. Visibility groups would likely have overhead, so specifying them more loosely would allow reasonable administration, yet retain benefit.

> Is it always just a handful of heavily updated tables that you want to
> protect?

Possibly. It seems easier to specify what you might touch, since security may prevent you even knowing about the existence of other tables.

> In that case we could have a lock type which means "I'll never need
> to lock this object". Then a session could issue "LOCK TABLE foo IN
> INACCESSIBLE MODE" or something like that. That requires people to hack up
> their pg_dump or replication script though, which might be awkward.

Possibly. The main point is that this information needs to be available on PGPROC, so that new snapshots can see the visibility groups and then calculate OldestXmin for each object as a result.

> Perhaps the way to do that would be to preemptively take locks on all the
> objects that you'll need, then have a command to indicate you won't need any
> further objects beyond those.

I seem to recall we track which objects we have already locked. Perhaps we would use the explicit visibility group to mark all objects we are allowed to lock, or mark those to avoid, depending upon which is the smaller list. Then any attempt to lock a new object would be straightforwardly refused. We must do that cheaply, so that precludes any use of shared resources such as the lock table.

If you'd like to run with this, I doubt I will have time to do anything with it this release.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
On Thu, 2008-08-07 at 14:18 +0100, Richard Huxton wrote:
> Simon Riggs wrote:
> > On Thu, 2008-08-07 at 12:55 +0200, Jochem van Dieten wrote:
> >> On Thu, Aug 7, 2008 at 12:38 PM, Simon Riggs wrote:
> >>> I propose creating "Visibility Groups" that *explicitly* limit the
> >>> ability of a transaction to access data outside its visibility group(s).
> >> Doesn't every transaction need to access data from the catalogs?
> >> Wouldn't the inclusion of a catalogs visibility group in every
> >> transaction negate any potential benefits?
> >
> > True, but I don't see the catalogs as frequently updated data. The
> > objective is to isolate frequently updated tables from long running
> > statements that don't need to access them.
> >
> > Tables can be in multiple visibility groups, perhaps that wasn't clear.
> > When we seek to vacuum a table, we take the lowest xmin of any group it
> > was in when we took the snapshot.
>
> I'm not sure if "visibility group" is the best name for this - I had to
> go away and think through what you meant about that last bit. Have I got
> this right?
>
> So - a "visibility group" is attached to a transaction.

Perhaps visibility_scope might be a better name. See below.

> My long-running transaction T0 can restrict itself to <catalogues> and
> table "event_log".
>
> Various other transactions T1..Tn make no promises about what they are
> going to access. They all share the "null visibility group".

OK, good example.

> A table "user_emails" is in the "null visibility group" and can be
> vacuumed based on whatever the lowest xid of T1..Tn is.
>
> Table "event_log" is in both groups and can only be vacuumed based on
> T0..Tn (presumably T0 is the oldest, since that's the point of the
> exercise).
>
> An attempt to write to user_emails by T0 will fail with an error.

All of the above is correct.

The point of doing this is that *if* T0 becomes the oldest transaction it will *not* interfere with removal of rows on "user_emails".

> An attempt to read from user_emails by T0 will be allowed?

No, reads must also be excluded, otherwise MVCC will be violated.

> What happens if I'm in ISOLATION LEVEL SERIALIZABLE? Presumably the read
> is disallowed then too?

No, that's not relevant. That is your choice about how often you update your snapshot of the database. The visibility group refers to the *scope* of the snapshot, so the two things are orthogonal.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
>>> Simon Riggs wrote:
> Currently, we calculate a single OldestXmin across all snapshots on the
> assumption that any transaction might access any table.
>
> I propose creating "Visibility Groups" that *explicitly* limit the
> ability of a transaction to access data outside its visibility group(s).
> By default, visibility_groups would be NULL, implying potential access
> to all tables.
>
> Once set, any attempt to lock an object outside of a transaction's
> defined visibility_groups will result in an error:
>   ERROR: attempt to lock table outside of visibility group(s): foo
>   HINT: you need to set a different value for visibility_groups
> A transaction can only ever reduce or restrict its visibility_groups; it
> cannot reset or add visibility groups.
>
> This would give us the ability to explicitly prevent long running
> statements from interfering with updates of critical tables, when those
> tables are not accessed.

I assume this means that tables outside of all transactions' visibility groups could be vacuumed?

This is something which I would rarely use, but it might have been useful just this week. Tom wanted an EXPLAIN ANALYZE of a query which is very long running, but only SELECTs from two tables and updates nothing. While the costing, if consistent with other queries on the box, suggests that the query would run for years, I was willing to give it a week or two just to see if it would finish in that time; but that would have left the database horribly bloated across many tables. As it is a standby machine which might need to be brought into production at any time, that wasn't feasible. This would have let me make the try.

> We might like the visibility group to be set automatically, but that
> seems like a harder problem. I do not propose to solve that here.

It certainly seems hard (impossible?) for the general case, but when there is an implicit transaction, which is guaranteed to be a single statement, it seems like an implicit visibility group might not be out of the question, and that might help in a lot of situations.

-Kevin
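[For the case described, usage might have looked something like this; the visibility_groups syntax is the hypothetical one from the proposal, and the table names are made up.]

    BEGIN;
    SET LOCAL visibility_groups = 'big_table_a, big_table_b';
    EXPLAIN ANALYZE
    SELECT count(*)
      FROM big_table_a a
      JOIN big_table_b b ON b.a_id = a.id;
    -- could run for weeks without blocking row removal
    -- in any table outside the group
    ROLLBACK;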
Simon Riggs wrote:
> On Thu, 2008-08-07 at 14:18 +0100, Richard Huxton wrote:
>>
>> An attempt to write to user_emails by T0 will fail with an error.
>
> All of the above is correct.
>
> The point of doing this is that *if* T0 becomes the oldest transaction
> it will *not* interfere with removal of rows on "user_emails".
>
>> An attempt to read from user_emails by T0 will be allowed?
>
> No, reads must also be excluded, otherwise MVCC will be violated.

Ah good - I was wondering, but I read your first email as allowing reads.

>> What happens if I'm in ISOLATION LEVEL SERIALIZABLE? Presumably the read
>> is disallowed then too?
>
> No, that's not relevant. That is your choice about how often you update
> your snapshot of the database. The visibility group refers to the
> *scope* of the snapshot, so the two things are orthogonal.

So - effectively we're partitioning the database into (possibly overlapping) subsets of tables. Would it simplify things at all to have a centrally-defined list of "visibility scopes" (or "groups") which your transaction/user can access? As a DBA, I'd rather have somewhere central to manage this, and I'd probably make it per-user anyway.

--
Richard Huxton
Archonet Ltd
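[Something like the following, perhaps. create_visibility_group() is the hypothetical function from the original proposal; the ALTER ROLE form is equally hypothetical.]

    -- defined once, centrally, by the DBA
    SELECT create_visibility_group('reporting', 'event_log, archive.*');

    -- attached per-user rather than set inside each transaction
    ALTER ROLE analyst SET visibility_groups = 'reporting';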
Simon Riggs wrote:
>
> Currently, we calculate a single OldestXmin across all snapshots on the
> assumption that any transaction might access any table.
>
> I propose creating "Visibility Groups" that *explicitly* limit the
> ability of a transaction to access data outside its visibility group(s).
> By default, visibility_groups would be NULL, implying potential access
> to all tables.

I think this is a cumbersome thing to use. We can do better -- right now we keep closer track of open snapshots, which means that if there is a long running transaction that refreshes its snapshots periodically, vacuum doesn't need to keep all the dead rows that it can no longer see.

We can improve vacuum to be able to remove a lot more dead rows than we do currently. This is invisible to the user, which IMHO is a better user interface than what you are proposing.

--
Alvaro Herrera          http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
On Thu, 2008-08-07 at 10:20 -0400, Alvaro Herrera wrote:
> Simon Riggs wrote:
> >
> > Currently, we calculate a single OldestXmin across all snapshots on the
> > assumption that any transaction might access any table.
> >
> > I propose creating "Visibility Groups" that *explicitly* limit the
> > ability of a transaction to access data outside its visibility group(s).
> > By default, visibility_groups would be NULL, implying potential access
> > to all tables.
>
> I think this is a cumbersome thing to use. We can do better -- right
> now we keep closer track of open snapshots, which means that if there is
> a long running transaction that refreshes its snapshots periodically,
> vacuum doesn't need to keep all the dead rows that it can no longer see.
>
> We can improve vacuum to be able to remove a lot more dead rows than we
> do currently. This is invisible to the user, which IMHO is a better
> user interface than what you are proposing.

Agreed, but this is a step even beyond that.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Simon Riggs <simon@2ndquadrant.com> writes:
> I propose creating "Visibility Groups" that *explicitly* limit the
> ability of a transaction to access data outside its visibility group(s).
> By default, visibility_groups would be NULL, implying potential access
> to all tables.

I think this would be a lot of mechanism and complication that will go completely unused in the field. It'll be impossible even to explain, let alone to use effectively, for anyone who's not intensely steeped in the details of MVCC.

			regards, tom lane
> I think this would be a lot of mechanism and complication that will go
> completely unused in the field. It'll be impossible even to explain, let
> alone to use effectively, for anyone who's not intensely steeped in the
> details of MVCC.

+1.

This proposal sounds like it would target batch jobs, because those are the kinds of jobs where you can predict in advance what tables will be needed. I don't know whether my personal set of problems with MVCC syncs up with anyone else's, but this is rarely how I get bitten. Usually, what happens is that a user session (psql or web server connection) gets left in a transaction for days or weeks. Now the batch jobs (which are doing lots of updates) start creating tons of bloat, but it's not their snapshot that is causing the problem.

...Robert
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > Simon Riggs <simon@2ndquadrant.com> writes: >> I propose creating "Visibility Groups" that *explicitly* limit the >> ability of a transaction to access data outside its visibility group(s). >> By default, visibility_groups would be NULL, implying potential access >> to all tables. > > I think this would be a lot of mechanism and complication that will go > completely unused in the field. It'll be impossible even to explain let > alone to use effectively, for anyone who's not intensely steeped in the > details of MVCC. I think Simon's interface was overly complex but if we can simplify it then it could be useful. As Grittner mentioned implicit queries could make use of it automatically. Also pg_dump or Slony could make use of it automatically. And while Alvaro is absolutely right that we should take care of the inaccessible records between an old long-running transaction and more recently started transactions that doesn't really completely solve the problem. If you have a large table consisting of old records which are gradually being replaced with newer records a long-running transaction will prevent any of those old records from being vacuumed. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
"Gregory Stark" <stark@enterprisedb.com> writes: > I think Simon's interface was overly complex but if we can simplify it then it > could be useful. As Grittner mentioned implicit queries could make use of it > automatically. Also pg_dump or Slony could make use of it automatically. Sorry "implicit transactions" -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Gregory Stark wrote:
> I think Simon's interface was overly complex, but if we can simplify it then it
> could be useful. As Grittner mentioned, implicit queries could make use of it
> automatically. Also pg_dump or Slony could make use of it automatically.

Hmm, what use would it have for pg_dump? Normally, when it is a problem, you want access to a large set of tables, frequently all of them.

--
Alvaro Herrera          http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
On Thu, 2008-08-07 at 10:28 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > I propose creating "Visibility Groups" that *explicitly* limit the
> > ability of a transaction to access data outside its visibility group(s).
> > By default, visibility_groups would be NULL, implying potential access
> > to all tables.
>
> I think this would be a lot of mechanism and complication that will go
> completely unused in the field. It'll be impossible even to explain, let
> alone to use effectively, for anyone who's not intensely steeped in the
> details of MVCC.

Yes, but it's early days. Given so many people use Oracle currently, I don't think it's a stretch for people to understand enough of the internals to realise it's a Bad Thing. Developers need to know about serializability and suchlike to write correct applications, in some cases. If not, it's just a simple equation:

  Long Running Statement + Access to Heavily Updated Tables = Bad Thing

so having the ability to prevent access to heavily updated tables helps to avoid the Bad Thing.

Anyway, the reason for mentioning it now was for people to mull it over a longer period of time. I have zero interest in any specific user interface, so that the idea can morph into something that does work, probably not even written by me. Not pursuing this further right now.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
"Alvaro Herrera" <alvherre@commandprompt.com> writes: > Gregory Stark wrote: > >> I think Simon's interface was overly complex but if we can simplify it then it >> could be useful. As Grittner mentioned implicit queries could make use of it >> automatically. Also pg_dump or Slony could make use of it automatically. > > Hmm, what use would it have for pg_dump? Normally, when it is a > problem, you want access to a large set of tables, frequently all of > them. Well pg_dump -t or pg_restore -L For that matter pg_restore generally knows it's not going to access any existing tables that don't match the tables it's restoring... -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
On Thu, Aug 07, 2008 at 01:30:27PM +0100, Gregory Stark wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
>
> > Currently, we calculate a single OldestXmin across all snapshots on the
> > assumption that any transaction might access any table.
> >
> > I propose creating "Visibility Groups" that *explicitly* limit the
> > ability of a transaction to access data outside its visibility group(s).
> > By default, visibility_groups would be NULL, implying potential access
> > to all tables.
> >
> > Once set, any attempt to lock an object outside of a transaction's
> > defined visibility_groups will result in an error:
> >   ERROR: attempt to lock table outside of visibility group(s): foo
> >   HINT: you need to set a different value for visibility_groups
> > A transaction can only ever reduce or restrict its visibility_groups; it
> > cannot reset or add visibility groups.
>
> Hm, so backing up a bit from the specific proposed interface, the key here is
> being able to explicitly mark which tables your transaction will need in the
> future?
>
> Is it always just a handful of heavily updated tables that you want to
> protect? In that case we could have a lock type which means "I'll never need
> to lock this object". Then a session could issue "LOCK TABLE foo IN
> INACCESSIBLE MODE" or something like that. That requires people to hack up
> their pg_dump or replication script though, which might be awkward.
>
> Perhaps the way to do that would be to preemptively take locks on all the
> objects that you'll need, then have a command to indicate you won't need any
> further objects beyond those.

+1

-dg

--
David Gould       daveg@sonic.net      510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.
On Aug 7, 2008, at 9:49 AM, Robert Haas wrote:
> This proposal sounds like it would target batch jobs, because those
> are the kinds of jobs where you can predict in advance what
> tables will be needed. I don't know whether my personal set of
> problems with MVCC syncs up with anyone else's, but this is rarely how
> I get bitten. Usually, what happens is that a user session (psql or
> web server connection) gets left in a transaction for days or weeks.
> Now the batch jobs (which are doing lots of updates) start creating
> tons of bloat, but it's not their snapshot that is causing the
> problem.

We have some cron'd scripts that check for long-running queries, idle in transaction, and just plain idle. The scripts will kill sessions when they sit in different states for too long. It would be nice if the database could handle this (no, statement_timeout won't work, because the user can just change it to whatever they want), but I don't know how many other people have this need. I guess I could at least put the scripts up on pgFoundry...

--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828
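[For reference, the core of such a script can be a simple query against pg_stat_activity (column names as of the 8.3-era catalog). Note that pg_cancel_backend() only cancels a running query; actually killing an idle-in-transaction session in this era still means an OS-level kill of the procpid.]

    -- sessions idle in transaction for more than an hour
    SELECT procpid, usename, now() - xact_start AS xact_age
      FROM pg_stat_activity
     WHERE current_query = '<IDLE> in transaction'
       AND now() - xact_start > interval '1 hour';

    -- cancel the query of a long-running statement (pid from above)
    SELECT pg_cancel_backend(12345);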