Thread: SetQuerySnapshot() for utility statements
I notice that ProcessUtility() calls SetQuerySnapshot() for FETCH and COPY TO statements, and nothing else. Seems to me this is very broken. Isn't a query snapshot needed for any utility command that might do database accesses? regards, tom lane
> I notice that ProcessUtility() calls SetQuerySnapshot() for FETCH > and COPY TO statements, and nothing else. > > Seems to me this is very broken. Isn't a query snapshot needed for > any utility command that might do database accesses? Not needed. We don't support multi-versioning for schema operations. More of that, sometimes it would be better to read *dirty* data from system tables - so, no snapshot required. What is really, hm, not good is that first SetQuerySnapshot defines serializable snapshot for *all* transactions, even for ones with read committed isolevel: in the times of 6.5 I thought about ability to switch between isolevels inside single xaction - this is not required by standard and *bad* for system: just remember that vacuum doesn't clean up deleted tuples if there is some transaction *potentially* interested in them. For read committed xactions must be no serializable snapshot defined and MyProc->xmin must be updated when *each* top-level query begins. Vadim
"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes: >> Seems to me this is very broken. Isn't a query snapshot needed for >> any utility command that might do database accesses? > Not needed. We don't support multi-versioning for schema operations. No? Seems to me we're almost there. Look for instance at that DROP USER bug I just fixed: it was failing because it wasn't careful to make sure that during "DROP USER foo,bar", the loop iteration to delete user bar would see the changes the first loop iteration had made. So even though we use a lot of table-level locking rather than true MVCC behavior for schema changes, ISTM that we still have to play by all the rules when it comes to tuple visibility. In particular I suspect we ought to be using standard query snapshot behavior... > More of that, sometimes it would be better to read *dirty* data from > system tables - so, no snapshot required. There may be a small number of places like that, but for generic utility operations like CREATE/DROP USER, I don't see that this is a good idea. regards, tom lane
> I notice that ProcessUtility() calls SetQuerySnapshot() for FETCH > and COPY TO statements, and nothing else. > > Seems to me this is very broken. Isn't a query snapshot needed for > any utility command that might do database accesses? Not needed. We don't support multi-versioning for schema operations. More of that, sometimes it would be better to read *dirty* data from system tables - so, no snapshot required. What is really, hm, not good is that first SetQuerySnapshot defines serializable snapshot for *all* transactions, even for ones with read committed isolevel: in the times of 6.5 I thought about ability to switch between isolevels inside single xaction - this is not required by standard and *bad* for system: just remember that vacuum doesn't clean up deleted tuples if there is some transaction *potentially* interested in them. For read committed xactions must be no serializable snapshot defined and MyProc->xmin must be updated when *each* top-level query begins. Vadim
> >> Seems to me this is very broken. Isn't a query snapshot needed for > >> any utility command that might do database accesses? > > > Not needed. We don't support multi-versioning for schema operations. > > No? Seems to me we're almost there. Look for instance at that DROP > USER bug I just fixed: it was failing because it wasn't careful to make > sure that during "DROP USER foo,bar", the loop iteration to delete user > bar would see the changes the first loop iteration had made. So even ^^^^^^^^^^^^^^^^^^^ Snapshot defines visibility of changes made by other transactions. Seems that you talk here about self-visibility, defined by CommandId. > though we use a lot of table-level locking rather than true MVCC > behavior for schema changes, ISTM that we still have to play by all the > rules when it comes to tuple visibility. In particular I suspect we > ought to be using standard query snapshot behavior... What would it buy for us? MVCC lies to user - it returns view of data as they were some time ago. What would we get by seeing old view of catalog? > > More of that, sometimes it would be better to read *dirty* data from > > system tables - so, no snapshot required. > > There may be a small number of places like that, but for generic utility > operations like CREATE/DROP USER, I don't see that this is a good idea. But your fix for DROP USER didn't change anything about snapshot used by scans so it's not good example. Vadim
"Vadim Mikheev" <vmikheev@sectorbase.com> writes: >> bar would see the changes the first loop iteration had made. So even > ^^^^^^^^^^^^^^^^^^^ > Snapshot defines visibility of changes made by other transactions. > Seems that you talk here about self-visibility, defined by CommandId. Sure. The example was just to point out that we do have tuple visibility rules, even in utility statements. >> though we use a lot of table-level locking rather than true MVCC >> behavior for schema changes, ISTM that we still have to play by all the >> rules when it comes to tuple visibility. In particular I suspect we >> ought to be using standard query snapshot behavior... > What would it buy for us? MVCC lies to user - it returns view of data > as they were some time ago. What would we get by seeing old > view of catalog? Consistency. For example: pg_dump wants a consistent view of the database, so it runs in a serializable transaction. To the extent that it uses utility statements rather than standard SELECTs to look at the state of the system catalogs, it will get the wrong answer if the utility statements believe that they can ignore the transaction isolation mode setting. I'm not sure that there are any utility statements that would be useful for pg_dump, but certainly there could be such a thing, no? regards, tom lane
> >> though we use a lot of table-level locking rather than true MVCC > >> behavior for schema changes, ISTM that we still have to play by all the > >> rules when it comes to tuple visibility. In particular I suspect we > >> ought to be using standard query snapshot behavior... > > > What would it buy for us? MVCC lies to user - it returns view of data > > as they were some time ago. What would we get by seeing old > > view of catalog? > > Consistency. For example: pg_dump wants a consistent view of the > database, so it runs in a serializable transaction. To the extent that > it uses utility statements rather than standard SELECTs to look at the > state of the system catalogs, it will get the wrong answer if the > utility statements believe that they can ignore the transaction > isolation mode setting. > > I'm not sure that there are any utility statements that would be useful > for pg_dump, but certainly there could be such a thing, no? I'm not sure are there any utilities just to look into catalog, seems all of them to change them. So, is this reason to change heap_beginscan parameters in dozens places? (Note! If you just need to call SetQuerySnapshot once per statement please do it - it doesn't matter for subject of this thread). Now examples why old catalog view is bad: what if serializable xaction S tries to use index I to read from user table, and meanwhile another xaction deleted this index. Any reason to abort S? But we'll do this. And what if index I over int column was changed to index with name "I" but over text column? Or function F now needs in 3 args instead of 2 args as it was when S started? Let me repeat - we don't support multi-versioning of metadata (schema). So, changing snapshot parameter of heap_beginscan used for internal ops we'll not make us more happy than we are not. Vadim