Thread: Cursors and backwards scans and SCROLL
Postgres' implementation of cursors has always had a problem with doing MOVE or FETCH backwards on complex queries. It works okay for simple seqscans and indexscans, but fails for plans involving joins, aggregates, and probably other cases. This happens because the executor routines for those plan types don't cope with being asked to scan backwards. Fixing this directly seems unreasonably difficult, so I'm currently working on fixing it by inserting a Materialize plan node at the top of the plan tree for a cursor, if the plan tree couldn't otherwise support backwards scan. The Materialize node will save aside a copy of each row as it's fetched from the underlying plan, and use this copied table if any backwards scanning is asked for. Of course, copying the data is a waste of cycles if the client never actually asks to scan backwards --- and it could even lead to failures, i.e. running out of disk space. So I think there needs to be a way to control whether Materialize gets added or not. The SQL spec has a provision for this: according to the spec, the client is supposed to sayDECLARE foo SCROLL CURSOR FOR ... if he intends to do anything except sequential fetches from the cursor. Postgres presently allows the SCROLL keyword but ignores it. I'd like to set things up so that Materialize is added only when SCROLL appears (and the query plan can't handle backwards scan without it). However, this is going to create backwards-compatibility issues. We have a few options for what to do: 1. Enforce the SQL spec requirement: error out if backwards fetch is done when SCROLL wasn't given. But this will surely break a lot of existing applications that work perfectly well. 2. Error out only if a backwards fetch is actually attempted on a plan tree that can't handle it (which could only happen if SCROLL wasn't given). This is efficient and flexible, but it exposes implementation details to the user, in that whether an error occurs will depend on which plan the system happens to choose. There are cases where identical cursor definitions might allow or not allow backwards fetch depending on the planner's choices. Notice though that errors could occur only in cases that would silently fail in the present code; so existing applications that work reliably would not see such errors. 3. Create a runtime parameter (GUC variable) which when set causes us to assume SCROLL is present even if it's not stated. Setting this to TRUE would allow existing applications to work without modification; when it's FALSE, we'd enforce the spec behavior. The trouble with this is the TRUE setting would likely cause materialization costs to be paid in very many situations where the client has no intention of fetching backwards. I'm presently leaning to #2, even though it exposes implementation details. I'm open to discussion though. Any preferences? Other ideas? regards, tom lane
Tom, > Postgres' implementation of cursors has always had a problem with doing > MOVE or FETCH backwards on complex queries. Coincidnetally enough, I was just chatting with one of my contractors yesterday about how the one thing that Transact-SQL has to offer is a really good cursor implementation. It would be lovely to improve ours to match. > Fixing this directly seems unreasonably difficult, so I'm currently > working on fixing it by inserting a Materialize plan node at the top of > the plan tree for a cursor, if the plan tree couldn't otherwise support > backwards scan. The Materialize node will save aside a copy of each row > as it's fetched from the underlying plan, and use this copied table if > any backwards scanning is asked for. Sounds good to me. It's also very similar to what T-SQL does for a STATIC or KEYSET cursor, and works very well in their implementation. (FWIW, T-SQL's cursor types, such as DYNAMIC and KEYSET, are unnecessary for Postgres due to MVCC) > 2. Error out only if a backwards fetch is actually attempted on a plan > tree that can't handle it (which could only happen if SCROLL wasn't <snip> > I'm presently leaning to #2, even though it exposes implementation > details. I'm open to discussion though. Any preferences? Other ideas? This sounds like a good idea to me in a staggered-implementation sense if it's doable. That is, we'd implement the behavior in #2 in the next version of Postgresql, and the behavior in #1 or in #3 in the version after that. If, however, the implementation of #2 is too difficult, then I think #3 would be a good choice. From my perspective, the "SCROLL" declaration has *always* been the SQL-spec, and it is the behaviour used by other databases, even if it's been superflous in PostgreSQL until now. So from that point of view, developers who have been not using "SCROLL" have been sloppy and can reasonably expect to have to audit their code in future versions of PostgreSQL. On the other hand, I don't use cursors much in Postgres, so I'm kind of a priest doing marriage counselling as far as that's concerned. PL/pgSQL's "FOR record IN query" is currently both easier and faster than cursors so I use that 90% of the time. -- Josh Berkus Aglio Database Solutions San Francisco
> I'm presently leaning to #2, even though it exposes implementation > details. I'm open to discussion though. Any preferences? Other ideas? How about a variable that turns on or off spec enforcement (case #1 or #2). On for 7.4, off for 7.5 the next release, and make it disappear after that. Enforcing spec seems like the least confusing mode to operate under, especially given it could break simply by changing the plan -- which happens automagically (seemingly random). -- Rod Taylor <rbt@rbt.ca> PGP Key: http://www.rbt.ca/rbtpub.asc
Rod Taylor <rbt@rbt.ca> writes: > Enforcing spec seems like the least confusing mode to operate under, > especially given it could break simply by changing the plan -- which > happens automagically (seemingly random). Keep in mind though that complaints about the current bugs have been fairly infrequent, which means that most people either don't try to fetch backwards from a cursor, or don't try to do so on complex plans. I'm hesitant to penalize a larger group for the benefit of a smaller one --- which is why enforcing the spec strictly doesn't really appeal to me, even though one could argue that the larger group had it coming. I'd prefer to be permissive about the cases that we can support at no cost, which not-by-coincidence are the cases that have worked correctly up to now. regards, tom lane
On Sun, 2003-03-09 at 16:04, Rod Taylor wrote: > How about a variable that turns on or off spec enforcement (case #1 or > #2). On for 7.4, off for 7.5 the next release, and make it disappear > after that. Yeah, that sounds like the best solution to me. IMHO there is a case to be made for skipping straight to defaulting to 'on' in 7.4, and removing the GUC var in 7.5 Cheers, Neil -- Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
> 3. Create a runtime parameter (GUC variable) which when set causes us > to assume SCROLL is present even if it's not stated. Setting this > to TRUE would allow existing applications to work without modification; > when it's FALSE, we'd enforce the spec behavior. The trouble with this > is the TRUE setting would likely cause materialization costs to be paid > in very many situations where the client has no intention of fetching > backwards. I'd be in favour of creating whole sets of backwards-compatibility GUC's whenever we break backwards compatibility. eg. use_72_compat = yes use_73_compat = yes or something. And then we can define all the things that those variables will affect. That would mean that people can upgrade to new db engine (say 7.3) without worrying about all the failures in their applications ( eg ''::int)... Chris
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: > I'd be in favour of creating whole sets of backwards-compatibility GUC's > whenever we break backwards compatibility. > > eg. > use_72_compat = yes > use_73_compat = yes That sounds like a recipe for a maintenance nightmare to me. -Doug
> > I'd be in favour of creating whole sets of backwards-compatibility GUC's > > whenever we break backwards compatibility. > > > > eg. > > use_72_compat = yes > > use_73_compat = yes > > That sounds like a recipe for a maintenance nightmare to me. We only have to keep them for one major version, say. eg. 7.3 release would remove the 71_compat option... Chris
Tom Lane kirjutas P, 09.03.2003 kell 22:35: > However, this is going to create backwards-compatibility issues. > We have a few options for what to do: > > 1. Enforce the SQL spec requirement: error out if backwards fetch is > done when SCROLL wasn't given. But this will surely break a lot > of existing applications that work perfectly well. We could start by issuing a NOTICE/WARNING as a step towards the strict compliance and provide a GUC variable for those who are unable to cope even with the warning. The warning could be issued at two places - 1. ( more annoying ) issue it at cursor creation time when a plan is chosen that can't be fetched backwards. 2. like your #2, but just give a warning and then run the underlying query _again_, this toime with materialize on top and also do a Move to reposition the cursor. This will probably not work correctly for all tranasaction isolation levels though but it will penalize only these cases that absolutely need it. The penalty will of course be heavier ;( ----------------- Hannu
Tom Lane wrote: > > However, this is going to create backwards-compatibility issues. > We have a few options for what to do: > > 1. Enforce the SQL spec requirement: error out if backwards fetch is > done when SCROLL wasn't given. But this will surely break a lot > of existing applications that work perfectly well. > > 2. Error out only if a backwards fetch is actually attempted on a plan > tree that can't handle it (which could only happen if SCROLL wasn't > given). This is efficient and flexible, but it exposes implementation > details to the user, in that whether an error occurs will depend on > which plan the system happens to choose. There are cases where > identical cursor definitions might allow or not allow backwards fetch > depending on the planner's choices. Notice though that errors could > occur only in cases that would silently fail in the present code; so > existing applications that work reliably would not see such errors. > > 3. Create a runtime parameter (GUC variable) which when set causes us > to assume SCROLL is present even if it's not stated. Setting this > to TRUE would allow existing applications to work without modification; > when it's FALSE, we'd enforce the spec behavior. The trouble with this > is the TRUE setting would likely cause materialization costs to be paid > in very many situations where the client has no intention of fetching > backwards. The change seem worth neither #1 nor #3. regards, Hiroshi Inouehttp://www.geocities.jp/inocchichichi/psqlodbc/
>> 2. Error out only if a backwards fetch is actually attempted on a plan >> tree that can't handle it (which could only happen if SCROLL wasn't >> given). This is efficient and flexible, but it exposes implementation >> details to the user, in that whether an error occurs will depend on >> which plan the system happens to choose. There are cases where >> identical cursor definitions might allow or not allow backwards fetch >> depending on the planner's choices. Notice though that errors could >> occur only in cases that would silently fail in the present code; so >> existing applications that work reliably would not see such errors. > 2. like your #2, I vote #2 also. > but just give a warning and then run the underlying > query _again_, this toime with materialize on top and also do a Move to > reposition the cursor. This will probably not work correctly for all > tranasaction isolation levels though but it will penalize only these > cases that absolutely need it. The penalty will of course be > heavier ;( rescan can only work in serializable isolation, no? In committed read isolation you will see different (just comitted) rows from a rescan. Andreas
"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes: >> but just give a warning and then run the underlying >> query _again_, this toime with materialize on top and also do a Move to >> reposition the cursor. This will probably not work correctly for all >> tranasaction isolation levels though but it will penalize only these >> cases that absolutely need it. The penalty will of course be >> heavier ;( > rescan can only work in serializable isolation, no? I had thought about this and concluded it was not worth the trouble. It could be made to work if we copy the snapshot data from old plan to new, but realistically there's no value in it. Existing applications that are successfully using backwards-fetch are using queries that don't need materialization; therefore there's no compatibility gain from adding this extra code. regards, tom lane
On Sun, Mar 09, 2003 at 03:35:11PM -0500, Tom Lane wrote: > > 2. Error out only if a backwards fetch is actually attempted on a plan > tree that can't handle it (which could only happen if SCROLL wasn't > given). This is efficient and flexible, but it exposes implementation > details to the user, in that whether an error occurs will depend on > which plan the system happens to choose. I wouldn't worry too much about exposing implementation details at present--the existing situation does the same, except it pretends that not returning data that should be there isn't an error. Absent any further documentation, I'd call that a bug that needs to be fixed. > There are cases where > identical cursor definitions might allow or not allow backwards fetch > depending on the planner's choices. Notice though that errors could > occur only in cases that would silently fail in the present code; so > existing applications that work reliably would not see such errors. Would it be possible to give warnings in a narrow superset of the problematic cases, something along the lines of "I'm scrolling backwards for you now, but there's no reason why that should work on this same query tomorrow"? Jeroen
"Jeroen T. Vermeulen" <jtv@xs4all.nl> writes: >> There are cases where >> identical cursor definitions might allow or not allow backwards fetch >> depending on the planner's choices. > Would it be possible to give warnings in a narrow superset of the > problematic cases, something along the lines of "I'm scrolling backwards > for you now, but there's no reason why that should work on this same query > tomorrow"? I don't see a practical way to do that --- that little bit of warning code would have to embed a lot of fragile assumptions about the set of alternatives searched by the planner. It would probably break every time we improved the planner. And the breakage would consist either of failing to give a warning when one is appropriate, or giving a warning when no other plan is really likely to be chosen; neither of which are going to be easily noticed or tested for. Seems like a losing game :-( regards, tom lane
Just a reminder that we could use cursors that exist after transaction commit (WITH HOLD) and updatable cursors (WHERE CURRENT OF cursorname). :-) --------------------------------------------------------------------------- Tom Lane wrote: > "Jeroen T. Vermeulen" <jtv@xs4all.nl> writes: > >> There are cases where > >> identical cursor definitions might allow or not allow backwards fetch > >> depending on the planner's choices. > > > Would it be possible to give warnings in a narrow superset of the > > problematic cases, something along the lines of "I'm scrolling backwards > > for you now, but there's no reason why that should work on this same query > > tomorrow"? > > I don't see a practical way to do that --- that little bit of warning > code would have to embed a lot of fragile assumptions about the set of > alternatives searched by the planner. It would probably break every > time we improved the planner. And the breakage would consist either of > failing to give a warning when one is appropriate, or giving a warning > when no other plan is really likely to be chosen; neither of which are > going to be easily noticed or tested for. Seems like a losing game :-( > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073