Thread: [PATCH] Add UPDATE WHERE OFFSET IN clause
Hi! A couple of years ago I was working on PostgreSQL features and extensions. Some of them was in ECPG, mainly in the area of improving Informix compatibility. One of the patchsets I invented was the cursor readahead support to improve performance in ECPG which used FETCH N. This work was published at https://github.com/zboszor/ecpg-readahead/commits However, there was one use case for this patchset actually decreased performance, namely if the cursor was used to modify rows via UPDATE WHERE CURRENT OF. It was because the cursor position on the server side was different from the application so the ECPG readahead code had to use MOVE then UPDATE WHERE CURRENT OF. I brewed this idea for a while and yesterday I decided to do something about it and here's the (admittedly, limited) result. I added a new PostgreSQL extension to the UPDATE syntax: UPDATE ... WHERE OFFSET n IN cursor; This new syntax changes the feature disparity between FETCH N and UPDATE WHERE CURRENT OF that existed in PostgreSQL forever. I only implemented this for cursors that use SELECT FOR UPDATE. This part was quite straightforward to implement. The behaviour of this syntax is as follows: * the offset value may be 0 or negative, since it is a virtual index into the rows returned by the last FETCH statement and negative indexes are felt natural relative to the cursor position and the cursor direction * for offset value 0, the behaviour is identical to WHERE CURRENT OF * negative indexes allow UPDATEs on previous rows even if the cursor reached the end of the result set I need clues for how to extend this for cursors that aren't using FOR UPDATE queries, if it's possible at all. Please, review. Best regards, Zoltán Böszörményi
Attachment
Hi! On 02/07/22 00:59, Böszörményi Zoltán wrote: > UPDATE ... WHERE OFFSET n IN cursor; If added to UPDATE, should this be added to DELETE also? Regards, -Chap
Chapman Flack <chap@anastigmatix.net> writes: > On 02/07/22 00:59, Böszörményi Zoltán wrote: >> UPDATE ... WHERE OFFSET n IN cursor; > If added to UPDATE, should this be added to DELETE also? FWIW, I think this is a really horrid hack. For one thing, it's not robust against not-strictly-linear FETCH/MOVE of the cursor. It looks to me like "OFFSET n" really means "the row that we read N reads ago", not "the row N before the current cursor position". I see that the documentation does explain it that way, but it fails to account for optimizations such as whether we implement moves by reading backwards or rewind-and-read-forwards. I don't think we want to expose that sort of implementation detail. I'm also pretty displeased with causing unbounded memory consumption for every use of nodeLockRows, whether it has anything to do with a cursor or not (never mind whether the cursor will ever be used for WHERE OFFSET IN). Yeah, it's only a few bytes per row, but that will add up in queries that process lots of rows. regards, tom lane
2022. 02. 08. 2:05 keltezéssel, Tom Lane írta: > Chapman Flack <chap@anastigmatix.net> writes: >> On 02/07/22 00:59, Böszörményi Zoltán wrote: >>> UPDATE ... WHERE OFFSET n IN cursor; > >> If added to UPDATE, should this be added to DELETE also? Yes, it should be added, too. > FWIW, I think this is a really horrid hack. Thanks for your kind words. :-D > For one thing, it's not > robust against not-strictly-linear FETCH/MOVE of the cursor. It looks > to me like "OFFSET n" really means "the row that we read N reads ago", > not "the row N before the current cursor position". I see that the > documentation does explain it that way, but it fails to account for > optimizations such as whether we implement moves by reading backwards > or rewind-and-read-forwards. I don't think we want to expose that > sort of implementation detail. > > I'm also pretty displeased with causing unbounded memory consumption for > every use of nodeLockRows, whether it has anything to do with a cursor or > not (never mind whether the cursor will ever be used for WHERE OFFSET IN). > Yeah, it's only a few bytes per row, but that will add up in queries that > process lots of rows. Does PostgreSQL have SQL hints now? I.e. some kind of "pragma" parsed from SQL comments to indicate the subsequent usage pattern? Such a hint would allow using either storing the single row information for DELETE/UPDATE or the list. Dumping the list to a disk file will be added later so memory usage is not unbounded. I was just testing the waters for the idea. Best regards, Zoltán Böszörményi > > regards, tom lane