Thread: [PATCH] Add UPDATE WHERE OFFSET IN clause

[PATCH] Add UPDATE WHERE OFFSET IN clause

From
Böszörményi Zoltán
Date:
Hi!

A couple of years ago I was working on PostgreSQL
features and extensions. Some of them was in ECPG,
mainly in the area of improving Informix compatibility.

One of the patchsets I invented was the cursor readahead
support to improve performance in ECPG which used FETCH N.
This work was published at
https://github.com/zboszor/ecpg-readahead/commits

However, there was one use case for this patchset actually
decreased performance, namely if the cursor was used to
modify rows via UPDATE WHERE CURRENT OF. It was because
the cursor position on the server side was different from
the application so the ECPG readahead code had to use
MOVE then UPDATE WHERE CURRENT OF.

I brewed this idea for a while and yesterday I decided
to do something about it and here's the (admittedly, limited)
result. I added a new PostgreSQL extension to the UPDATE syntax:

UPDATE ... WHERE OFFSET n IN cursor;

This new syntax changes the feature disparity between
FETCH N and UPDATE WHERE CURRENT OF that existed in
PostgreSQL forever.

I only implemented this for cursors that use SELECT FOR UPDATE.
This part was quite straightforward to implement.

The behaviour of this syntax is as follows:
* the offset value may be 0 or negative, since it is a virtual
   index into the rows returned by the last FETCH statement
   and negative indexes are felt natural relative to the
   cursor position and the cursor direction
* for offset value 0, the behaviour is identical to WHERE CURRENT OF
* negative indexes allow UPDATEs on previous rows even if
   the cursor reached the end of the result set

I need clues for how to extend this for cursors that aren't
using FOR UPDATE queries, if it's possible at all.

Please, review.

Best regards,
Zoltán Böszörményi

Attachment

Re: [PATCH] Add UPDATE WHERE OFFSET IN clause

From
Chapman Flack
Date:
Hi!

On 02/07/22 00:59, Böszörményi Zoltán wrote:
> UPDATE ... WHERE OFFSET n IN cursor;

If added to UPDATE, should this be added to DELETE also?

Regards,
-Chap



Re: [PATCH] Add UPDATE WHERE OFFSET IN clause

From
Tom Lane
Date:
Chapman Flack <chap@anastigmatix.net> writes:
> On 02/07/22 00:59, Böszörményi Zoltán wrote:
>> UPDATE ... WHERE OFFSET n IN cursor;

> If added to UPDATE, should this be added to DELETE also?

FWIW, I think this is a really horrid hack.  For one thing, it's not
robust against not-strictly-linear FETCH/MOVE of the cursor.  It looks
to me like "OFFSET n" really means "the row that we read N reads ago",
not "the row N before the current cursor position".  I see that the
documentation does explain it that way, but it fails to account for
optimizations such as whether we implement moves by reading backwards
or rewind-and-read-forwards.  I don't think we want to expose that
sort of implementation detail.

I'm also pretty displeased with causing unbounded memory consumption for
every use of nodeLockRows, whether it has anything to do with a cursor or
not (never mind whether the cursor will ever be used for WHERE OFFSET IN).
Yeah, it's only a few bytes per row, but that will add up in queries that
process lots of rows.

            regards, tom lane



Re: [PATCH] Add UPDATE WHERE OFFSET IN clause

From
Böszörményi Zoltán
Date:
2022. 02. 08. 2:05 keltezéssel, Tom Lane írta:
> Chapman Flack <chap@anastigmatix.net> writes:
>> On 02/07/22 00:59, Böszörményi Zoltán wrote:
>>> UPDATE ... WHERE OFFSET n IN cursor;
> 
>> If added to UPDATE, should this be added to DELETE also?

Yes, it should be added, too.

> FWIW, I think this is a really horrid hack.

Thanks for your kind words. :-D

> For one thing, it's not
> robust against not-strictly-linear FETCH/MOVE of the cursor.  It looks
> to me like "OFFSET n" really means "the row that we read N reads ago",
> not "the row N before the current cursor position".  I see that the
> documentation does explain it that way, but it fails to account for
> optimizations such as whether we implement moves by reading backwards
> or rewind-and-read-forwards.  I don't think we want to expose that
> sort of implementation detail.
> 
> I'm also pretty displeased with causing unbounded memory consumption for
> every use of nodeLockRows, whether it has anything to do with a cursor or
> not (never mind whether the cursor will ever be used for WHERE OFFSET IN).
> Yeah, it's only a few bytes per row, but that will add up in queries that
> process lots of rows.

Does PostgreSQL have SQL hints now? I.e. some kind of "pragma"
parsed from SQL comments to indicate the subsequent usage pattern?
Such a hint would allow using either storing the single row information
for DELETE/UPDATE or the list.

Dumping the list to a disk file will be added later so memory
usage is not unbounded.

I was just testing the waters for the idea.

Best regards,
Zoltán Böszörményi


> 
>             regards, tom lane