Re: [HACKERS] Cursor With_Hold Performance Workarounds/Optimization - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] Cursor With_Hold Performance Workarounds/Optimization
Date
Msg-id 31983.1508424810@sss.pgh.pa.us
Whole thread Raw
In response to Re: [HACKERS] Cursor With_Hold Performance Workarounds/Optimization  (Leon Winter <winter-pg@bfw-online.de>)
Responses Re: [HACKERS] Cursor With_Hold Performance Workarounds/Optimization
List pgsql-hackers
Leon Winter <winter-pg@bfw-online.de> writes:
> The loops are more complex in reality of course, more like:

> open cursor for select from table1
> loop
> { fetch some entries from cursor
>   call some external application
>   do some crazy complicated calculations based on some user input in the UI *
>   update table2
>   commit
> }

Hm, somehow it's pretty hard to credit that the materialized cursor
is the principal performance bottleneck in this configuration.

> The calculations inside the loop are written in some dynamic high-level
> language and cannot easily be translated into SQL.

You don't really have to --- PG supports functions written in non-SQL
languages.  Not sure if your problem is big enough to justify developing
a new PL interface for $random-4GL-language, but that might be something
to consider.

But, to get back to the original point: exactly what "sizeable performance
impact of declaring a cursor "with hold"" do you think there is?  It
shouldn't be noticeably more expensive than just selecting the rows would
be.  Especially not for a number of rows that wouldn't make the
above-depicted application structure completely untenable.  And for sure
I'm not understanding why you think that materializing the result on the
client side instead would be better.

> Naturally I am now wondering why the postgres cursor/portal is not also
> employing the same trick (at least as an option): Postpone
> materialization of "with hold" cursors until it is required (like a
> commit operation is dispatched).

We already do that, and have done it since the feature was invented,
AFAIR.

FWIW, the primary reason for materializing the cursor contents at commit,
rather than just holding onto the active query as you seem to think would
be better, is that materializing allows us to release locks on the
underlying table(s).  If we kept the active query we'd have to keep those
locks, as well as the query's active snapshot, thus certainly blocking
VACUUM cleanup, and possibly blocking subsequent DDL.

The approach of using a separate connection to read the cursor suffers
from exactly those same problems.  Postgres isn't that happy with very
long-lived transactions (neither is any other RDBMS I'm familiar with).
So really I think that materializing the cursor right away and then
doing your application calculations in whatever chunk size seems
convenient is probably your best bet here.
        regards, tom lane


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Geoff Winkless
Date:
Subject: Re: [HACKERS] Cursor With_Hold Performance Workarounds/Optimization
Next
From: Robert Haas
Date:
Subject: Re: [HACKERS] Supporting Windows SChannel as OpenSSL replacement