Re: [HACKERS] Cursor With_Hold Performance Workarounds/Optimization - Mailing list pgsql-hackers

From Leon Winter
Subject Re: [HACKERS] Cursor With_Hold Performance Workarounds/Optimization
Date
Msg-id 20171020075738.v2syffhsp2di6khp@bfw-online.de
Whole thread Raw
In response to Re: [HACKERS] Cursor With_Hold Performance Workarounds/Optimization  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] Cursor With_Hold Performance Workarounds/Optimization
List pgsql-hackers
> > The calculations inside the loop are written in some dynamic high-level
> > language and cannot easily be translated into SQL.
> 
> You don't really have to --- PG supports functions written in non-SQL
> languages.  Not sure if your problem is big enough to justify developing
> a new PL interface for $random-4GL-language, but that might be something
> to consider.

Surely it would be a possibilty to loop in SQL and call the other language for
each row, but I am not sure Postgres would be too happy if the callee waits for
user input and/or wants to do some SQL update operations.

> But, to get back to the original point: exactly what "sizeable performance
> impact of declaring a cursor "with hold"" do you think there is?  It
> shouldn't be noticeably more expensive than just selecting the rows would
> be.  Especially not for a number of rows that wouldn't make the
> above-depicted application structure completely untenable.  And for sure
> I'm not understanding why you think that materializing the result on the
> client side instead would be better.

For small tables materialization is a non-issue but we have large table where a
single select statement over all the rows causes Postgres to create 8GB temp
files being busy multiple seconds which is very noticeable.

> > Naturally I am now wondering why the postgres cursor/portal is not also
> > employing the same trick (at least as an option): Postpone
> > materialization of "with hold" cursors until it is required (like a
> > commit operation is dispatched).
> 
> We already do that, and have done it since the feature was invented,
> AFAIR.

When we declare a cursor for a select on the mentioned big table, it takes
multiple seconds and a big temp file is created which to me seems like the
materialization took place immediately.

> FWIW, the primary reason for materializing the cursor contents at commit,
> rather than just holding onto the active query as you seem to think would
> be better, is that materializing allows us to release locks on the
> underlying table(s).  If we kept the active query we'd have to keep those
> locks, as well as the query's active snapshot, thus certainly blocking
> VACUUM cleanup, and possibly blocking subsequent DDL.

Of course, providing such a long-lived cursor would be more costly in terms of
the resources you describe, but currently the cost of instant-materialization at
the opening operation of the cursor is more expensive.

> The approach of using a separate connection to read the cursor suffers
> from exactly those same problems.  Postgres isn't that happy with very
> long-lived transactions (neither is any other RDBMS I'm familiar with).
> So really I think that materializing the cursor right away and then
> doing your application calculations in whatever chunk size seems
> convenient is probably your best bet here.

In fact we are fetching cursor results in bigger chunks to avoid communication
overhead and many cursors reach end of scan immediately but there are cursors on
big tables and fetching all the records of them immediately is very expensive.
Meanwhile I have implemented a "lazy hold" in my database abstraction layer
which pulls the records in at commit which is a okayish trade-off. Better yet,
but with the disadvantages you outlined, would be a "with hold" cursor that
would hold onto the locks and the snapshot which would avoid materialization
entirely (which can be emulated with second database connection).


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Leon Winter
Date:
Subject: Re: [HACKERS] Cursor With_Hold Performance Workarounds/Optimization
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Re: [HACKERS] asynchronous execution