Home > mailing lists

Re: Speed dblink using alternate libpq tuple storage - Mailing list pgsql-hackers

From	Marko Kreen
Subject	Re: Speed dblink using alternate libpq tuple storage
Date	April 3, 2012 18:48:04
Msg-id	20120403184740.GA6737@gmail.com Whole thread Raw
In response to	Re: Speed dblink using alternate libpq tuple storage (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Speed dblink using alternate libpq tuple storage (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

Tree view

On Sun, Apr 01, 2012 at 07:23:06PM -0400, Tom Lane wrote:
> Marko Kreen <markokr@gmail.com> writes:
> > So the proper approach would be to have new API call, designed to
> > handle it, and allow early-exit only from there.
> 
> > That would also avoid any breakage of old APIs.  Also it would avoid
> > any accidental data loss, if the user code does not have exactly
> > right sequence of calls.
> 
> > How about PQisBusy2(), which returns '2' when early-exit is requested?
> > Please suggest something better...
> 
> My proposal is way better than that.  You apparently aren't absorbing my
> point, which is that making this behavior unusable with every existing
> API (whether intentionally or by oversight) isn't an improvement.
> The row processor needs to be able to do this *without* assuming a
> particular usage style,

I don't get what kind of usage scenario you think of when you 
"early-exit without assuming anything about upper-level usage."

Could you show example code where it is useful?

The fact remains that upper-level code must cooperate with callback.
Why is it useful to hijack PQgetResult() to do so?  Especially as the
PGresult it returns is not useful in any way and the callback still needs
to use side channel to pass actual values to upper level.

IMHO it's much better to remove the concept of early-exit from public
API completely and instead give "get" style API that does the early-exit
internally.  See below for example.

> and most particularly it should not force people
> to use async mode.

Seems our concept of "async mode" is different.  I associate
PQisnonblocking() with it.  And old code, eg. PQrecvRow()
works fine in both modes.

> An alternative that I'd prefer to that one is to get rid of the
> suspension return mode altogether.

That might be good idea.  I would prefer to postpone controversial
features instead having hurried design for them.

Also - is there really need to make callback API ready for *all*
potential usage scenarios?  IMHO it would be better to make sure
it's good enough that higher-level and easier to use APIs can
be built on top of it.  And thats it.

> However, that leaves us needing
> to document what it means to longjmp out of a row processor without
> having any comparable API concept, so I don't really find it better.

Why do you think any new API concept is needed?  Why is following rule
not enough (for sync connection):
 "After exception you need to call PQgetResult() / PQfinish()".

Only thing that needs fixing is storing lastResult under PGconn
to support exceptions for PQexec().  (If we need to support
exceptions everywhere.)

---------------

Again, note that if we would provide PQrecvRow()-style API, then we *don't*
need early-exit in callback API.  Nor exceptions...

If current PQrecvRow() / PQgetRow() are too bloated for you, then how about
thin wrapper around PQisBusy():

/* 0 - need more data, 1 - have result, 2 - have row */
int PQhasRowOrResult(PGconn *conn, PGresult **hdr, PGrowValue **cols)
{   int gotrow = 0;   PQrowProcessor oldproc;   void *oldarg;   int busy;
   /* call PQisBusy with our callback */   oldproc = PQgetRowProcessor(conn, &oldarg);   PQsetRowProcessor(conn,
hasrow_cb,&flag);   busy = PQisBusy(conn);   PQsetRowProcessor(conn, oldproc, oldarg);

   if (gotrow)   {       *hdr = conn->result;       *cols = conn->rowBuf;       return 2;   }   return busy ? 0 : 1;
}

static int hasrow_cb(PGresult *res, PGrowValue *columns, void *param)
{   int *gotrow = param;   *gotrow = 1;   return 0;
}

Also, instead hasrow_cb(), we could have integer flag under PGconn that
getAnotherTuple() checks, thus getting rid if it even in internal
callback API.

Yes, it requires async-style usage pattern, but works for both
sync and async connections.  And it can be used to build even
simpler API on top of it.

Summary: it would be better to keep "early-exit" internal detail,
as the actual feature needed is processing rows outside of callback.
So why not provide API for it?

-- 
marko

pgsql-hackers by date:

From: Peter Geoghegan
Date: 03 April 2012, 18:23:27
Subject: Re: invalid search_path complaints

From: Robert Haas
Date: 03 April 2012, 18:58:44
Subject: Re: invalid search_path complaints

Re: Speed dblink using alternate libpq tuple storage - Mailing list pgsql-hackers

Previous

Next