Thread: rfc - libpq extensions

rfc - libpq extensions

From
Iker Arizmendi
Date:
Currently libpq doesn't allow you to call PQsendQuery multiple
times in succession over an asynchronous connection. If libpq could
accept concurrent queries (with 1 or more SQL statements each) then
code functionally similar to the following would come in handy (at
least to me :)

// hypothetical PGquery object maintains the state
// for a given query and allows for selective query
// cancellation
PGquery* pQuery1 = PQcreateQuery("SELECT...");
PGquery* pQuery2 = PQcreateQuery("UPDATE...");
PGquery* pQuery3 = PQcreateQuery("INSERT...");

PQconnExecute(pConn, pQuery1);
PQconnExecute(pConn, pQuery2);
PQconnExecute(pConn, pQuery3);

// event loop
while(1)
{
    poll(...);

    PQconsumeInput(pConn);

    // hypothetical PQqueryState
    if (PQqueryState(pQuery1) == PQ_QUERY_COMPLETE) {
        // process results
    }
    if (PQqueryState(pQuery2) == PQ_QUERY_COMPLETE) {
        // process results
    }
    if (PQqueryState(pQuery3) == PQ_QUERY_COMPLETE) {
        // process results
    }

}

As things stand now you have to wait for each query to finish before
issuing a new one. This poses some difficulties if you want to use
connection pooling in an event driven server (as I'm trying to do). In
particular, you have to perform your own queueing of queries while you
wait for a connection to become available. To deal with this issue I've
started work on some extensions to libpq to allow for both
multiple "in-flight" queries and support for connection pooling while
still providing support for the usual access techniques. I was hoping to
get some thoughts from folks on the list with regard to interest
in these features (and/or potential pitfalls).

Regards,
Iker



Re: rfc - libpq extensions

From
Neil Conway
Date:
On Fri, 2003-01-10 at 12:10, Iker Arizmendi wrote:
> As things stand now you have to wait for each query to finish before
> issuing a new one.

How exactly are you planning to fix this?

Cheers,

Neil
--
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC




Re: rfc - libpq extensions

From
Csaba Nagy
Date:
AFAIK, using the same connection to run several queries at the same time is
not supported by the backend.
If you use transactions, it only makes your life harder: you have to make
sure the transaction stays on the same connection and that nobody else uses
that connection in the meantime and interferes with your ongoing
transaction... unless the postgres developers were willing to implement
connection-independent transactions (by means of a transaction ID passed
along with the query? This is DB science fiction - it might be good, but it's
neither standard nor really needed).

I assume your problems with connection pooling are the relatively long setup
time of a connection and the possibility of running out of connections...
What you can do is use a smart connection pool that always keeps a
few spare connections and, if the number of spare connections drops below the
minimum allowed, opens new ones (before the application asks for them). This
way your application will always have a connection at hand immediately.
From time to time, idle connections can be closed to avoid keeping the number
of connections too high after peaks of load, while still keeping the
configured minimum of idle connections.

AFAIK this is how Apache HTTP connection pooling works, and it works
well.
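
A minimal sketch of the sizing policy described above, in C. It only does the bookkeeping (how many connections to open or close) and touches no real sockets. The names pool_stats, pool_to_open and pool_to_close, and the max_total ceiling, are assumptions made for illustration and are not part of libpq or any existing pool.

```c
#include <assert.h>

/* Hypothetical pool bookkeeping: counters only, no real connections. */
typedef struct {
    int in_use;     /* connections currently handed out to the application */
    int idle;       /* open connections waiting to be handed out */
    int min_idle;   /* number of spare connections to keep ready */
    int max_total;  /* assumed hard ceiling on open connections */
} pool_stats;

/* Number of connections to open now so that min_idle spares exist,
 * without exceeding max_total. */
int pool_to_open(const pool_stats *p)
{
    int want, room;

    assert(p->in_use >= 0 && p->idle >= 0);
    want = p->min_idle - p->idle;
    room = p->max_total - (p->in_use + p->idle);
    if (want <= 0 || room <= 0)
        return 0;
    return want < room ? want : room;
}

/* Number of idle connections to close during periodic trimming,
 * keeping the configured minimum of idle connections. */
int pool_to_close(const pool_stats *p)
{
    int excess;

    assert(p->idle >= 0);
    excess = p->idle - p->min_idle;
    return excess > 0 ? excess : 0;
}
```

A pool would call pool_to_open() whenever a connection is handed out and pool_to_close() from a periodic timer, so the application rarely has to wait for connection setup.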

HTH,
Csaba.


Re: rfc - libpq extensions

From
Iker Arizmendi
Date:
First, consider a motivating example. A single-threaded event-driven
application server serves two concurrent clients, client-1 and client-2,
using only a single "pooled" connection.

client-1 and client-2 connect to the application server.

client-1 issues an app specific request to the app server. The app
server, which is sitting in an event loop, services client-1 by
acquiring the single available connection and issuing an asynchronous
query. The server then registers the database connection's FD for
I/O events and returns to the event loop.

client-2 issues a request. The app server wakes from the event loop and
tries to service client-2. At this point, the server cannot acquire the
database connection because it's in use (and it can't issue the query
because libpq doesn't support queued queries). Simply creating a new
database connection doesn't solve the problem since we may wish to
support a very large number of concurrent clients (which is the reason
we're using an event-driven server and connection pooling).

To get around this I thought the following extension API (implemented
using the libpq API) would help. Here's a highly simplified pseudo-code
example with error handling omitted:

==========

// Create a cache with a maximum of 1 "hard" connection
PGXconnCache* pCache = PQXcreateConnCache(1);

// Create two "soft" (or virtual) connections and associate them
// with the connection cache from which they will draw "hard"
// connections when necessary.
//

PGXconn* pConn1 = PQXconnect(pCache, connString);
PGXconn* pConn2 = PQXconnect(pCache, connString);

// Create 2 queries. Issue one on each connection.
//
// Upon calling PQXconnExecute, pConn1 goes to the "hard"
// connection cache and obtains the single available
// connection and asynchronously issues the query on it.
// The descriptor returned by PQXconnFd and registered
// with our event handling method is the descriptor
// of the underlying "hard" connection's socket.

PGXquery* pQuery1 = PQXcreateQuery(queryString1);
PQXconnExecute(pConn1, pQuery1);
register_event(PQXconnFd(pConn1), EV_READ, event_handler, pConn1);

// At this point, the single connection is in use. When
// pConn2 goes to the cache it will find it empty. So
// pConn2 queues its query and registers itself with
// the cache for a notification as soon as a "hard"
// connection becomes available.
//
// In this case, the descriptor returned by PQXconnFd is
// the descriptor for the read end of a pipe. The write
// end is what pConn2 registered with the cache. As soon
// as pConn1 returns its "hard" conn to the cache, the
// cache will write a single byte to pConn2's notification
// pipe. From an application developer's perspective, however,
// this detail is transparent.

PGXquery* pQuery2 = PQXcreateQuery(queryString2);
PQXconnExecute(pConn2, pQuery2);
register_event(PQXconnFd(pConn2), EV_READ, event_handler, pConn2);

while(1)
{
    event_dispatch();
    if (PQXgetQueryState(pQuery1) == PGX_QUERY_COMPLETE)
    {
        // process results
    }
    if (PQXgetQueryState(pQuery2) == PGX_QUERY_COMPLETE)
    {
        // process results
    }

}

==========

void
event_handler(int fd, void* arg)
{
    PGXconn* pConn = (PGXconn*)(arg);
    // For pConn1, the call to PQXconnEvent will result in
    // processing of events coming over the database socket.
    // Once pConn1 is done processing its current query, it
    // will return its "hard" connection to the cache. The
    // cache will then notify the waiting pConn2 pipe (by
    // writing to pConn2's pipe).
    // When the thread of control returns to the event loop
    // it will pick that event up and we'll end up back here
    // but with an event for pConn2.
    // pConn2 can then go to the cache, get the newly
    // available "hard" connection and asynchronously issue
    // its query.
    PQXconnEvent(pConn);
}
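
The pipe-based wakeup described in the comments above can be sketched with plain POSIX calls. This is only an illustration of the notification mechanism, not the actual extension code: the waiter type and the waiter_* helpers are hypothetical names, and a real implementation would also need non-blocking descriptors and error reporting.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical waiter: a soft connection parked until a hard
 * connection becomes available. */
typedef struct {
    int read_fd;   /* registered with the event loop */
    int write_fd;  /* kept by the cache for notification */
} waiter;

/* Create the notification pipe. Returns 0 on success, -1 on error. */
int waiter_init(waiter *w)
{
    int fds[2];

    assert(w != NULL);
    if (pipe(fds) != 0)
        return -1;
    w->read_fd = fds[0];
    w->write_fd = fds[1];
    return 0;
}

/* Cache side: wake the waiter by writing a single byte. */
int waiter_notify(const waiter *w)
{
    char b = 1;

    return write(w->write_fd, &b, 1) == 1 ? 0 : -1;
}

/* Event-loop side: non-blocking check.
 * Returns 1 if a notification is pending, 0 if not, -1 on error. */
int waiter_ready(const waiter *w)
{
    struct pollfd pfd;
    int n;

    pfd.fd = w->read_fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    n = poll(&pfd, 1, 0);
    if (n < 0)
        return -1;
    return (n == 1 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Consume the notification byte so the descriptor stops polling readable. */
int waiter_drain(const waiter *w)
{
    char b;

    return read(w->read_fd, &b, 1) == 1 ? 0 : -1;
}

/* Close both ends of the pipe. */
void waiter_close(waiter *w)
{
    close(w->read_fd);
    close(w->write_fd);
}
```

Because the read end is an ordinary descriptor, the application can treat it exactly like a database socket in poll() or select(), which is what keeps the detail transparent.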

==========

// Here's a preliminary list of methods that I've started
// to implement. The descriptions/semantics are preliminary, too.

/**
 * Create hard connection cache with a maximum of maxCn
 * hard connections for any given connection string. Returns
 * a valid pointer on success, 0 otherwise.
 *
 */
extern PGXconnCache* PQXcreateConnCache(unsigned int maxCn);

/**
 * Make hard connection allocations synchronous. Subsequent
 * operations on this cache will be thread safe (TBD ?)
 *
 */
extern void PQXsetConnCacheSync(PGXconnCache* pCC);

/**
 * Release hard connection cache pCC. Any soft connections
 * are closed and their respective pending queries are cancelled.
 *
 */
extern void PQXfreeConnCache(PGXconnCache* pCC);

/**
 * Create a soft connection associated with the hard
 * connection cache pCC. The format of the connection
 * string is as described in PQconnectdb().
 *
 */
extern PGXconn* PQXconnect(PGXconnCache* pCC, const char* conninfo);

/**
 * Returns the event descriptor for the given soft connection
 * suitable for use with poll() or select().
 *
 */
extern int PQXconnFd(const PGXconn* pCn);

/**
 * Use soft connection pCn to execute the query pQ. If the
 * soft connection is associated with an actual connection
 * then the query goes out immediately, otherwise it gets
 * queued for later execution. Returns 1 if query was issued,
 * 0 if it was queued, and -1 on error.
 *
 */
extern int PQXconnExecute(PGXconn* pCn, PGXquery* pQ);

/**
 * Call this method whenever there's activity on
 * the descriptor for this soft connection.
 *
 */
extern int PQXconnEvent(PGXconn* pCn);

/**
 * Closes the soft connection pCn. Any held hard connections
 * are returned to the cache from which the soft connection
 * sprang.
 *
 */
extern void PQXconnClose(PGXconn* pCn);

/**
 * Releases all resources associated with pCn. If the connection
 * is open, it is closed before being freed.
 *
 */
extern void PQXconnFree(PGXconn* pCn);

/**
 * Returns the soft connection state. If you're asynchronously
 * connecting you should register pCn's event descriptor for
 * writing when this method returns PGX_CONN_CONNECTING. Note
 * that you should always check this method after calling
 * PQXconnect since connections can sometimes happen synchronously
 * when the TCP peer is local (this will save you from having
 * to register the FD for write events).
 */
extern PGXconnStateEnum PQXconnState(PGXconn* pCn);

/**
 * Returns the error message associated with a soft connection,
 * if any.
 *
 */
extern const char* PQXconnError(PGXconn* pCn);

/**
 * Create a SQL query from the given text. The
 * text is NOT checked for syntax locally but on the
 * backend server.
 *
 */
extern PGXquery* PQXcreateQuery(const char* pText);

/**
 * Releases the given query. If the query is pending
 * execution it is cancelled. If it is currently executing
 * a cancel request is issued (and the response is
 * ignored).
 *
 */
extern void PQXfreeQuery(PGXquery* pQ);

/**
 * Cancels the query - if the query is pending execution
 * then it's cancelled immediately. If it's been shipped
 * off then the query state becomes CANCEL_PENDING.
 *
 * Returns 1 on success, 0 on failure.
 */
extern int PQXcancelQuery(PGXquery* pQ);

/**
 * Get a query's result object (if any)
 *
 */
extern PGresult* PQXgetQueryResult(PGXquery* pQ);

/**
 * Returns the state of the given query.
 *
 */
extern PGXqueryStateEnum PQXgetQueryState(PGXquery* pQ);

/**
 * Sets the text for query pQ provided pQ is not currently
 * executing. The query buffer for pQ is reused or expanded
 * to accommodate pText as necessary. Returns 1 on success,
 * 0 otherwise.
 *
 */
extern int PQXsetQueryText(PGXquery* pQ, const char* pText);
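
To make the issue-or-queue semantics of PQXconnExecute concrete, here is a self-contained simulation in C. It is a sketch under stated assumptions, not the extension itself: conn_cache, soft_conn, soft_execute and soft_complete are hypothetical stand-ins, no libpq calls are made, each soft connection holds at most one query, and the cache hands a freed hard connection straight to the oldest waiter instead of going through a notification pipe.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WAITERS 8

typedef struct soft_conn soft_conn;

/* Hypothetical cache: a count of free hard connections plus a FIFO
 * of soft connections that have a queued query. */
typedef struct {
    int free_hard;
    soft_conn *waiting[MAX_WAITERS];
    size_t head, count;
} conn_cache;

struct soft_conn {
    conn_cache *cache;
    int has_hard;         /* 1 while a hard connection is held */
    const char *pending;  /* queued query text, or NULL */
    const char *running;  /* query currently issued, or NULL */
};

/* Same return convention as described for PQXconnExecute:
 * 1 if the query was issued, 0 if it was queued, -1 on error. */
int soft_execute(soft_conn *c, const char *sql)
{
    conn_cache *cc;

    assert(c != NULL && c->cache != NULL);
    cc = c->cache;
    if (sql == NULL || c->pending != NULL || c->running != NULL)
        return -1;                    /* one query at a time per soft conn */
    if (!c->has_hard && cc->free_hard > 0) {
        cc->free_hard--;              /* take a hard connection */
        c->has_hard = 1;
    }
    if (c->has_hard) {
        c->running = sql;             /* issued immediately */
        return 1;
    }
    if (cc->count == MAX_WAITERS)
        return -1;                    /* wait queue is full */
    cc->waiting[(cc->head + cc->count) % MAX_WAITERS] = c;
    cc->count++;
    c->pending = sql;                 /* queued for later execution */
    return 0;
}

/* Called when c's running query has finished: give the hard connection
 * to the oldest waiter, or return it to the cache if nobody waits. */
void soft_complete(soft_conn *c)
{
    conn_cache *cc;

    assert(c != NULL && c->has_hard);
    cc = c->cache;
    c->running = NULL;
    c->has_hard = 0;
    if (cc->count > 0) {
        soft_conn *next = cc->waiting[cc->head];
        cc->head = (cc->head + 1) % MAX_WAITERS;
        cc->count--;
        next->has_hard = 1;
        next->running = next->pending;  /* the queued query goes out now */
        next->pending = NULL;
    } else {
        cc->free_hard++;
    }
}
```

With one hard connection and two soft connections, the first soft_execute() returns 1, the second returns 0, and the second query starts as soon as soft_complete() is called on the first connection.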


On Fri, 10 Jan 2003 17:15:39 +0000 (UTC)
neilc@samurai.com (Neil Conway) wrote:

> On Fri, 2003-01-10 at 12:10, Iker Arizmendi wrote:
> > As things stand now you have to wait for each query to finish before
> > issuing a new one.
>
> How exactly are you planning to fix this?
>
> Cheers,
>
> Neil
> --
> Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC

Re: rfc - libpq extensions

From
Iker Arizmendi
Date:
On Fri, 10 Jan 2003 18:37:52 +0100
Csaba Nagy <nagy@domeus.de> wrote:

> AFAIK Using the same connection to do more queries at the same time is
> not supported by the backend.
> If you use transactions, it will only make your life harder to make
> sure the transaction goes on on the same connection, and nobody else
> uses the same connection meanwhile, messing with your ongoing
> transaction... unless the postgres developers would be willing to
> implement connection independent transactions (by the means of a
> transaction ID passed along with the query ? this is DB science fiction
> - might be good but it's not standard nor really needed).
>

By using something like virtual (or "soft") connections (see my
previous post) it's relatively straightforward to make sure that
several queries that form a single transaction get the same DB
connection. Once a transaction is started on a virtual connection (VC)
and the VC gets hold of a database connection, it can
simply hold on to it until the transaction is finalized. If another
virtual connection is waiting for a DB connection, it will just
queue its queries and wait for one to become available. Once it gets
one, it will hold it for the duration of one query (for non-TX queries)
or several queries (e.g., "BEGIN", "INSERT..", "UPDATE..", "COMMIT").
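
One way a virtual connection could decide when to give its DB connection back is to track transaction boundaries from the query text. The sketch below assumes exactly that; it is not how the extension is known to work. The names vc_state and vc_keep_after are hypothetical, and only BEGIN, COMMIT and ROLLBACK are recognized (other spellings such as START TRANSACTION or END are deliberately ignored to keep the example short).

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Case-insensitive test that sql starts with the upper-case keyword kw,
 * ignoring leading whitespace and requiring a word boundary after it. */
static int starts_with_kw(const char *sql, const char *kw)
{
    size_t n = strlen(kw), i;

    while (*sql && isspace((unsigned char)*sql))
        sql++;
    for (i = 0; i < n; i++) {
        if (toupper((unsigned char)sql[i]) != kw[i])
            return 0;   /* also stops at the end of a short string */
    }
    return !isalnum((unsigned char)sql[n]) && sql[n] != '_';
}

/* Hypothetical per-VC state: are we inside a transaction? */
typedef struct {
    int in_tx;
} vc_state;

/* Update the state for a query being issued. Returns 1 if the VC should
 * keep its DB connection after this query completes, 0 if it may hand
 * the connection back to the cache. */
int vc_keep_after(vc_state *s, const char *sql)
{
    assert(s != NULL && sql != NULL);
    if (starts_with_kw(sql, "BEGIN"))
        s->in_tx = 1;
    else if (starts_with_kw(sql, "COMMIT") || starts_with_kw(sql, "ROLLBACK"))
        s->in_tx = 0;
    return s->in_tx;
}
```

A standalone query returns 0 (release after one query), while everything from "BEGIN" up to but not including "COMMIT" returns 1, so the connection stays pinned for the whole transaction.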

> I assume your problems with connection pooling are the relatively big
> setup time of a connection, and the possibility of running out of
> connections...

I think the more pressing problem for the connection-pooled,
event-driven server is that it cannot block to wait for a connection
(for whatever reason). At the same time it's pretty cumbersome for an
application developer to have to manually queue queries during the
wait.

Regards,
Iker