Thread: [RFC] libpq extensions - followup

[RFC] libpq extensions - followup

From
Iker Arizmendi
Date:
Hello everyone,

A while back I suggested that it would be useful if asynchronous
connections could support multiple queries "in flight", especially if
you're working on event driven applications (see topic "rfc - libpq
extensions"). I've since started work on a simple library, which I
call libpqx, that I think could potentially provide a nice alternative
to libpq. I was hoping the folks on the list could discuss whether such
a library would be useful and if so, provide some feedback regarding its
design. The library code is very much in its infancy but I believe it
provides a reasonable foundation that can be easily extended. The
library source tarball can be found on my home page:

http://www.arizmendi.org

Please keep in mind that the current code is only meant to demonstrate
the feasibility of its design - it's still full of "TODO"s, "assert"s
and "abort"s.

Cheers,
Iker

DESIGN CRITERIA:
===========

Here are some of the things I thought the library should do, and that
I tried to address:

1) Support for event driven and threaded programs.

2) Multiple "in flight" queries on asynchronous connections must be
permitted. For event driven applications this is key.

3) Support connection pooling for event-driven and threaded programs.
After getting started on (2), I realized this was also necessary. One of
the reasons to write event driven servers is to support a larger number
of clients than you otherwise would with a process-per-client or
thread-per-client architecture. Without connection pooling an event
driven architecture doesn't help - it only moves the problem to the DB
server (which will spawn a process for each connection).

4) Easy to use interface. I used libpq as a starting model but made
several changes and added two new abstractions - connection pool
(PGXpool) and query (PGXquery). The latter is necessary to support (2)
and it also provides a foundation for parameterized queries (eg, "DELETE
FROM TableA WHERE x= ?"). It might also serve to hide the details of
bytea encoding.

5) Thread safe. For event driven applications this isn't an issue
(except on SMP machines - this needs to be addressed). In threaded
programs, multiple threads can share a common connection pool (although
they must each have their own PGXconn object).

IMPLEMENTATION:
===========

Most of the coding is pretty straightforward (at least I tried to make
it so) but there are three implementation details that should be noted:

1) Connection pooling and event driven servers.
When implementing an event driven server on *NIX you typically have a
poll/select event loop that listens for activity on a range of file
descriptors. Without connection pooling one establishes a connection to
the backend and uses the socket descriptor to listen on. However, in a
pooled implementation a connection may not actually be backed by a
socket until the pool's count of available connections goes positive.
Until then the client has nothing to register in its event loop. To deal with
this issue, the PGXconn object uses a pipe to the connection pool to
receive availability notifications - an "eventable" descriptor is thus
always available to give to clients.

2) Explicit state machine implementation.
The PGXconn object relies on the use of explicit machine states to
get all its work done (synchronous operations get their own special
state). Each machine state implements a common set of methods
(defined in a struct of function pointers) in a separate file, which
keeps the code layout clean. So far I have implemented states that
mimic the way libpq does its thing but I think there's potential to
cleanly provide additional functionality (non-blocking query writes, for
instance) by simply adding new states. P.S. My implementation of a state
machine is probably pretty naive since I only recently started using
them aggressively - any suggestions here would be much appreciated.

3) Wrapping of libpq.
During this first cut, I leveraged libpq as much as possible but
in the future it may be worthwhile for libpqx to wean itself off
the libpq API and instead use more of its underlying code.
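
To make point (1) above concrete, here is a minimal sketch of the pipe
idea, assuming plain POSIX pipe() and poll(). The PooledConn type and
the pooled_conn_* / pool_notify names are made up for illustration and
are not taken from the libpqx source. The read end of the pipe is the
descriptor a client can always register, even before a backend socket
exists; the pool writes one byte to the other end when a connection
becomes available.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical sketch: a pooled connection that may not own a socket yet. */
typedef struct {
    int notify_rd; /* read end: handed to the client's event loop */
    int notify_wr; /* write end: kept by the pool */
} PooledConn;

/* Create the notification pipe. Returns 0 on success, -1 on error. */
int pooled_conn_init(PooledConn *pc) {
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    pc->notify_rd = fds[0];
    pc->notify_wr = fds[1];
    return 0;
}

/* Pool side: signal that a backend connection has become available. */
int pool_notify(PooledConn *pc) {
    char byte = 1;
    return write(pc->notify_wr, &byte, 1) == 1 ? 0 : -1;
}

/* Client side: returns 1 if a notification is pending, 0 if not,
 * -1 on error. timeout_ms follows poll(2): 0 returns immediately,
 * -1 blocks until something happens. */
int pooled_conn_ready(PooledConn *pc, int timeout_ms) {
    struct pollfd pfd;
    int n;
    pfd.fd = pc->notify_rd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;
    return (n == 1 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Consume one notification so the descriptor stops reporting readable. */
int pooled_conn_ack(PooledConn *pc) {
    char byte;
    return read(pc->notify_rd, &byte, 1) == 1 ? 0 : -1;
}

/* Release both ends of the pipe. */
void pooled_conn_close(PooledConn *pc) {
    close(pc->notify_rd);
    close(pc->notify_wr);
}
```

In an event loop the client would add notify_rd to its poll/select set
and call pooled_conn_ack() when it fires, then switch to listening on
the real backend socket.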

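The explicit state machine in point (2) could look roughly like the
following. This is only a sketch of the general "struct of function
pointers" technique with two invented states (idle and busy); the
Conn, ConnState, STATE_IDLE and STATE_BUSY names are assumptions and
the real PGXconn states certainly differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; none of these come from libpqx itself. */
typedef struct Conn Conn;

/* Each state supplies the same set of methods. */
typedef struct ConnState {
    const char *name;
    int (*execute)(Conn *c, const char *sql); /* client submits a query */
    int (*on_readable)(Conn *c);              /* descriptor became readable */
} ConnState;

struct Conn {
    const ConnState *state; /* current machine state */
    int completed;          /* number of finished queries */
};

extern const ConnState STATE_IDLE;
extern const ConnState STATE_BUSY;

/* Idle state: accept the query and move to the busy state. */
static int idle_execute(Conn *c, const char *sql) {
    (void)sql;              /* a real state would send the query here */
    c->state = &STATE_BUSY;
    return 0;
}
static int idle_on_readable(Conn *c) {
    (void)c;                /* nothing is pending while idle */
    return 0;
}

/* Busy state: refuse new queries until the current one finishes. */
static int busy_execute(Conn *c, const char *sql) {
    (void)c;
    (void)sql;
    return -1;
}
static int busy_on_readable(Conn *c) {
    c->completed++;         /* pretend the whole result arrived */
    c->state = &STATE_IDLE;
    return 0;
}

const ConnState STATE_IDLE = { "idle", idle_execute, idle_on_readable };
const ConnState STATE_BUSY = { "busy", busy_execute, busy_on_readable };

/* Public entry points simply delegate to the current state. */
int conn_execute(Conn *c, const char *sql) { return c->state->execute(c, sql); }
int conn_on_readable(Conn *c) { return c->state->on_readable(c); }
```

Adding new behaviour (say, non-blocking query writes) then means
adding another ConnState value rather than more branches in the
public functions.
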


Re: [RFC] libpq extensions - followup

From
Tom Lane
Date:
Iker Arizmendi <iker@research.att.com> writes:
> A while back I suggested that it would be useful if asynchronous
> connections could support multiple queries "in flight", especially if
> you're working on event driven applications (see topic "rfc - libpq
> extensions"). I've since started work on a simple library,

I must be missing something fundamental here.  The backend doesn't
support multiple parallel queries, so how can you have "multiple queries
in flight" on the same connection?

            regards, tom lane

Re: [RFC] libpq extensions - followup

From
Iker Arizmendi
Date:
Although multiple queries can't be executed concurrently, they can be
queued up by a connection object. For instance:

PGXquery* q1 = PQXcreateQuery(...);
PGXquery* q2 = PQXcreateQuery(...);
...
PGXquery* qN = PQXcreateQuery(...);

PQXexecute(conn1, q1);
PQXexecute(conn1, q2);
...
PQXexecute(conn1, qN);

while (1) { /* event loop code */ }

In this example, queries q1 through qN are "in flight" simultaneously as
far as the client is concerned even though the connection conn1 is
really queueing them up. During execution of the event loop, the
connection object manages the job of executing each query
sequentially (which, with respect to the client, is an implementation
detail).
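
The queueing described above can be sketched as a small FIFO inside
the connection object. This is an illustration under assumed names
(Query, QueuedConn, qconn_*), not the actual PGXconn code, and it
leaves out the real network I/O: the point is only that execute
returns immediately while at most one query is ever active.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 16 /* arbitrary bound for the sketch */

/* Hypothetical stand-ins; not the real PGXquery/PGXconn types. */
typedef struct {
    const char *sql;
    int done;
} Query;

typedef struct {
    Query *pending[QUEUE_CAP]; /* ring buffer of queued queries */
    size_t head, count;
    Query *active;             /* query currently on the wire, or NULL */
} QueuedConn;

void qconn_init(QueuedConn *c) {
    c->head = 0;
    c->count = 0;
    c->active = NULL;
}

/* Start the next queued query if the connection is free. */
static void qconn_pump(QueuedConn *c) {
    if (c->active == NULL && c->count > 0) {
        c->active = c->pending[c->head];
        c->head = (c->head + 1) % QUEUE_CAP;
        c->count--;
        /* a real implementation would send the query to the backend here */
    }
}

/* Client call: returns immediately; 0 on success, -1 if the queue is full. */
int qconn_execute(QueuedConn *c, Query *q) {
    if (c->count == QUEUE_CAP)
        return -1;
    c->pending[(c->head + c->count) % QUEUE_CAP] = q;
    c->count++;
    qconn_pump(c);
    return 0;
}

/* Event loop call: the active query's result has fully arrived.
 * Returns the finished query, or NULL if nothing was active. */
Query *qconn_on_result(QueuedConn *c) {
    Query *finished = c->active;
    if (finished != NULL) {
        finished->done = 1;
        c->active = NULL;
        qconn_pump(c);      /* next query starts only now */
    }
    return finished;
}
```

From the client's side all N queries look outstanding at once, while
the backend only ever sees them one after another.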

Cheers,
Iker

On Sun, 26 Jan 2003 00:47:20 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Iker Arizmendi <iker@research.att.com> writes:
> > A while back I suggested that it would be useful if asynchronous
> > connections could support multiple queries "in flight", especially
> > if you're working on event driven applications (see topic "rfc -
> > libpq extensions"). I've since started work on a simple library,
>
> I must be missing something fundamental here.  The backend doesn't
> support multiple parallel queries, so how can you have "multiple
> queries in flight" on the same connection?
>
>             regards, tom lane


Re: [RFC] libpq extensions - followup

From
Iker Arizmendi
Date:
BTW, the source tarball contains an example program that shows
how multiple "in flight" queries are simulated. It also contains a
multi-threaded example (where each thread waits on the connection
pool).

Cheers,
Iker


On Sun, 26 Jan 2003 00:47:20 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Iker Arizmendi <iker@research.att.com> writes:
> > A while back I suggested that it would be useful if asynchronous
> > connections could support multiple queries "in flight", especially
> > if you're working on event driven applications (see topic "rfc -
> > libpq extensions"). I've since started work on a simple library,
>
> I must be missing something fundamental here.  The backend doesn't
> support multiple parallel queries, so how can you have "multiple
> queries in flight" on the same connection?
>
>             regards, tom lane


--
Iker Arizmendi
AT&T Labs - Research
Speech and Image Processing Lab
e: iker@research.att.com
w: http://research.att.com
p: 973-360-8516


Re: [RFC] libpq extensions - followup

From
Tom Lane
Date:
Iker Arizmendi <iker@research.att.com> writes:
> Although multiple queries can't be executed concurrently, they can be
> queued up by a connection object.

Hmm.  This seems to presume that every query will be executed as a
separate transaction.  That's pretty limiting (unless you suppose all
the queries are read-only).

            regards, tom lane

Re: [RFC] libpq extensions - followup

From
Iker Arizmendi
Date:
> Hmm.  This seems to presume that every query will be executed as a
> separate transaction.  That's pretty limiting (unless you suppose all
> the queries are read-only).

True, but only at this early stage. There are (at least) two
ways which can be used to provide transaction support:

1) Explicit transaction start/end methods. Eg.

PQXbeginTx(pConn1);
PQXexecute(pConn1, q1);
PQXexecute(pConn1, q2);
PQXexecute(pConn1, q3);
PQXcommitTx(pConn1);

2) Or implicit transaction support. Eg.

PGXquery* q1 = PQXcreateQuery("BEGIN... UPDATE...");
PGXquery* q2 = PQXcreateQuery("UPDATE");
PGXquery* q3 = PQXcreateQuery("INSERT");
PGXquery* q4 = PQXcreateQuery("COMMIT");

In this latter case the PGXconn object would have to inspect the
SQL queries in search of transaction delimiters.

As far as uncommitted transactions go, the PGXconn object has a flag
that indicates whether it's currently in a transaction - this flag can
be checked (it currently isn't) upon transitioning to the closed state.
At that point whatever policy governs uncommitted transactions can be
applied before returning the connection to the pool.
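
For the implicit approach, a very naive way to maintain that flag is
to look at the leading keyword of each query. The sketch below is an
assumption about how this might be done, not what libpqx does today;
track_transaction and starts_with_keyword are made-up names. It only
inspects the first statement in the string, so it would see the BEGIN
in "BEGIN... UPDATE..." but would miss a COMMIT buried later in a
multi-statement query, and it knows nothing about savepoints.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive test: does sql begin with keyword (after leading
 * whitespace), followed by a character that cannot continue an
 * identifier? The keyword must be given in upper case. */
static int starts_with_keyword(const char *sql, const char *keyword) {
    while (isspace((unsigned char)*sql))
        sql++;
    while (*keyword != '\0') {
        if (toupper((unsigned char)*sql) != *keyword)
            return 0;
        sql++;
        keyword++;
    }
    return !(isalnum((unsigned char)*sql) || *sql == '_');
}

/* Hypothetical helper: update an in-transaction flag from the first
 * keyword of a query. Returns the new flag value. */
int track_transaction(int in_tx, const char *sql) {
    if (starts_with_keyword(sql, "BEGIN") ||
        starts_with_keyword(sql, "START TRANSACTION"))
        return 1;
    if (starts_with_keyword(sql, "COMMIT") ||
        starts_with_keyword(sql, "END") ||
        starts_with_keyword(sql, "ROLLBACK") ||
        starts_with_keyword(sql, "ABORT"))
        return 0;
    return in_tx; /* ordinary statement: flag unchanged */
}
```

The connection would call this for every query it sends and consult
the flag when it transitions to the closed state.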

Cheers,
Iker


On Sun, 26 Jan 2003 13:38:02 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Iker Arizmendi <iker@research.att.com> writes:
> > Although multiple queries can't be executed concurrently, they can
> > be queued up by a connection object.
>
> Hmm.  This seems to presume that every query will be executed as a
> separate transaction.  That's pretty limiting (unless you suppose all
> the queries are read-only).
>
>             regards, tom lane


--
Iker Arizmendi
AT&T Labs - Research
Speech and Image Processing Lab
e: iker@research.att.com
w: http://research.att.com
p: 973-360-8516


Re: [RFC] libpq extensions - followup

From
Iker
Date:
BTW, the source tarball contains two example programs - an event-driven
version which demonstrates simulating multiple "in-flight" queries and a
multi-threaded version.

Cheers,
Iker


----- Original Message -----
From: "Iker Arizmendi" <iker@research.att.com>
Newsgroups:
comp.databases.postgresql.general,comp.databases.postgresql.questions
Sent: Sunday, January 26, 2003 2:02 AM
Subject: Re: [RFC] libpq extensions - followup


> Although multiple queries can't be executed concurrently, they can be
> queued up by a connection object. For instance:
>
> PGXquery* q1 = PQXcreateQuery(...);
> PGXquery* q2 = PQXcreateQuery(...);
> ...
> PGXquery* qN = PQXcreateQuery(...);
>
> PQXexecute(conn1, q1);
> PQXexecute(conn1, q2);
> ...
> PQXexecute(conn1, qN);
>
> while (1) { /* event loop code */ }
>
> In this example, queries q1 through qN are "in flight" simultaneously as
> far as the client is concerned even though the connection conn1 is
> really queueing them up. During execution of the event loop, the
> connection object manages the job of executing each query
> sequentially (which, with respect to the client, is an implementation
> detail).
>
> Cheers,
> Iker
>
> On Sun, 26 Jan 2003 00:47:20 -0500
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> > Iker Arizmendi <iker@research.att.com> writes:
> > > A while back I suggested that it would be useful if asynchronous
> > > connections could support multiple queries "in flight", especially
> > > if you're working on event driven applications (see topic "rfc -
> > > libpq extensions"). I've since started work on a simple library,
> >
> > I must be missing something fundamental here.  The backend doesn't
> > support multiple parallel queries, so how can you have "multiple
> > queries in flight" on the same connection?
> >
> > regards, tom lane