Thread: Practical Cursors

Practical Cursors

From
"Command Prompt, Inc."
Date: Mon, 17 Sep 2001 17:04
Hello,

We are just about to wrap up our book, Practical PostgreSQL (the one that
used to be titled The Elephant Never Forgets), and we have a question.

There is an obvious benefit to the use of cursors within a persistent
environment. In other words, if my connection to the database is live, I
can increase my query and display efficiency through the use of a cursor.
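
For illustration, the kind of pattern I mean (a sketch only; the table
and connection details are made up):

<?php
// Live-connection sketch: one transaction, one cursor, rows fetched
// in pages. Table and connection parameters are hypothetical.
$db = pg_connect("dbname=books user=postgres");

pg_query($db, "BEGIN");
pg_query($db, "DECLARE titles_cur CURSOR FOR
               SELECT title, author FROM books ORDER BY title");

// Fetch and display 20 rows at a time; the backend keeps our place,
// so the next FETCH continues where this one stopped.
$res = pg_query($db, "FETCH 20 FROM titles_cur");
while ($row = pg_fetch_assoc($res)) {
    echo $row['title'] . " by " . $row['author'] . "\n";
}

pg_query($db, "CLOSE titles_cur");
pg_query($db, "COMMIT");
?>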

However, this seems to be useless within a web-based environment. If we
have a live connection through a C++ application, we can perform a
transaction and interact with the results.

However, using something like PHP will not allow this because HTTP is
stateless and PostgreSQL will not know from one transaction to the next
that the results of the connection are related.

Is this truly the case, or is there a way for PostgreSQL to remember the
connection identifier so that the next time a PHP connection is made with
the same identifier a transaction can be completed?

Sincerely,

Joshua Drake



--
Practical PostgreSQL:
 http://stage.linuxports.com/projects/postgres/book1.htm
by way of:
 pgsql-general@commandprompt.com



Re: Practical Cursors

From
"Joe Conway"
Date:
> However, using something like PHP will not allow this because HTTP is
> stateless and PostgreSQL will not know from one transaction to the next
> that the results of the connection are related.
>
> Is this truly the case, or is there a way for PostgreSQL to remember the
> connection identifier so that the next time a PHP connection is made with
> the same identifier a transaction can be completed?
>
> Sincerely,
>
> Joshua Drake
>

I have actually been thinking recently about just this question. I looked
through the source for the PHP pgsql extension and it is clear that no
cursor is used. There is, however, the ability to open a persistent
connection to PostgreSQL from PHP. I think it would be possible to create a
persistent resource identifier to a cursor in a similar manner, and "reuse"
the cursor across multiple HTTP requests.
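
Something like this sketch is what I have in mind. It is hypothetical:
pg_pconnect() hands back *a* pooled connection, not necessarily the one
that holds your cursor, so the per-session cursor name below only works
if that mapping could somehow be guaranteed:

<?php
// Hypothetical sketch: reuse a cursor across HTTP requests over a
// persistent connection. The session-keyed cursor name is invented;
// nothing in PHP today ties a session to a particular backend.
session_start();
$db  = pg_pconnect("dbname=test user=web");
$cur = "cur_" . session_id();     // one cursor name per session

if (!isset($_SESSION['cursor_open'])) {
    pg_query($db, "BEGIN");
    pg_query($db, "DECLARE $cur CURSOR FOR SELECT * FROM bigtable");
    $_SESSION['cursor_open'] = true;
}

// Each subsequent request just pulls the next page from the cursor.
$res = pg_query($db, "FETCH 25 FROM $cur");
while ($row = pg_fetch_assoc($res)) {
    // ... render the row ...
}
?>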

The downside I can see is that PHP would have no way to know when it could
garbage-collect the allocated resource. So while it could be done, what
would happen when the user closes their browser with your cursor still open?

-- Joe


Re: Practical Cursors

From
"Command Prompt, Inc."
Date:
> The downside I can see is that PHP would have no way to know when it could
> garbage-collect the allocated resource. So while it could be done, what
> would happen when the user closes their browser with your cursor still open?

Well, I thought of two things. One, PostgreSQL could have some identifier
set by PHP, similar to a session ID, that would have a timeout, like a
cookie timeout. If PostgreSQL did not receive a CLOSE or COMMIT within
that time limit, it would drop the transaction.

On the other hand, you could have it track the transaction so that the
next time I come back, if my identifier matched my session, you could
code an application that informs you of any of your transactions that
are not completed. That would be cool, but it would take a lot of code
on the PHP end.

The second thought I had was a small Java applet that kept a live
connection to PostgreSQL and proxied certain things, but that seems
extreme.

J

>
> -- Joe
>

--
by way of pgsql-general@commandprompt.com


Re: Practical Cursors

From
Alex Pilosov
Date:
On Mon, 17 Sep 2001, Command Prompt, Inc. wrote:

> However, using something like PHP will not allow this because HTTP is
> stateless and PostgreSQL will not know from one transaction to the next
> that the results of the connection are related.
>
> Is this truly the case, or is there a way for PostgreSQL to remember the
> connection identifier so that the next time a PHP connection is made with
> the same identifier a transaction can be completed?
Not currently. Cursors right now are a per-backend thing. A lot of changes
would be needed to make cursors global (particularly transaction-awareness).

-alex


Re: Practical Cursors

From
Sam Tregar
Date:
On Mon, 17 Sep 2001, Command Prompt, Inc. wrote:

> Is this truly the case, or is there a way for PostgreSQL to remember the
> connection identifier so that the next time a PHP connection is made with
> the same identifier a transaction can be completed?

Anything is possible - you could write a server process that associates
cursors with session IDs and then have your PHP app make requests through
the server.  Would it be a good idea?  Probably not.  Would it qualify as
"practical postgres" usage?  Definitely not!  I'd call it an impractical
idea that might be possible through a lot of hard work.

-sam


Re: Practical Cursors

From
Micah Yoder
Date: Tue, 25 Sep 2001 00:06:48 -0400
(sorry to reply to a week-old message.  need to keep up with this list more!)

On Monday 17 September 2001 17:04, you wrote:

> There is an obvious benefit to the use of cursors within a persistent
> environment. In other words, if my connection to the database is live, I
> can increase my query and display efficiency through the use of a cursor.
>
> However, this seems to be useless within a web based environment. If we
> have a live connection through a C++ application, we can perform a
> transaction and interact within the results.

Yep.  That seems to be a disadvantage with ALL database systems & HTTP-based
apps.

I once wrote a MySQL app (I know, but that's what the company used) to do a
fairly complicated search on a huge database full of domain names.  The query
was time-consuming (10-30 seconds) so it obviously could not be performed for
every prev/next page request.

My first approach was to have the PHP script write the entire result set
to a fixed-length-record file, which each request could then easily seek
into at the point where the user was. The only problem there was that
when the result set was large, the initial query time was significantly
longer. And that happened a lot.
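
The per-request access looked roughly like this (a sketch with invented
names; each record is padded to a fixed width, so a page starts at a
known byte offset and fseek() can jump straight to it):

<?php
// Read page $page out of a fixed-length-record cache file.
define('REC_LEN', 128);      // bytes per padded record (invented)
define('PAGE_SIZE', 20);     // records per result page (invented)

function read_page($file, $page) {
    $fp = fopen($file, 'rb');
    fseek($fp, $page * PAGE_SIZE * REC_LEN);
    $rows = array();
    for ($i = 0; $i < PAGE_SIZE; $i++) {
        $rec = fread($fp, REC_LEN);
        if ($rec === false || strlen($rec) < REC_LEN)
            break;           // ran past the end of the file
        $rows[] = rtrim($rec);
    }
    fclose($fp);
    return $rows;
}
?>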

I then wrote a daemon in C to do the work and store the results in RAM.  The
PHP script connected to the daemon via a socket, and passed a request ID and
the numbers of the records it wanted.  Sure, it was convoluted, but I
actually got the speed up to where I was fairly happy with it.
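
The PHP side of that was along these lines (again a sketch; the port
and the little line protocol here are invented for illustration):

<?php
// Ask the local result-cache daemon for records 40..59 (say, page 3)
// of a previously issued search, identified by a request ID.
$requestId = $_GET['rid'];    // handed out when the search was run
$sock = fsockopen("localhost", 9123, $errno, $errstr, 5);
if (!$sock)
    die("daemon unreachable: $errstr");

fwrite($sock, "GET $requestId 40 59\n");
while (!feof($sock))
    echo fgets($sock);        // the daemon streams the rows back
fclose($sock);
?>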

If there's a better solution than that, I'm not aware of it.

But like someone else mentioned, it's not quite "practical" database usage.

--
Like to travel?                        http://TravTalk.org
Micah Yoder Internet Development       http://yoderdev.com


Re: Practical Cursors

From
merlyn@stonehenge.com (Randal L. Schwartz)
Date:
>>>>> "Micah" == Micah Yoder <yodermk@home.com> writes:

Micah> I then wrote a daemon in C to do the work and store the results
Micah> in RAM.  The PHP script connected to the daemon via a socket,
Micah> and passed a request ID and the numbers of the records it
Micah> wanted.  Sure, it was convoluted, but I actually got the speed
Micah> up to where I was fairly happy with it.

Micah> If there's a better solution than that, I'm not aware of it.

What you need is a "middleware" (heh) solution... I've done a number
of magazine columns using a Perl-based mini-web-server (search google
for "site:stonehenge.com HTTP::Daemon").  You could set up your
frontware begin-search CGI script to fire off a daemon, and tell it to
do the hard work, using a session ID to keep track of subsequent hits,
and let the front CGI request proxy through using HTTP to talk to the
daemon (very easy).  Or, you could redirect the client to the
mini-daemon directly.  I've done the latter a couple of times, and
it's pretty practical up to light-commercial-volume hits.

And before you scream "Perl is slow", I have a working Perl webserver
based on HTTP::Daemon that pre-forks just like Apache (taking current
load into consideration), handles streaming proxy connections
(including SSL proxy) and local file delivery, and benchmarks at
*half* the speed of Apache in both proxy and local file mode.  That's
much better than I initially expected, and completely satisfactory for
most applications.  And it's only 300 lines of code. :)

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

Re: Practical Cursors

From
Jan Wieck
Date:
Micah Yoder wrote:
> (sorry to reply to a week-old message.  need to keep up with this list more!)
>
> On Monday 17 September 2001 17:04, you wrote:
>
> > There is an obvious benefit to the use of cursors within a persistent
> > environment. In other words, if my connection to the database is live, I
> > can increase my query and display efficiency through the use of a cursor.
> >
> > However, this seems to be useless within a web based environment. If we
> > have a live connection through a C++ application, we can perform a
> > transaction and interact within the results.
>
> Yep.  That seems to be a disadvantage with ALL database systems & HTTP based
> apps.
>
> I once wrote a MySQL app (I know, but that's what the company used) to do a
> fairly complicated search on a huge database full of domain names.  The query
> was time consuming (10-30 seconds) so it obviously could not be performed for
> every prev/next page request.
>
> My first approach was to have the PHP script write the entire data resultset
> to a fixed-length file, which could be easily accessed for each request to
> the point where the user was in the file.  Only problem there was when the
> result set was large, initial query time was significantly longer.  And that
> happened a lot.
>
> I then wrote a daemon in C to do the work and store the results in RAM.  The
> PHP script connected to the daemon via a socket, and passed a request ID and
> the numbers of the records it wanted.  Sure, it was convoluted, but I
> actually got the speed up to where I was fairly happy with it.
>
> If there's a better solution than that, I'm not aware of it.
>
> But like someone else mentioned, it's not quite "practical" database usage.

    Since search engines and data warehousing tend to have huge
    databases with sometimes complicated, long-running queries
    that produce anywhere from empty to huge result sets, it's a
    quite common problem. Thus, I would consider any solution
    that leads to success "practical" at first.

    PHP together with cursors might be an alternate solution. You
    open a cursor for the entire result set. You have a function
    that fetches the next n rows from the cursor and generates
    the resulting HTML output in a file. It returns true if more
    rows have been found. You call it once, and if it returns
    false you display "No match found" or so. If it returns true,
    you call it again to create a cache file for the second
    result page, and you then know whether there will be one
    (telling you whether to provide a NEXT button). You register
    a shutdown function that will call the cache file generator
    another m times. Now you display the first cache file, leave
    the DB connection with the open transaction and cursor where
    they are, and exit.

    The user will already see the first result page while your
    server is still working. After calling the cache file
    generator function m times, the shutdown function closes the
    cursor, terminates the transaction and closes the DB
    connection.
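
    In code, the flow would look something like this sketch (all
    names are invented for illustration; whether the user really
    sees page one before the shutdown work finishes depends on
    the web server and SAPI):

<?php
session_start();
$db  = pg_connect("dbname=search");
$sid = session_id();

// FETCH the next $n rows from the open cursor and write one HTML
// cache file for result page $pageno. Returns true if rows were found.
function fetch_page($n, $pageno)
{
    global $db, $sid;
    $res = pg_query($db, "FETCH $n FROM result_cur");
    if (pg_num_rows($res) == 0)
        return false;
    $fp = fopen("/tmp/cache_{$sid}_{$pageno}.html", "w");
    while ($row = pg_fetch_assoc($res))
        fwrite($fp, "<tr><td>" . htmlspecialchars($row['name']) . "</td></tr>\n");
    fclose($fp);
    return true;
}

// Shutdown function: build m more cache files, then clean up the
// cursor, the transaction and the connection.
function finish_pages()
{
    global $db;
    for ($p = 2; $p <= 10; $p++)
        if (!fetch_page(50, $p))
            break;
    pg_query($db, "CLOSE result_cur");
    pg_query($db, "COMMIT");
    pg_close($db);
}

pg_query($db, "BEGIN");
pg_query($db, "DECLARE result_cur CURSOR FOR SELECT name FROM items ORDER BY name");

if (!fetch_page(50, 0)) {
    echo "No match found";
} else {
    $have_next = fetch_page(50, 1);       // page 2 exists => NEXT button
    register_shutdown_function('finish_pages');
    readfile("/tmp/cache_{$sid}_0.html"); // show page 1 right away
}
?>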

    I think 95% of users will not hit NEXT more than 10 times
    before refining their search, so that should be enough. If
    one really does, well, then you'd have to run the entire
    query again and this time create either more cache files or
    all of them.

    Now you need some sort of vacuum cleaner for the cache files
    and you are done.

    The drawback of this solution is that you don't know how many
    pages there will be in total when you display the first one.
    But the benefits are that it fits into the connectionless
    HTTP nature, has a small resource footprint, provides first
    results early and does not require open transactions across
    user interaction.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #





Re: Practical Cursors

From
Christopher Masto
Date:
On Tue, Sep 25, 2001 at 12:06:48AM -0400, Micah Yoder wrote:
> (sorry to reply to a week-old message.  need to keep up with this list more!)

Ditto, but more so.

> I then wrote a daemon in C to do the work and store the results in RAM.  The
> PHP script connected to the daemon via a socket, and passed a request ID and
> the numbers of the records it wanted.  Sure, it was convoluted, but I
> actually got the speed up to where I was fairly happy with it.
>
> If there's a better solution than that, I'm not aware of it.

A technique I've used with some success is to select the primary keys
from the rows you're interested in, and only have to memorize a list of
integers.  Then for each page, select the rows "WHERE pkey IN (...)".
It's sort of a middle ground as far as tradeoffs go.  You don't have to
store a huge amount of data in RAM or temporary files, but you still
have to do the work up front.
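
In sketch form (schema, names, and page size invented), using a PHP
session to remember the id list between requests:

<?php
// Run the expensive search once, remember only the primary keys,
// then fetch each page with WHERE id IN (...).
session_start();
$db = pg_connect("dbname=search");

if (!isset($_SESSION['ids'])) {
    $res = pg_query($db, "SELECT id FROM domains WHERE name LIKE '%foo%'");
    $ids = array();
    while ($row = pg_fetch_row($res))
        $ids[] = (int) $row[0];
    $_SESSION['ids'] = $ids;      // just a list of integers
}

$page  = isset($_GET['page']) ? (int) $_GET['page'] : 0;
$slice = array_slice($_SESSION['ids'], $page * 20, 20);
if (count($slice) > 0) {
    $res = pg_query($db, "SELECT * FROM domains WHERE id IN (" .
                         implode(",", $slice) . ") ORDER BY name");
    // ... render the 20 rows for this page ...
}
?>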

The problem I have with persistent per-session connections is that you
end up having basically the same
(per-transaction-overhead * simultaneous-transactions), and you add
(per-connection-overhead * simultaneous-open-sessions) on top.
There are certainly situations where you can do better one way or the
other... figuring out how best to tune the per-session case scares me.
--
Christopher Masto

CB461C61 8AFC E3A8 7CE5 9023 B35D  C26A D849 1F6E CB46 1C61