Re: best way to fetch next/prev record based on index - Mailing list pgsql-performance

From Merlin Moncure
Subject Re: best way to fetch next/prev record based on index
Msg-id 6EE64EF3AB31D5448D0007DD34EEB34101AF13@Herge.rcsinc.local
In response to best way to fetch next/prev record based on index  ("Merlin Moncure" <merlin.moncure@rcsonline.com>)
List pgsql-performance
> "Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
> > I would much rather see postgres 'get' (a,b,c) > (a1,
> > b1, c1)...if there is even a chance this is possible, I'll direct my
> > efforts there.
>
> For the ISAM context this whole discussion is kinda moot, because you
> really don't want to have to plan and execute a fairly expensive query
> for every record fetch.  If we had all the improvements discussed, it
> would be a reasonably cheap operation by the standards of "execute
> an independent query", but by the standards of "fetch next record
> in my query" it'll still suck.  (Using PREPARE might help, but only
> somewhat.)
>
> It strikes me that what you really want for ISAM is to improve the
> cursor mechanism so it will do the things you need.  I'm not sure
> what's involved, but let's talk about that angle for a bit.

Sorry for the long post, but here's an explanation of why I think things
are better off where they are.

I've created middleware that translates ISAM file I/O on the fly into
SQL and uses prepared statements over parse/bind to execute it.  This
is why I was so insistent against scoping prepared statement lifetime
to the transaction level.
Cursors seem attractive at first, but they are a decidedly mixed bag.
First of all, PostgreSQL cursors are insensitive, which absolutely
precludes their use.  Even supposing they weren't, I'm not so sure I'd
use them if it were possible to do the same things via vanilla queries.
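To make "vanilla queries" concrete, here is roughly the shape of the
next-record fetch (the table t, its three key columns, and the index
on them are hypothetical; $1..$3 stand for the current position).
Ideally the row-value form could be written directly, but in its
absence the comparison has to be expanded by hand:

    -- hypothetical table t with an index on (a, b, c)
    -- desired form:  WHERE (a, b, c) > ($1, $2, $3)
    SELECT *
      FROM t
     WHERE a > $1
        OR (a = $1 AND b > $2)
        OR (a = $1 AND b = $2 AND c > $3)
     ORDER BY a, b, c
     LIMIT 1;

Whether the planner turns that OR chain into a single clean index scan
is exactly the question discussed upthread; when it does, each fetch
costs one index probe.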

It turns out that prepared statements get the job done.  Moving them to
parse/bind got me a 30% drop in server CPU time, and statement
execution times run between 0.1 and 0.5 ms for random reads.
Sequential reads range from that fast to slower, depending on index
performance.  So I don't have performance issues except where the
index doesn't deliver.
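At the SQL level, what the middleware does over parse/bind is
equivalent to something like this (the statement name and parameter
types are illustrative):

    -- plan once...
    PREPARE next_t (int, int, int) AS
      SELECT *
        FROM t
       WHERE a > $1
          OR (a = $1 AND b > $2)
          OR (a = $1 AND b = $2 AND c > $3)
       ORDER BY a, b, c
       LIMIT 1;

    -- ...then each record fetch is just an execute with new parameters
    EXECUTE next_t(1, 2, 3);

Parse/bind moves the same prepare/execute split down into the protocol,
skipping even the overhead of parsing the EXECUTE statement itself.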

2000-10000 reads/sec is competitive with commercial ISAM file systems
on the PC (assuming the application is not running on the server), and
it is far better than any other commercial ISAM emulation I've played
with up to this point.  I don't have concrete numbers, but the server
can deliver 2-3 times that under concurrency; in many cases application
performance is actually network bound.  This is all on PC hardware.  Of
course, mainframes can do much, much better than this, but that's
really outside the scope of what I'm trying to do.

So, things run pretty fast as they are, and driving everything through
queries keeps the system generic and flexible.  Queries also consume
zero resources on the server beyond the time it takes to process and
deal with them.  Cursors, OTOH, have to be opened up for every table
for every user.  Cursors are always read-only, whereas views can be
made updatable with rules.  It's worth noting that other commercial
ISAM emulation systems (for example Acu4GL by AcuCorp) generate
queries on the fly as well, even when cursor options are available.
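For reference, making a view updatable takes only a rule (the view,
table, and column names here are made up):

    CREATE VIEW t_view AS
        SELECT a, b, c, payload FROM t;

    -- route updates against the view back to the base table
    CREATE RULE t_view_upd AS
        ON UPDATE TO t_view DO INSTEAD
        UPDATE t SET payload = NEW.payload
         WHERE a = OLD.a AND b = OLD.b AND c = OLD.c;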

If cursors became sensitive, they would be worth consideration.  I've
never really had 1000 large ones open at once; it would be interesting
to see how that worked.  An ideal object would be a 'pseudo cursor': an
insensitive cursor shared by multiple users, with each user keeping
their own record pointer.  This might be better handled via middleware,
though (which might also give better options for handling different
locking scenarios); a rough sketch follows.
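To sketch the pseudo-cursor idea with what's available today (cursor
and table names hypothetical), the middleware would hold one
insensitive cursor per table and reposition it to each user's saved
record pointer before fetching:

    BEGIN;
    DECLARE shared_cur SCROLL CURSOR FOR
        SELECT * FROM t ORDER BY a, b, c;

    -- user 1's record pointer is row 100
    MOVE ABSOLUTE 100 IN shared_cur;
    FETCH NEXT FROM shared_cur;

    -- user 2's record pointer is row 5000
    MOVE ABSOLUTE 5000 IN shared_cur;
    FETCH NEXT FROM shared_cur;
    COMMIT;

The result set materializes once; only the per-user position has to be
tracked, which is cheap.  Locking behavior is the part you'd have to
solve in the middleware.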

Merlin
