Re: best way to fetch next/prev record based on index - Mailing list pgsql-performance
From | Merlin Moncure |
---|---|
Subject | Re: best way to fetch next/prev record based on index |
Date | |
Msg-id | 6EE64EF3AB31D5448D0007DD34EEB34101AF13@Herge.rcsinc.local |
In response to | best way to fetch next/prev record based on index ("Merlin Moncure" <merlin.moncure@rcsonline.com>) |
List | pgsql-performance |
> "Merlin Moncure" <merlin.moncure@rcsonline.com> writes: > > I would much rather see postgres 'get' (a,b,c) > (a1, > > b1, c1)...if there is even a chance this is possible, I'll direct my > > efforts there. > > For the ISAM context this whole discussion is kinda moot, because you > really don't want to have to plan and execute a fairly expensive query > for every record fetch. If we had all the improvements discussed, it > would be a reasonably cheap operation by the standards of "execute > an independent query", but by the standards of "fetch next record > in my query" it'll still suck. (Using PREPARE might help, but only > somewhat.) > > It strikes me that what you really want for ISAM is to improve the > cursor mechanism so it will do the things you need. I'm not sure > what's involved, but let's talk about that angle for a bit. Sorry for the long post, but here's an explanation of why I think things are better off where they are. I've created a middleware that translates ISAM file I/O on the fly to SQL and uses prepared statements over parse/bind to execute them. This is why I was so insistent against scoping prepared statement lifetime to transaction level. Cursors seem attractive at first but they are a decidedly mixed bag. First of all, PostgreSQL cursors are insensitive, which absolutely precludes their use. Supposing they weren't though, I'm not so sure I'd use them if it was possible to do the same things via vanilla queries. It turns out that prepared statements get the job done. Moving them to parse/bind got me a 30% drop in server cpu time and statement execution times run between .1 and .5 ms for random reads. Sequential reads go from that fast to slower depending on index performance. So, I don't have performance issues except where the index doesn't deliver. 2000-10000 reads/sec is competitive with commercial ISAM filesystems on the pc (assuming application is not running on the server), and it is far better than any other commercial ISAM emulation I've played with up to this point. Don't have concrete #s, but the server can deliver 2-3 times that in concurrency situations, in many cases the application performance is actually network bound...this is all on pc hardware. Of course, mainframes can do much, much better than this but that's really outside the scope of what I'm trying to do. So, things run pretty fast as they are, and keeping things running through queries keeps things generic and flexible. Also, queries consume zero resources on the server except for the time it takes to process and deal with them. Cursors, OTOH, have to be opened up for every table for every user. Cursors are read-only always, whereas views can be updated with rules. It's worth noting that other commercial ISAM emulation systems (for example Acu4GL by AcuCorp) cut queries on the fly as well, even when cursor options are available. If cursors became sensitive, they would be worth consideration. I've never really had 1000 large ones open at once, be interesting to see how that worked. In ideal object would be a 'pseudo cursor', an insensitive cursor shared by multiple users with each having their own record pointer. This might be better handled via middleware though (this might also give better options handling different locking scenarios). Merlin