Re: Handling data in libpq - Mailing list pgsql-interfaces

From am@fx.ro
Subject Re: Handling data in libpq
Date
Msg-id 20021019184725.A313@coto
Whole thread Raw
In response to Re: Handling data in libpq  ("Jeroen T. Vermeulen" <jtv@xs4all.nl>)
Responses Re: Handling data in libpq
List pgsql-interfaces
First of all, thanks for your answer.

On Sat, Oct 19, 2002 at 01:01:36PM +0200, Jeroen T. Vermeulen wrote:
> On Sat, Oct 19, 2002 at 01:23:52PM +0300, am@fx.ro wrote:
> 
> > I'm looking for a new approach regarding fetching the data.
> > Usage of cursors is probably the path to follow. I would be deeply 
> > grateful to anyone who would share a bit of his/her experience:
> > 
> > - i assume the table has 50000 tuples
> > - what is better:  to fetch one tuple at one time, or to fetch
> >   a certain number of tuples (let's say 100).
> 
> Obviously there is some query and communications overhead associated with
> fetching anything, regardless of size, and the odds are it's going to be 
> larger than the cost of sending a tuple of actual data across.  So if
> you're going to pay that overhead anyway, you'll want to get some decent
> amount of data for it.

Hmm.. Good point.  
So, you're saying that fetching one tuple at one time is out of discussion. 

> However, that also implies that fetching 500 blocks of 100 tuples each is 
> always going to be slower than getting 50000 tuples in one go. The two
> advantages of fetching smaller blocks are: (1) you can start processing
> some data right away, which might be useful for e.g. user interaction
> (or you can use asynchronous communication so you can process your last
> block of data while fetching the next); 

Well, the program doesn't do automatically data processing: 
it only allows the user do see and modify the data.

>and (2) you may be running out of
> memory when fetching and processing 50000 tuples at once.  Swapping can
> be a real drag on performance.

There is also another aspect: usually the user will not
browse all of those tuples ( the program allows the user to specify
custom WHERE and ORDER BY clauses, so that he/she can select only 
the interesting rows ; and only sometimes will scroll all tuples ).

The conclusion is that the number of rows fetched at one time 
should be chosen carefully. 

> > - if i fetch 100 tuples, PQntuples returns 100? how do i know how
> >   many rows has the table?  
> 
> Well, if you assume your table has 50000 rows, why do you need to ask?

This assumption is only for that table. I want that my program
works well even for larger tables ( most of the program's classes
and functions are independent from the actual table, and will be 
moved to a library for later use).

> Seriously though, you don't get this information.  But if an estimate
> is good enough, you can "select count(*) from table" before you start
> fetching data.

I see: the returned number is exact as long as another user doesn't 
add/delete tuples after the 'select count(*)' and before the fetch.
I thought there was a way to find out how many tuples would return
a FETCH ALL, without executing that command.
> > - i have the impression that keeping track of the tuple's index i
> >   nside the RecordSet and the index inside the whole table is quite 
> >   tricky, but manageable. How do you guys handle this kind of situations?
> 
> It's not so hard.  Just an extra nested loop.

Adrian Maier
(am@fx.ro)


pgsql-interfaces by date:

Previous
From: "Jeroen T. Vermeulen"
Date:
Subject: Re: Handling data in libpq
Next
From: "John L. Turner"
Date:
Subject: Re: Handling data in libpq