Re: Clarification reqeusted for "select * from a huge table" - Mailing list pgsql-hackers

From Gokulakannan Somasundaram
Subject Re: Clarification reqeusted for "select * from a huge table"
Date
Msg-id 9362e74e0711120435r9e90968ib3514cc2d9786f00@mail.gmail.com
Whole thread Raw
In response to Re: Clarification reqeusted for "select * from a huge table"  (Richard Huxton <dev@archonet.com>)
Responses Re: Clarification reqeusted for "select * from a huge table"  ("Gokulakannan Somasundaram" <gokul007@gmail.com>)
Re: Clarification reqeusted for "select * from a huge table"  (Richard Huxton <dev@archonet.com>)
List pgsql-hackers
On Nov 12, 2007 5:25 PM, Richard Huxton <dev@archonet.com> wrote:
> Gokulakannan Somasundaram wrote:
> > Hi,
> >     I had a chance to test one of the real world cases with Oracle and
> > PostgreSQL. Create a Table with 10 million rows (i worked on a 1GB RAM
> > machine) both in oracle and Postgresql. Just write a JDBC program for
> > a 'select *' on that table. With PostgreSQL as backend, java crashes
> > saying that it has met 'Out Of Memory'. With Oracle it doesn't.
> >     Postgres tried to send all the results back to the client at one
> > shot, whereas in Oracle it works like a Cursor. Is this issue already
> > well known among hackers community? If known, why is it designed this
> > way?
>
> It returns everything because you've asked for it. If you wanted
> something that looks like a cursor, PG assumes you'll request a cursor.
>
> As to why, there are two reasons:
> 1. It's always been that way and changing it now would irritate most of
> the existing user-base.
JDBC API, as you might know has a rs.next() and rs.prev() to scan
backwards and forwards. The API looks more like a cursor.Currently,
all the results for Postgres is returned to the client during the
executeQuery(). This is more like the SAX / DOM argument. I just feel
JDBC APIs provide a feeling a SAX.

> 2. Repeat your test with 5,10,50,100 clients all running different big
> queries and see which puts a greater load on the server. PG favours
> supporting lots of clients by pushing the load onto them.
When we say Postgres pushes the load to the client, its a huge memory
overhead on the  client part. In oracle's scenario, the overhead is
just maintaining the state(say from which block the scan should be
continued). My point is that there are some operations, which are not
possible with postgres, whereas it is possible by an another database.
It would be better,, if we can support it. (There are some places
where DOM parsing is not possible and we prefer SAX)


>
> >     I also noticed that it doesn't crash with psql, but it takes a
> > long time to show the first set of records. It takes a long time, even
> > to quit after i pressed 'q'.
> >    With oracle SQLPlus, it is quite instantaneous.
>
> Again, you're measuring different things. What is the time to the *last*
> row?

I made this point, because people usually fire select * from table
query in the psql prompt to get a feel of the table. Ofcourse they can
fire select * from table limit 10; But i just feel its more key
strokes and it would be better, if it is interactive. I accept that it
is a debatable point and people can prefer otherwise to have more key
strokes.

Imagine, you need a large batch operation. In oracle we can fire the
SQL and we can be sure that the client won't crash, but with postgres
we have a region of uncertainity. There are some JDBC hints like
setFetchSize(), which actually affects Oracle's behaviour. But it
doesn't seem to do anything with postgres. But JDBC has declared these
commands as hint commands and has provided a warning to users, about
the fact that it may get ignored



-- 
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)


pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Clarification reqeusted for "select * from a huge table"
Next
From: "Gokulakannan Somasundaram"
Date:
Subject: Re: Clarification reqeusted for "select * from a huge table"