Re: Maybe a strange question, but: "How long may a cursor live?" - Mailing list pgsql-general

From Tim Kientzle
Subject Re: Maybe a strange question, but: "How long may a cursor live?"
Date
Msg-id 3A0B4B31.46BCE831@acm.org
Whole thread Raw
In response to Maybe a strange question, but: "How long may a cursor live?"  (Christian Fritze <The.Finn@sprawl.de>)
Responses Re: Re: Maybe a strange question, but: "How long may a cursor live?"  (Alfred Perlstein <bright@wintelcom.net>)
List pgsql-general
A couple of ideas for you:

Experiment with doing the GROUP BY within your code.
Depending on a number of factors, it's sometimes faster.

Experiment with doing the ORDER BY within your code.
I've seen several cases where pulling the data into
memory and sorting there was much, much faster than
having the database do the sorting.

Experiment with moving the text_cat and func1 operations
into your Java code (rather than asking the DB to do them).
This may or may not help, depending on a number of factors.

The startup cost you're seeing is basically PG generating
all of the results, GROUP BY, and ORDER BY and tucking
the final answer away somewhere temporarily.  The cursor
then just steps through this temporary table.

You'd probably gain a lot by building such a temporary
table explicitly.  That is, create a new "summary" table
and periodically do the SELECT below and put the results
into the temporary table.  (You can even use a separate
program outside of your web application that's run from 'cron')
That way, you avoid the overhead of summarizing 80,000 results
on every query.  I've used this approach to provide rapid access
to web site logs (tens of millions of records summarized down
to a few thousand entries).

You mention that table1 has 80000 rows but didn't mention how
many rows there were after the aggregation (that is, how
many distinct values of textattrib2 there were).  For high-performance
web applications, I've been just storing full data results in
memory.  You've talked about using a bean to keep a DB cursor
around; why not just store the results?  This looks a lot like
a simple SortedMap from textattrib2 to textattrib1/count pairs.
That's easy to store in memory.  And it's hard to beat the speed.

            - Tim Kientzle


Christian Fritze <The.Finn@sprawl.de> writes:
>
> I'm working on a web based application (using gnuJSP / JDBC)
> that needs to do queries like
>
> SELECT count(textattrib1), text_cat(textattrib1 || ' '), func1(textattrib2)
> FROM table1 WHERE textattrib2 >= 'foo' GROUP BY textattrib2 ORDER BY
> textattrib2 ASC LIMIT somelimit;
>
> with table1 holding roughly 80000 rows (probably growing some day),
> textattrib2 is indexed of course.
>
> Unfortunately the query above is intolerably slow (up to 30 seconds or
> so, 2 would be OK...).

pgsql-general by date:

Previous
From: Gunnar R|nning
Date:
Subject: Re: Full text searching?
Next
From: Alfred Perlstein
Date:
Subject: Re: Re: Maybe a strange question, but: "How long may a cursor live?"