Re: Reading data in bulk - help? - Mailing list pgsql-performance

From: Chris Huston
Subject: Re: Reading data in bulk - help?
Date:
Msg-id: B82EC107-E3D1-11D7-AAEC-000393011B1A@bangjafwac.com
In response to: Re: Reading data in bulk - help?  ("Magnus Naeslund(w)" <mag@fbab.net>)
List: pgsql-performance
On Wednesday, Sep 10, 2003, at 11:16 America/Denver, Josh Berkus wrote:

> What about explaining your disk setup?  Or mentioning postgresql.conf?
> For somebody who wants help, you're ignoring a lot of advice and questions.
>
> Personally, I'm not going to be of any further help until you report
> back on the other 3 of 4 options.

EEEK! Peace. Sorry I didn't include that info in the response.

1) Memory - clumsily adjusted shared_buffers - tried three values: 64,
128, 256 with no discernible change in performance. Also adjusted,
clumsily, effective_cache_size to 1000, 2000, 4000 - with no
discernible change in performance. I looked at the Admin manual and
googled around for how to set these values and I confess I'm clueless
here. I have no idea how many kernel disk page buffers are in use, nor
do I understand what the "shared memory buffers" are used for (although
the postgresql.conf file hints that they're for communication between
multiple connections). Any advice or pointers to articles/docs is
appreciated.
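
For concreteness, here's the kind of thing I've been poking at in
postgresql.conf. These numbers are only illustrative guesses - both
settings are counted in 8KB pages on my version, I believe - not values
I've verified are right for this box:

    # postgresql.conf - illustrative starting points only, not tuned
    shared_buffers = 4096           # ~32MB of shared buffer cache (8KB pages)
    effective_cache_size = 65536    # ~512MB; rough guess at the kernel's disk cache

If anyone can confirm whether that's even the right ballpark I'd
appreciate it.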

2) Clustering - tried it - definite improvement - thanks for the tip

3) RAID - haven't tried it - but I'm guessing that the speed
improvement from a RAID 5 may be on the order of 10x - which I can
likely get from using something like HDF. Since the data is unlikely to
grow beyond 10-20 GB, a fast drive and FireWire ought to give me the
performance I need. I know experimentally that the current machine can
sustain a 20 MB/s transfer rate, which is 20-30x the speed of these
queries. (If there's any concern about my enthusiasm for postgres - no
worries - I've been very happy with it on several projects - it might
not be the right tool for this kind of job - but I haven't come to that
conclusion yet.)

4) I'd previously commented out the output/writing steps from the app -
to isolate read performance.

On Wednesday, Sep 10, 2003, at 05:47 America/Denver, Magnus Naeslund(w)
wrote:
>
> How are you fetching the data?
> If you are using cursors, be sure to fetch a substantial bit at a time
> so that you're not punished by latency.
> I got a big speedup when I changed my original clueless code to fetch
> 64 rows in a go instead of only one.
That's an excellent question... I hadn't thought about it. I'm using a
JDBC connection... I have no idea (yet) how the results are moving
between postgres and the client app. I'm testing once with the app and
the DB on the same machine (to remove network latency) and once with
db/app on separate machines. However, I wonder if postgres is blocking
on network I/O (even if it's the loopback interface) and not on disk?!
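
For reference, here's roughly what I expect the batched-fetch version
to look like on the JDBC side. This is only a sketch: the table name,
connection string and fetch size are placeholders, and I haven't yet
verified that the driver version I'm running actually honors
setFetchSize() rather than pulling the whole result set into client
memory:

    // Sketch only - table name and connection details are made up.
    import java.sql.*;

    public class BulkReadTest {
        public static void main(String[] args) throws Exception {
            Class.forName("org.postgresql.Driver");
            Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/testdb", "user", "password");

            // Turning autocommit off and setting a fetch size is what
            // should let the driver fetch rows in batches instead of
            // materializing the entire result set on the client.
            conn.setAutoCommit(false);
            Statement stmt = conn.createStatement();
            stmt.setFetchSize(64);   // fetch 64 rows per round trip

            ResultSet rs = stmt.executeQuery("SELECT * FROM datapoints");
            while (rs.next()) {
                // process each row here
            }

            rs.close();
            stmt.close();
            conn.commit();
            conn.close();
        }
    }

The 64 comes straight from Magnus's suggestion above; I'll experiment to
see what fetch size actually helps here.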

I'll definitely look into it. Maybe I'll try a loop in psql and see
what the performance looks like. Thanks Magnus.

On Wednesday, Sep 10, 2003, at 07:05 America/Denver, Sean McCorkle
wrote:

>     I ended up solving the problem by going "retro" and using the
>     quasi-database functions of unix and flat files:  grep, sort,
>     uniq and awk.
That's a cool KISS approach. If I end up moving out of postgres I'll
speed test this approach against HDF. Thanks.


This is a very helpful list,
- Chris

