Re: Postmaster processes running out of control? - Mailing list pgsql-general

From Nigel J. Andrews
Subject Re: Postmaster processes running out of control?
Msg-id Pine.LNX.4.21.0203221346350.6141-100000@ponder.fairway2k.co.uk
In response to Re: Postmaster processes running out of control?  (Martijn van Oosterhout <kleptog@svana.org>)
Responses Re: Postmaster processes running out of control?  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-general
On Fri, 22 Mar 2002, Martijn van Oosterhout wrote:

> On Fri, Mar 22, 2002 at 01:11:11AM +0000, Nigel J. Andrews wrote:
> > Good point, I hadn't tried it since the upgrade because that wasn't why I
> > upgraded (don't worry I've got a _long_ post on that subject waiting to be
> > sent), ...
> > [snipped]
> > ...
> > the EXPLAIN output because it's not interesting and it would almost be
> > an overlap with the contents of my long, pending post.
>
> Hmm, with something that matches most of the table, it would likely choose a
> sequential scan, which can take quite a while over a large table.

Yep, doing a seqscan does take a while and I can see why it does one most of
the time. This is the subject of my long, pending post, which is no longer
pending as I have recently sent it to the list.

> >[snipped]
> >
> If you're selecting on one column (poster_name) and sorting on another
> (time) it may help to have an index on both (time,poster_name) since that
> avoids the sort step. (I hope 7.2 estimates sort costs better than earlier
> versions).

Thank you, it didn't occur to me that the two-column index would ease the
sorting. Would the order of the columns specified in the index creation be
significant?
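For concreteness, a sketch of the kind of index being discussed. The table and
exact column names are my assumption (the thread only mentions the columns
poster_name and time); the usual rule of thumb is that the column tested for
equality comes first, followed by the column you sort on:

```sql
-- Sketch only: "posts" is a hypothetical table name; the thread
-- mentions columns poster_name (filtered on) and time (sorted on).
CREATE INDEX posts_poster_time_idx ON posts (poster_name, "time");

-- With the equality column first, a query like this can read the
-- index in "time" order for the matching poster, skipping the sort:
SELECT *
FROM posts
WHERE poster_name = 'some poster'
ORDER BY "time";
```

So yes, the order should matter: with the equality-tested column leading, the
index entries for one poster are already in time order.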

> However, if you really want the whole output (rather than say the first 100
> lines) and that really involves trawling a majority of the table, then you
> are simply bound by disk transfer speed.

I have coded the application to use all the data, caching the pages (it's a web
site) that weren't requested but that use the results of the query, e.g. pages
2 to xxx of the results when page 1 has been requested by the user. I'm happy
this way since the query hit is taken only once instead of several dozen times,
and I have a fancy caching system that fits in nicely with the data update
frequency. In fact, that was one of the driving forces behind the caching
system. The I/O is definitely a bottleneck: there's almost continuous disk
activity, with a change in its 'character' marking the switch from backend
query processing to page cache filling.

>
> Alternativly, using a cursor may allow you to start outputting data before
> the query has finished.

At the moment the application just issues the query and waits for the results.
I had considered changing this, more to reduce the memory usage within the
client than as a way of speeding things up. I think such a change makes the
split between content and presentation harder to maintain, though; possible,
but harder, or shall we say more sophisticated.
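For reference, the cursor approach Martijn suggests looks roughly like this.
Again the table and column names are my assumption, carried over from the
earlier discussion; in PostgreSQL a cursor must be declared inside a
transaction:

```sql
BEGIN;

-- Declare a cursor over the (hypothetical) query instead of
-- running it to completion up front:
DECLARE results CURSOR FOR
    SELECT *
    FROM posts
    WHERE poster_name = 'some poster'
    ORDER BY "time";

-- Fetch and start rendering the first batch while the rest of
-- the result set is still pending on the server:
FETCH 50 FROM results;
-- ...further FETCH 50 FROM results; calls as pages are needed...

CLOSE results;
COMMIT;
```

This keeps client memory bounded by the batch size rather than the full result
set, at the cost of holding the transaction open while pages are fetched.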


--
Nigel J. Andrews
Logictree Systems Limited

