Re: selects from large tables - Mailing list pgsql-performance

From Robert Treat
Subject Re: selects from large tables
Date
Msg-id 1037645312.21501.11.camel@camel
Whole thread Raw
In response to Re: selects from large tables  (Nikk Anderson <Nikk.Anderson@parallel.ltd.uk>)
List pgsql-performance
You might want to check out 7.3 while your at it.  It's currently
planned to be released around Dec 1st, which might fit in nicely with
your upgrade schedule.

Robert Treat

On Mon, 2002-11-18 at 11:36, Nikk Anderson wrote:
> Hi Tom,
>
> Yes, we should upgrade to 7.2 soon, its just that as it is a live system
> running 24x7 we are careful about upgrading core components so we do not
> disrupt our data collection agents too much.
>
> Here is some table info, we currently index by time then ID.  Generally,
> data will be selected by ID, then time range.  Clustering may help on
> this.
>
>
>     Attribute    |           Type           | Modifier
> -----------------+--------------------------+----------
>  job_id          | integer                  | not null
>  server_id       | integer                  | not null
>  time            | timestamp with time zone | not null
>  availability    | boolean                  |
>  connection_time | integer                  |
>  dns_setup       | integer                  |
>  server_response | integer                  |
>  frontpage_size  | integer                  |
>  frontpage_time  | integer                  |
>  transfer_size   | integer                  |
>  transfer_time   | integer                  |
>  error_id        | integer                  |
>  redirect_time   | integer                  |
>  polling_id      | integer                  | not null
> Indices: http_result_pk,
>          http_timejobid
>
> Thanks
>
> Nikk
>
>
> -----Original Message-----
> From: Tom Lane [ mailto:tgl@sss.pgh.pa.us <mailto:tgl@sss.pgh.pa.us> ]
> Sent: 18 November 2002 16:25
> To: Charles H. Woloszynski
> Cc: Nikk Anderson; 'Stephan Szabo'; pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] selects from large tables
>
>
> "Charles H. Woloszynski" <chw@clearmetrix.com> writes:
> > Are you doing vaccums on these tables?  I was under the understanding
> > that the estimated row count should be close to the real row count
> > returned, and when it is not (as it looks in your case), the primary
> > reason for the disconnect is that the stats for the tables are
> > out-of-date.
>
> The fact that he's using 7.1 doesn't help any; the statistics mechanisms
>
> in 7.1 are pretty weak compared to 7.2.
>
> > Also, do you do any clustering of the data (since the queries are
> mostly
> > time limited)?  I am wondering if the system is doing lots of seeks to
>
> > get the data (implying that the data is all over the disk and not
> > clustered).
>
> It would also be interesting to try a two-column index ordered the other
>
> way (timestamp as the major sort key instead of ID).  Can't tell if that
>
> will be a win without more info about the data properties, but it's
> worth looking at.
>
>                         regards, tom lane
>




pgsql-performance by date:

Previous
From: Nikk Anderson
Date:
Subject: Re: selects from large tables
Next
From: Tom Lane
Date:
Subject: Re: for/loop performance in plpgsql ?