Re: difficulties with time based queries - Mailing list pgsql-performance

From Rainer Mager
Subject Re: difficulties with time based queries
Date
Msg-id 003501c9b656$a9c73fe0$fd55bfa0$@com
Whole thread Raw
In response to Re: difficulties with time based queries  (David Wilson <david.t.wilson@gmail.com>)
Responses Re: difficulties with time based queries  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: difficulties with time based queries  (Matthew Wakeling <matthew@flymine.org>)
List pgsql-performance
Thanks for all the replies, I'll try to address the follow up questions:

> From: David Wilson [mailto:david.t.wilson@gmail.com]
>
> The stats look good and it's using a viable index for your query. What
> kind of hardware is this on, and what are the relevant postgresql.conf
> lines? (Or, for that matter, what does iostat say while this query's
> running?)

I'm running on Windows, so I don't have iostat, but perfmon tells me my Avg.
Disk Queue Length went up to 1.2 during the query (versus a normal value of
about 0.02). Also disk throughput was at about 1.2 MB/s during the query. I
don't know how much of this is random versus linear.



> From: PFC [mailto:lists@peufeu.com]
>
>     With this quantity of rows, you want to try to make the disk
> accesses as
> linear as possible.
>     This means your table should be organized on disk by date, at
> least
> roughly.
>     If your data comes from an import that was sorted on some other
> column,
> this may not be the case.
>
>     What kind of bytes/s do you get from the drives ?

The data should be mostly ordered by date. It is all logged in semi-realtime
such that 99% will be logged within an hour of the timestamp. Also, as
stated above, during this query it was about 1.2 MB/s, which I know isn't
great. I admit this isn't the best hardware in the world, but I would expect
better than that for linear queries.

>     Do you UPDATE or DELETE a lot from this table ? Is it vacuum'd
> enough ?

No, this table has no UPDATEs or DELETEs. It is auto vacuum'd, but no manual
vacuuming.

In regards to clustering, I'm hesitant to do that unless I have no other
choice. My understanding is that I would need to do periodic re-clustering
to maintain it, and during that time the table is very busy.


> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Hmm ... it's pretty unusual to see the index fetch portion of a bitmap
> scan take the bulk of the runtime.  Usually that part is fast and where
> the pain comes is in fetching from the heap.   I wonder whether that
> index has become bloated.  How big are the table and the index
> physically?  (Look at pg_class.relpages, or if you want a really
> accurate number try pg_relation_size().)

Can you give me some more info on how to look at these stats? That is,
what/where is pg_class.relpages, etc. I'll also do some searching for this
info.

> What Postgres version is this, exactly?

8.3.3

> BTW, I think you've gone way overboard in your indexing of this table;
> those indexes are certainly consuming well more space than the table
> does, and a lot of them are redundant.

Agreed, I need to look carefully at all of the queries we do on this table
and reduce this.



--Rainer


pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: difficulties with time based queries
Next
From: Tom Lane
Date:
Subject: Re: difficulties with time based queries