Re: Help with query: indexes on timestamps - Mailing list pgsql-general

From	Keith C. Perry
Subject	Re: Help with query: indexes on timestamps
Date	July 7, 2004 18:22:30
Msg-id	1089224340.40ec3e946cf70@webmail.vcsn.com
In response to	Re: Help with query: indexes on timestamps (Richard Huxton <dev@archonet.com>)
List	pgsql-general

Quoting Richard Huxton <dev@archonet.com>:

> Keith C. Perry wrote:
> >
> > I have a table with 1 million records in it.  Here is the definition:
> >
> > CREATE TABLE report
> > (
> >   match int4,
> >   action varchar(16),
> >   stamp timestamptz,
> >   account varchar(32),
> >   ipaddress inet,
> >   profile varchar(16),
> >   rating text,
> >   url text
> > )
> > WITHOUT OIDS;
> >
> > There is one index:
> >
> > CREATE INDEX stamp_idx
> >   ON report
> >   USING btree
> >   (stamp);
> >
> > The query I'm running is:
> >
> >  SELECT date_part('hour'::text, report.stamp) AS "hour", count(*) AS count
> >    FROM report
> >   GROUP BY date_part('hour'::text, report.stamp)
> >   ORDER BY date_part('hour'::text, report.stamp);
>
> You will always get a sequential scan with this query - there is no
> other way to count the rows.
>
> With PostgreSQL being MVCC based, you can't know whether a row is
> visible to you without checking it - visiting the index won't help. Even
> if it could, you'd still have to visit every row in the index.
>
> Assuming the table is a log, with always increasing timestamps, I'd
> create a summary table and query that.
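
A minimal sketch of what such a summary table could look like. The
names report_hourly, bucket and hits are hypothetical, not something
from the thread, and it assumes rows are only ever appended, so the
table can be filled once and then topped up after each monthly load:

```sql
-- Hypothetical summary table: one row per hour-long bucket of the log.
CREATE TABLE report_hourly
(
  bucket timestamptz PRIMARY KEY,  -- start of the hour
  hits   int8 NOT NULL             -- rows of report falling in that hour
);

-- Fill it once from the existing data (one sequential scan of report).
-- Rows with a NULL stamp are skipped here; the original query would
-- have reported them as a NULL hour.
INSERT INTO report_hourly (bucket, hits)
SELECT date_trunc('hour', stamp), count(*)
  FROM report
 WHERE stamp IS NOT NULL
 GROUP BY date_trunc('hour', stamp);

-- The hour-of-day report then reads the small table instead of report.
SELECT date_part('hour', bucket) AS "hour", sum(hits) AS count
  FROM report_hourly
 GROUP BY date_part('hour', bucket)
 ORDER BY date_part('hour', bucket);
```

With two months of data that is on the order of 1,500 summary rows
instead of 1 million.  Both date_trunc and date_part on a timestamptz
use the session time zone, so the fill and the report should be run
with the same setting.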

Yeah, it is actually a proxy server log; each month of data is about 500k
records.  I have two months loaded only to put some stress on the server.  So
every month I load the data just so I can do some analysis.  The optimization
question came up when one of the other database folks wanted to play with the
database in MS-SQL Server.

How can I add a column that represents a function that returns just the
date_part?  I'm wondering if that will increase the speed of the query the way
it did in MS-SQL.
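
One way this might be done, as a rough sketch only: store the hour in
an ordinary column and fill it in with an UPDATE.  The column name
stamp_hour is made up for illustration, and any later load would have
to set it as well.

```sql
-- Hypothetical column holding the precomputed hour of day (0-23).
ALTER TABLE report ADD COLUMN stamp_hour int2;

-- Backfill the existing rows from the timestamp.
UPDATE report SET stamp_hour = date_part('hour', stamp)::int2;

-- The report query then groups on the stored value.
SELECT stamp_hour AS "hour", count(*) AS count
  FROM report
 GROUP BY stamp_hour
 ORDER BY stamp_hour;
```

This only saves evaluating date_part for every row; the count still
has to visit every row of the table, so a sequential scan is still
expected.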

I hadn't thought about the MVCC vs. file locking issue.  The MS-SQL server does
not have any load on it, and I'm sure that if other users were hitting the same
table with the same query, PG would perform better.

--
Keith C. Perry, MS E.E.
Director of Networks & Applications
VCSN, Inc.
http://vcsn.com

____________________________________
This email account is being hosted by:
VCSN, Inc : http://vcsn.com
