difficulties with time based queries - Mailing list pgsql-performance

From Rainer Mager
Subject difficulties with time based queries
Date
Msg-id 003001c9b645$e8a7cec0$b9f76c40$@com
Whole thread Raw
Responses Re: difficulties with time based queries  (David Wilson <david.t.wilson@gmail.com>)
Re: difficulties with time based queries  (PFC <lists@peufeu.com>)
Re: difficulties with time based queries  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance

I have a somewhat large table (more than 100 million rows) that contains log data with start_time and end_time columns. When I try to do queries on this table I always find them slower than what I need and what I believe should be possible.

 

For example, I limited the following query to just a single day and it still is much slower than what I would expect. In reality I need to do queries that span a few weeks.

 

explain analyze select * from ad_log where date(start_time) < date('2009-03-31') and date(start_time) >= date('2009-03-30');

 

 

Bitmap Heap Scan on ad_log  (cost=73372.57..3699152.24 rows=2488252 width=32) (actual time=49792.862..64611.255 rows=2268490 loops=1)

   Recheck Cond: ((date(start_time) < '2009-03-31'::date) AND (date(start_time) >= '2009-03-30'::date))

   ->  Bitmap Index Scan on ad_log_date_all  (cost=0.00..72750.51 rows=2488252 width=0) (actual time=49776.332..49776.332 rows=2268490 loops=1)

         Index Cond: ((date(start_time) < '2009-03-31'::date) AND (date(start_time) >= '2009-03-30'::date))

 Total runtime: 65279.352 ms

 

 

The definition of the table is:

 

   Column   |            Type             |                         Modifiers

------------+-----------------------------+------------------------------------------------------------

 ad_log_id  | integer                     | not null default nextval('ad_log_ad_log_id_seq'::regclass)

 channel    | integer                     | not null

 player     | integer                     | not null

 ad         | integer                     | not null

 start_time | timestamp without time zone |

 end_time   | timestamp without time zone |

Indexes:

    "ad_log_pkey" PRIMARY KEY, btree (ad_log_id)

    "ad_log_unique" UNIQUE, btree (channel, player, ad, start_time, end_time)

    "ad_log_ad" btree (ad)

    "ad_log_ad_date" btree (ad, date(start_time))

    "ad_log_channel" btree (channel)

    "ad_log_channel_date" btree (channel, date(start_time))

    "ad_log_date_all" btree (date(start_time), channel, player, ad)

    "ad_log_player" btree (player)

    "ad_log_player_date" btree (player, date(start_time))

Foreign-key constraints:

    "ad_log_ad_fkey" FOREIGN KEY (ad) REFERENCES ads(id)

    "ad_log_channel_fkey" FOREIGN KEY (channel) REFERENCES channels(id)

    "ad_log_player_fkey" FOREIGN KEY (player) REFERENCES players_history(id)

Triggers:

    rollup_ad_logs_daily AFTER INSERT ON ad_log FOR EACH ROW EXECUTE PROCEDURE rollup_ad_logs_daily()

 

 

Any suggestions would be appreciated.

 

--Rainer

pgsql-performance by date:

Previous
From: Lists
Date:
Subject: Re: Best replication solution?
Next
From: David Wilson
Date:
Subject: Re: difficulties with time based queries