Re: overzealous sorting? - Mailing list pgsql-performance

From Marc Cousin
Subject Re: overzealous sorting?
Date
Msg-id 20110926113901.38fdf892@marco-dalibo
Whole thread Raw
In response to overzealous sorting?  (anthony.shipman@symstream.com)
Responses Re: overzealous sorting?
List pgsql-performance
Le Mon, 26 Sep 2011 16:28:15 +1000,
anthony.shipman@symstream.com a écrit :

> In Mammoth Replicator (PG 8.3) I have a table described as
>
>                                        Table "public.tevent_cdr"
>      Column     |           Type           |
> Modifiers
> ----------------+--------------------------+------------------------------------------------------------
> event_id       | integer                  | not null default
> nextval(('event_id_seq'::text)::regclass) timestamp      | timestamp
> with time zone | not null classification | character varying        |
> not null area           | character varying        | not null
>  kind           | character varying        |
>  device_id      | integer                  |
>  device_name    | character varying        |
>  fleet_id       | integer                  |
>  fleet_name     | character varying        |
>  customer_id    | integer                  |
>  customer_name  | character varying        |
>  event          | text                     |
> Indexes:
>     "tevent_cdr_event_id" UNIQUE, btree (event_id)
>     "tevent_cdr_timestamp" btree ("timestamp")
> Check constraints:
>     "tevent_cdr_classification_check" CHECK (classification::text
> = 'cdr'::text)
> Inherits: tevent
>
>
> This simple query puzzles me. Why does it need to sort the records?
> Don't they come from the index in order?
>
>  "explain analyze select * from tevent_cdr where timestamp >=
> '2011-09-09 12:00:00.000000+0' and timestamp < '2011-09-09
> 13:00:00.000000+0' and classification = 'cdr' order by timestamp;"
>
> QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>   Sort  (cost=9270.93..9277.12 rows=2477 width=588) (actual
> time=9.219..11.489 rows=2480 loops=1)
>     Sort Key: "timestamp"
>     Sort Method:  quicksort  Memory: 2564kB
>     ->  Bitmap Heap Scan on tevent_cdr  (cost=57.93..9131.30
> rows=2477 width=588) (actual time=0.440..3.923 rows=2480 loops=1)
>           Recheck Cond: (("timestamp" >= '2011-09-09
> 22:00:00+10'::timestamp with time zone) AND ("timestamp" <
> '2011-09-09 23:00:00+10'::timestamp with time zone))
>           Filter: ((classification)::text = 'cdr'::text)
>           ->  Bitmap Index Scan on tevent_cdr_timestamp
> (cost=0.00..57.31 rows=2477 width=0) (actual time=0.404..0.404
> rows=2480 loops=1)
>                 Index Cond: (("timestamp" >= '2011-09-09
> 22:00:00+10'::timestamp with time zone) AND ("timestamp" <
> '2011-09-09 23:00:00+10'::timestamp with time zone))
>   Total runtime: 13.847 ms
> (9 rows)

Because Index Scans are sorted, not Bitmap Index Scans, which builds a
list of pages to visit, to be then visited by the Bitmap Heap Scan step.

Marc.

pgsql-performance by date:

Previous
From: Craig Ringer
Date:
Subject: Re: [PERFORMANCE] Insights: fseek OR read_cluster?
Next
From: Antonio Rodriges
Date:
Subject: Re: [PERFORMANCE] Insights: fseek OR read_cluster?