how do functions affect query plan? - Mailing list pgsql-performance

From 常超
Subject how do functions affect query plan?
Date
Msg-id BAY174-W47F6408BADFE4011010B7F84360@phx.gbl
Whole thread Raw
Responses Re: how do functions affect query plan?
List pgsql-performance
Hi,all
I have a table to save received measure data.


CREATE TABLE measure_data
(
  id serial NOT NULL,
  telegram_id integer NOT NULL,
  measure_time timestamp without time zone NOT NULL,
  item_id integer NOT NULL,
  val double precision,
  CONSTRAINT measure_data_pkey PRIMARY KEY (id)
);

CREATE INDEX index_measure_data_telegram_id ON measure_data USING btree (telegram_id);


in my scenario,a telegram contains measure data for multiple data items and timestamps,
BTW,another table is for telegram.

The SQL I used in my application is
  select * from measure_data where telegram_id in(1,2,...,n)
and this query used the index_measure_data_telegram_id index,as expected.

In order to see the performance of my query ,
I used the following query to search the measure data for randomly 30 telegrams.


explain analyze
SELECT md.*
  FROM measure_data md
  where telegram_id in
    (
         SELECT distinct
         trunc((132363-66484) * random() + 66484)
         FROM generate_series(1,30) as s(telegram_id)
    )
  ;

the 132363 and 66484 are the max and min of the telegram id,separately.

What surprised me is that index is not used,instead,a seq scan is performed on measure_data.
Although,intuitively,in this case,it is much wiser to use the index.
Would you please give some clue to why this happened?

"Hash Semi Join  (cost=65.00..539169.32 rows=10277280 width=28) (actual time=76.454..17177.054 rows=9360 loops=1)"
"  Hash Cond: ((md.telegram_id)::double precision = (trunc(((65879::double precision * random()) + 66484::double
precision))))"
"  ->  Seq Scan on measure_data md  (cost=0.00..356682.60 rows=20554560 width=28) (actual time=0.012..13874.809
rows=20554560loops=1)" 
"  ->  Hash  (cost=52.50..52.50 rows=1000 width=8) (actual time=0.062..0.062 rows=30 loops=1)"
"        Buckets: 1024  Batches: 1  Memory Usage: 2kB"
"        ->  HashAggregate  (cost=22.50..42.50 rows=1000 width=0) (actual time=0.048..0.053 rows=30 loops=1)"
"              ->  Function Scan on generate_series s  (cost=0.00..20.00 rows=1000 width=0) (actual time=0.020..0.034
rows=30loops=1)" 
"Total runtime: 17177.527 ms"



pgsql-performance by date:

Previous
From: Craig James
Date:
Subject: Stats collector constant I/O
Next
From: David G Johnston
Date:
Subject: Re: how do functions affect query plan?