Re: Index not used on group by - Mailing list pgsql-performance

From Андрей Репко
Subject Re: Index not used on group by
Date
Msg-id 4010683035.20050927143731@sart.must-ipra.com
Whole thread Raw
In response to Index not used on group by  (Andrey Repko <repko@sart.must-ipra.com>)
Responses Re: Index not used on group by
List pgsql-performance
Здравствуйте Richard,

Tuesday, September 27, 2005, 2:08:31 PM, Вы писали:


>> sart_ma=# EXPLAIN ANALYZE SELECT alias_id FROM ma_data GROUP BY alias_id;
>>                                                        QUERY PLAN
>>
-------------------------------------------------------------------------------------------------------------------------
>>  HashAggregate  (cost=38565.30..38565.62 rows=32 width=4)
>> (actual time=15990.863..15990.933 rows=32 loops=1)
>>    ->  Seq Scan on ma_data  (cost=0.00..37811.24 rows=301624
>> width=4) (actual time=3.446..14572.141 rows=301624 loops=1)
>>  Total runtime: 15991.244 ms

RH> OK - the planner thinks it's doing the right thing, your cost estimates
RH> are way off. If you look back at where you got an index-scan, it's cost
RH> was 1.1 million.
RH>    Index Scan using reference_9_fk on ma_data (cost=0.00..1139526.57
But why PG scan _all_ the records in the table? As I understand we can
"just" select information from index, not scaning all the table? Of
course if we select ALL records from table index can't help us.
If I write something like:
SELECT (SELECT  alias_id  FROM ma_data WHERE alias_id =1 LIMIT 1)
UNION ALL SELECT (SELECT  alias_id  FROM ma_data WHERE alias_id =2 LIMIT 1)
UNION ALL SELECT (SELECT  alias_id  FROM ma_data WHERE alias_id =3 LIMIT 1)
UNION ALL SELECT (SELECT  alias_id  FROM ma_data WHERE alias_id =4 LIMIT 1)
UNION ALL SELECT (SELECT  alias_id  FROM ma_data WHERE alias_id =5 LIMIT 1)
UNION ALL SELECT (SELECT  alias_id  FROM ma_data WHERE alias_id =6 LIMIT 1)
UNION ALL SELECT (SELECT  alias_id  FROM ma_data WHERE alias_id =7 LIMIT 1)
UNION ALL SELECT (SELECT  alias_id  FROM ma_data WHERE alias_id =8 LIMIT 1)
UNION ALL SELECT (SELECT  alias_id  FROM ma_data WHERE alias_id =9 LIMIT 1)
...
UNION ALL SELECT (SELECT  alias_id  FROM ma_data WHERE alias_id=max_alias_id LIMIT 1)
It works better, much better.

RH> That's way above the numbers for seq-scan+hash/sort, so if the cost
RH> estimate was right PG would be making the right choice. Looks like you
RH> need to check your configuration settings. Have you read:
RH>    http://www.powerpostgresql.com/PerfList
RH> or
RH>    http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
Thanks.


--
С наилучшими пожеланиями,
  Репко Андрей Владимирович       mailto:repko@sart.must-ipra.com


pgsql-performance by date:

Previous
From: Андрей Репко
Date:
Subject: Re: Index not used on group by
Next
From: Gnanavel S
Date:
Subject: Re: PostgreSQL overall design