Re: Need help with one query - Mailing list pgsql-performance

From Robert Haas
Subject Re: Need help with one query
Date
Msg-id 603c8f070903201204s53f2a3cvbf9afc8649b2201a@mail.gmail.com
Whole thread Raw
In response to Re: Need help with one query  (Anne Rosset <arosset@collab.net>)
Responses Re: Need help with one query
Re: Need help with one query
List pgsql-performance
On Fri, Mar 20, 2009 at 1:16 PM, Anne Rosset <arosset@collab.net> wrote:
> Richard Huxton wrote:
>> Anne Rosset wrote:
>>> EXPLAIN ANALYZE
>>> SELECT
>>>  audit_change.id             AS id,
>>>  audit_change.audit_entry_id AS auditEntryId,
>>>  audit_entry.object_id       AS objectId,
>>>  audit_change.property_name  AS propertyName,
>>>  audit_change.property_type  AS propertyType,
>>>  audit_change.old_value      AS oldValue,
>>>  audit_change.new_value      AS newValue,
>>>  audit_change.flexfield      AS flexField
>>> FROM
>>>  audit_entry audit_entry, audit_change audit_change
>>> WHERE
>>>  audit_change.audit_entry_id = audit_entry.id
>>>  AND audit_entry.object_id = 'artf414029';
>>>
>>
>> [query reformatted to make it more readable]
>>
>> Not quite clear why you are aliasing the tables to their own names...
>>
>>
>>>
>>>
>>>
---------------------------------------------------------------------------------------------------------------------------------------------
>>>
>>> Hash Join  (cost=8.79..253664.55 rows=4 width=136) (actual
>>> time=4612.674..6683.158 rows=4 loops=1)
>>>  Hash Cond: ((audit_change.audit_entry_id)::text =
>>> (audit_entry.id)::text)
>>>  ->  Seq Scan on audit_change  (cost=0.00..225212.52 rows=7584852
>>> width=123) (actual time=0.009..2838.216 rows=7584852 loops=1)
>>>  ->  Hash  (cost=8.75..8.75 rows=3 width=45) (actual time=0.049..0.049
>>> rows=4 loops=1)
>>>       ->  Index Scan using audit_entry_object on audit_entry
>>> (cost=0.00..8.75 rows=3 width=45) (actual time=0.033..0.042 rows=4 loops=1)
>>>             Index Cond: ((object_id)::text = 'artf414029'::text)
>>> Total runtime: 6683.220 ms
>>>
>>
>> Very odd. It knows the table is large and that the seq-scan is going to
>> be expensive.
>>
>> Try issuing "set enable_seqscan = off" and run the explain analyse
>> again. That should show the cost of using the indexes.
>>
>>
>
> With "set enable_seqscan = off":
>
> QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------------------------
> Nested Loop (cost=11.35..12497.53 rows=59 width=859) (actual
> time=46.074..49.742 rows=7 loops=1)
> -> Index Scan using audit_entry_pk on audit_entry (cost=0.00..7455.95
> rows=55 width=164) (actual time=45.940..49.541 rows=2 loops=1)
> Filter: ((object_id)::text = 'artf1024'::text)
> -> Bitmap Heap Scan on audit_change (cost=11.35..90.93 rows=59 width=777)
> (actual time=0.086..0.088 rows=4 loops=2)
> Recheck Cond: ((audit_change.audit_entry_id)::text = (audit_entry.id)::text)
> -> Bitmap Index Scan on audit_change_entry (cost=0.00..11.33 rows=59
> width=0) (actual time=0.076..0.076 rows=4 loops=2)
> Index Cond: ((audit_change.audit_entry_id)::text = (audit_entry.id)::text)
> Total runtime: 49.801 ms
>
>
> The db version is 8.2.4
>
> We are wondering if it is because of our audit_entry_id's format (like
> 'adte1DDFEA5B011C8988C3928752').  Any inputs?
> Thanks,
> Anne

Something is wrong here.  How can setting enable_seqscan to off result
in a plan with a far lower estimated cost than the original plan?  If
the planner thought the non-seq-scan plan is cheaper, it would have
picked that one to begin with.

...Robert

pgsql-performance by date:

Previous
From: Frank Joerdens
Date:
Subject: Re: Full statement logging problematic on larger machines?
Next
From: Alvaro Herrera
Date:
Subject: Re: Need help with one query