Re: Query planner question - Mailing list pgsql-general

From David G Johnston
Subject Re: Query planner question
Date
Msg-id 1408588014359-5815661.post@n5.nabble.com
Whole thread Raw
In response to Query planner question  (Soni M <diptatapa@gmail.com>)
Responses Re: Query planner question
List pgsql-general
Soni M wrote
> Hi Everyone,
>
> I have this query :
>
> select t.ticket_id ,
> tb.transmission_id
> from ticket t,
> transmission_base tb
> where t.latest_transmission_id = tb.transmission_id
> and t.ticket_number = tb.ticket_number
> and tb.parse_date > ('2014-07-31');
>
> Execution plan: http://explain.depesz.com/s/YAak
>
> Indexes on ticket :
>     "ticket_pkey" PRIMARY KEY, btree (ticket_id) CLUSTER
>     "ticket_by_latest_transmission" btree (latest_transmission_id)
>     "ticket_by_ticket_number" btree (ticket_number)
>
> This query only returns some portions of rows from ticket table.
> The question is, Why does postgres need to get all the rows from ticket
> table in order to complete this query?
> Can't postgres use indexes to get only needed rows on ticket table?
>
> I try set seqscan to off, but still index scan try to get all rows on
> ticket table.
> Here's the execution plan : http://explain.depesz.com/s/abH2

Short answer: you haven't defined "(latest_transmission_id, ticket_number)"
as being a foreign key onto the transmission_base table yet you seem to want
it to act like one.

Because of this failure the planner considers the following:

Nested Looping over 380,000 records is going to suck so it tries some
advanced "merge/join" techniques to try and speed things up.  In any such
alternative the entire ticket table needs to be considered since there is no
constraint provided for that table - the only constraint in on
transmission_base and it rightly is using an index to find records matching
the where clause.

Since ticket_number and latest_transmission_id are found in separate indexes
I do not believe the planner can make use of an Index Only scan to fulfill
the join so each index lookup would require a corresponding heap lookup
which means extra work compared to just sequentially scanning the heap in
the first place.  Since it is going to hit the entire thing in either case
the sequential scan is the logical choice for it to make.

Others will correct any factual mistakes I may have made - I am theorizing
here and do not understand the planner sufficient well to be 100% certain
that an FK definition will solve the problem.

David J.





--
View this message in context: http://postgresql.1045698.n5.nabble.com/Query-planner-question-tp5815659p5815661.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


pgsql-general by date:

Previous
From: Soni M
Date:
Subject: Query planner question
Next
From: Jov
Date:
Subject: Re: Linux replication to FreeBSD problem