Re: Performance problem with semi-large tables - Mailing list pgsql-performance

From PFC
Subject Re: Performance problem with semi-large tables
Date
Msg-id opsld0ul0nth1vuj@musicbox
Whole thread Raw
In response to Re: Performance problem with semi-large tables  ("Ken Egervari" <ken@upfactor.com>)
List pgsql-performance
> So are you suggesting as a general rule then that sub-queries are the
> way to force a specific join order in postgres?  If that is the case, I
> will do this from now on.

    I'll try to explain a bit better...
    Here's your original query :

> select s.*, ss.*
> from shipment s, shipment_status ss, release_code r
> where s.current_status_id = ss.id
>    and ss.release_code_id = r.id
>    and r.filtered_column = '5'
> order by ss.date desc
> limit 100;

    If you write something like :

SELECT * FROM shipment_status WHERE release_code_id = constant ORDER BY
release_code_id DESC, date DESC LIMIT 100;

    In this case, if you have an index on (release_code_id, date), the
planner will use a limited index scan which will yield the rows in index
order, which will be very fast.

    However, if you just have an index on date, this won't help you.
    In your case, moreover, you don't use release_code_id = constant, but it
comes from a join. So there may be several different values for
release_code_id ; thus the planner can't use the optimization, it has to
find the rows with the release_code_id first. And it can't use the index
on (release_code_id, date) to get the rows in sorted order precisely
because there could be several different values for the release_code_id.
And then it has to sort by date.

    I hope this makes it clearer. If you are absolutely sure there is only
one row in release_code with r.filtered_column = '5', then this means
release_code_id is a constant and your query could get a huge speedup by
writing it differently.

pgsql-performance by date:

Previous
From: "Ken Egervari"
Date:
Subject: Re: Performance problem with semi-large tables
Next
From: "Ken Egervari"
Date:
Subject: Re: Performance problem with semi-large tables