Re: Performance problem with semi-large tables - Mailing list pgsql-performance

From: Ken Egervari
Subject: Re: Performance problem with semi-large tables
Date:
Msg-id: 002201c50654$1a8c50b0$cd422418@a96dfxb4kjzogw
In response to: Performance problem with semi-large tables ("Ken Egervari" <ken@upfactor.com>)
Responses: Re: Performance problem with semi-large tables
List: pgsql-performance
> Well, postgres does what you asked. It will be slow, because you have a
> full table join. LIMIT does not change this because the rows have to be
> sorted first.

I am aware that LIMIT doesn't really affect the execution time all that
much.  It does speed up the ORM though, and it keeps the result set to a
manageable size so users don't have to look at thousands of rows, which is
good enough for me.  My intention was for the date to act as a good filter.
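
As an aside, for the date to actually work as a filter, the planner needs
an index it can walk.  A minimal sketch, assuming shipment_status.date is
the sort column and no such index exists yet (the index name is made up):

-- Hypothetical index on the sort column; a btree on date can be
-- scanned backward to satisfy ORDER BY date DESC LIMIT 100 without
-- sorting the whole table.
CREATE INDEX shipment_status_date_idx ON shipment_status (date);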

> The date is in shipment_status, so you should first get the
> shipment_status.id values that you need and then join to shipment.  This
> will avoid the big join:
>
>
> SELECT s.*, ss.* FROM
> (SELECT * FROM shipment_status WHERE release_code_id IN
> (SELECT r.id FROM release_code r WHERE r.filtered_column = '5')
> ORDER BY date DESC LIMIT 100
> ) AS ss, shipment s
> WHERE s.current_status_id = ss.id
> ORDER BY ss.date DESC LIMIT 100
>
> Is this better ?

This looks like it might be what I want.  It's not that I wasn't aware of
the correct join order.  I used Dan Tow's diagramming method and learned
that filtering on date first is the best approach, then release code, then
finally shipment for this particular query.  I just didn't know how to tell
PostgreSQL to do this.
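
For reference, I can verify the order the planner actually picks with
EXPLAIN ANALYZE, which runs the query and reports the chosen plan along
with real timings.  A sketch using the query from above:

-- EXPLAIN ANALYZE executes the statement and prints the join order,
-- row counts, and timings the planner actually used.
EXPLAIN ANALYZE
SELECT s.*, ss.*
FROM (SELECT * FROM shipment_status
      WHERE release_code_id IN
            (SELECT r.id FROM release_code r WHERE r.filtered_column = '5')
      ORDER BY date DESC LIMIT 100) AS ss,
     shipment s
WHERE s.current_status_id = ss.id
ORDER BY ss.date DESC LIMIT 100;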

So are you suggesting, as a general rule, that sub-queries are the way to
force a specific join order in Postgres?  If that is the case, I will do
this from now on.
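
In case sub-queries aren't the only option: as I understand it, PostgreSQL
also honors the textual join order when join_collapse_limit is lowered.  A
minimal sketch, assuming explicit JOIN syntax and a version that supports
the setting:

-- With join_collapse_limit = 1, the planner keeps explicit JOINs in
-- the order they are written instead of reordering them.
SET join_collapse_limit = 1;
SELECT s.*, ss.*
FROM shipment_status ss
JOIN shipment s ON s.current_status_id = ss.id
WHERE ss.release_code_id IN
      (SELECT r.id FROM release_code r WHERE r.filtered_column = '5')
ORDER BY ss.date DESC LIMIT 100;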

