[NOVICE] Queries with Joins before filtering taking too much time! Filter(where clause) *first* -- suggestions ? - Mailing list pgsql-novice

From Hursh Jain
Subject [NOVICE] Queries with Joins before filtering taking too much time! Filter(where clause) *first* -- suggestions ?
Date
Msg-id 587C09C1.20508@gmail.com
Whole thread Raw
Responses [NOVICE] Re: Queries with Joins before filtering taking too much time! Filter(where clause) *first* -- suggestions ?  (Hursh Jain <hurshj@gmail.com>)
List pgsql-novice
Hi:

Postgres 9.6.1 on a large linux box.

I have 2 tables under consideration:

1. property (about 800,000 rows)
This has a attribute called current_owner which is FK pointing to uid in
users table. This attribute can be null.

2. users (about 400 rows)
This has users in the system, each user has a uid.

=====
Query:

SELECT
*
FROM
property p
LEFT OUTER JOIN users ON (p.current_owner = users.uid)
WHERE
p.pid in
(SELECT pid FROM reward WHERE reward_type = 'daily'
ORDER BY reward_date DESC LIMIT 30)
;


This query takes a looong time. I tried an EXPLAIN ANALYSE and this
is what I got:

QUERY PLAN
------------------------------------------------------------------------------

Merge Cond: (users.uid = p.current_owner)
-> Index Scan using pk_users on users (cost=0.27..393.66 rows=414
width=198) (actual time=0.007..0.007 rows=1 loops=1)
-> Index Scan using idx_property_current_owner on property p
(cost=0.43..9889803.81 rows=3201653 width=775) (actual
time=10785.283..10785.2..83 rows=0 loops=1)
Filter: (SubPlan 1)
Rows Removed by Filter: 760651
SubPlan 1
-> Limit (cost=1.31..1.32 rows=1 width=16) (actual time=0.011..0.011
rows=0 loops=760651)
-> Sort (cost=1.31..1.32 rows=1 width=16) (actual time=0.008..0.008
rows=0 loops=760651)
Sort Key: reward.reward_date DESC
Sort Method: quicksort Memory: 25kB
-> Index Scan using idx_reward_reward_type on reward (cost=0.29..1.30
rows=1 width=16) (actual time=0.005..0.005 rows=0.. loops=760651)
Index Cond: (reward_type = 'daily'::text)
Planning time: 5.575 ms
Execution time: 10785.510 ms
---------------------------------

You can see the index scan on property is done first, like so:

Index Scan using idx_property_current_owner on property p
(cost=0.43..9889803.81 rows=3201653 width=775)
(actual time=10785.283..10785.2..83 rows=0 loops=1)

This take 10+ seconds. But this is wasted, since the number of property
rows are then filtered to less than 20 later on (via the WHERE clause).

Any suggestions on how to do the filtering FIRST and then do the join ?
I need to left outer join with users since the user (current_owner)
might be null.

Best,
--J



pgsql-novice by date:

Previous
From: Tom Lane
Date:
Subject: Re: [NOVICE] REVOKE SELECT column on FOREIGN TABLE
Next
From: Hursh Jain
Date:
Subject: [NOVICE] Re: Queries with Joins before filtering taking too much time! Filter(where clause) *first* -- suggestions ?