JOIN issues (Left vs Right for sorting), and "Nested Loop" problem - Mailing list pgsql-general

From Phoenix Kiula
Subject JOIN issues (Left vs Right for sorting), and "Nested Loop" problem
Date
Msg-id e373d31e0709010246qa72cd21ne8ca167380b0c1bf@mail.gmail.com
Whole thread Raw
Responses Re: JOIN issues (Left vs Right for sorting), and "Nested Loop" problem
List pgsql-general
Hello,

I have a simple query as follows. It joins two very straightforward tables.


SELECT
  trades.id,
  trades.url,
  trades.alias,
  tradecount.t_count,
  tradecount.u_count
FROM trades
LEFT JOIN tradecount ON trades.id = tradecount.id
WHERE trades.user_id = 'jondoe' and trades.status = 'Y'
ORDER BY
  tradecount.u_count desc
OFFSET 20 LIMIT 10


Both the tables have a bigint "id" field that connects them. The table
definitions are included below:




                                Table "public.trades"

        Column         |            Type             |          Modifiers
-----------------------+-----------------------------+------------------------------
 id                    | bigint                      | not null
 user_id               | character varying(45)       | not null
 url                   | text                        | not null
 alias                 | character varying(20)       | not null
 title                 | character varying(500)      |
 private               | character(1)                |
 status                | character(1)                | default 'Y'::bpchar
 modify_date           | timestamp without time zone |
 disable_in_statistics | character(1)                | not null
default 'N'::bpchar
Indexes:
    "trades_pkey" PRIMARY KEY, btree (id)
    "trades_unique_alias" UNIQUE, btree (alias)
    "idx_trades_mdate" btree (modify_date)
    "idx_trades_userid" btree (user_id)
Check constraints:
    "trades_alias_valid" CHECK (alias::text ~ '[-A-Za-z0-9_]'::text)
    "trades_id_check" CHECK (id > 0)
    "trades_url_check" CHECK (url <> ''::text)
    "trades_user_id_check" CHECK (user_id::text <> ''::text)





                    Table "public.tradecount"

    Column    |            Type             |     Modifiers
--------------+-----------------------------+--------------------
 id           | bigint                      | not null
 t_count  | integer                     | not null default 0
 u_count | integer                     | not null default 0
 modify_date  | timestamp without time zone | default now()
Indexes:
    "tradecount_pkey" PRIMARY KEY, btree (id)
    "i_tradecount_uc" btree (u_count)
    "i_tradecount_vc" btree (t_count)
Foreign-key constraints:
    "fk_tradecount_trades_id" FOREIGN KEY (id) REFERENCES trades(id)
ON DELETE CASCADE
Rules:
    replace_tradecount_on_duplicate_insert AS
    ON INSERT TO tradecount
   WHERE (EXISTS ( SELECT 1
           FROM tradecount
          WHERE tradecount.id = new.id)) DO INSTEAD  UPDATE tradecount
SET t_count = tradecount.t_count, u_count = tradecount.u_count
  WHERE tradecount.id = new.id




Now I have two problems:


1. The above query takes more time to fire up that an index should
really take. I have bitmap heap scan off in conf file, and indexscan
on, otherwise this was going into a bitmap heap thing.

As you will see from the SQL above, the trades.user_id index should be
limiting the number of rows to a few hundred (or thousand at max) and
then we are trying to get only 10 tuples based on the OFFSET and LIMIT
clauses.

However, there's a nested loop in there as the EXPLAIN ANALYZE shows
below. What is causing this nested loop?



QUERY PLAN

------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=4829.70..4829.73 rows=10 width=125) (actual
time=9.784..9.835 rows=10 loops=1)
   ->  Sort  (cost=4829.65..4830.61 rows=385 width=125) (actual
time=9.703..9.757 rows=30 loops=1)
         Sort Key: tradecount.u_count
         ->  Nested Loop Left Join  (cost=0.00..4813.12 rows=385
width=125) (actual time=0.075..8.662 rows=386 loops=1)
               ->  Index Scan using idx_trades_userid on trades
(cost=0.00..1556.08 rows=385 width=117) (actual time=0.05
0..1.225 rows=386 loops=1)
                     Index Cond: ((user_id)::text = 'jondoe'::text)
                     Filter: (status = 'Y'::bpchar)
               ->  Index Scan using tradecount_pkey on tradecount
(cost=0.00..8.45 rows=1 width=16) (actual time=0.006.
.0.008 rows=1 loops=386)
                     Index Cond: (trades.id = tradecount.id)
 Total runtime: 9.963 ms
(10 rows)




2. Secondly, if I want to sort the join by a column on the second
table, then the rows returned are not really sorted unless I do a
RIGHT JOIN (my sql above shows a LEFT JOIN). Getting results from a
right join is fine as long as the column is not null in the second
table, but if it is null, then nothing is returned. This is why I do a
LEFT join in the first place! So my question: how can I do a left
join, which is the logic that I wish to accomplish, but get the
sorting to work from the second table and if a column is null then
just return as 0 instead of nothing at all? (The LEFT JOIN used to
work in Mysql).


TIA for any thoughts!

pgsql-general by date:

Previous
From: "Phoenix Kiula"
Date:
Subject: Re: Export data to MS Excel
Next
From: Ron Johnson
Date:
Subject: Re: Export data to MS Excel