Re: JOIN issues (Left vs Right for sorting), and "Nested Loop" problem - Mailing list pgsql-general

From Tom Lane
Subject Re: JOIN issues (Left vs Right for sorting), and "Nested Loop" problem
Date
Msg-id 16772.1188660207@sss.pgh.pa.us
Whole thread Raw
In response to Re: JOIN issues (Left vs Right for sorting), and "Nested Loop" problem  ("Phoenix Kiula" <phoenix.kiula@gmail.com>)
List pgsql-general
"Phoenix Kiula" <phoenix.kiula@gmail.com> writes:
> On 01/09/07, Alban Hertroys <alban@magproductions.nl> wrote:
>> Is 10 ms problematic for this query?

> I think you got 10ms from the query plan? These queries are very fast
> after they have been executed once. But the first time is huge.
> Sometimes I have to wait as much as 10 seconds (10,000ms?)

It's to be expected that repeating the same query would be faster, since
all the data will have been pulled from disk and be sitting in cache.
In this query you're fetching about 700 rows from random locations on
the disk, so if none of them are in memory already there's likely to be
700 seeks done.  Seek times in the range of 10ms are not unusual for
cheap disks ... you do the math.

Solutions include buying faster disks, or buying more RAM so more of
your data can stay in cache.  If your queries are very stylized (like
always using the same index) then you might get somewhere by CLUSTERing
on that index to reduce the number of seeks needed, but this is seldom
a solution that fixes everything.

>>> So my question: how can I do a left
>>> join, which is the logic that I wish to accomplish, but get the
>>> sorting to work from the second table and if a column is null then
>>> just return as 0 instead of nothing at all? (The LEFT JOIN used to
>>> work in Mysql).

>> You could use ORDER BY COALESCE(tradecount.u_count, 0) desc if you
>> want it to behave like you say mysql sorted it.

I got curious about this assertion and went to check it.  AFAICT mysql
doesn't have any weird automatic coalesce involved in sorting.  The
difference is that they sort nulls first, rather than last as we do:

mysql> select * from t1 left join t2 using(f1) order by t2.f2 ;
+----+------+
| f1 | f2   |
+----+------+
|  3 | NULL |
|  1 |   11 |
|  2 |   22 |
+----+------+
3 rows in set (0.00 sec)

mysql> select * from t1 left join t2 using(f1) order by t2.f2 desc;
+----+------+
| f1 | f2   |
+----+------+
|  2 |   22 |
|  1 |   11 |
|  3 | NULL |
+----+------+
3 rows in set (0.00 sec)

Same data in PG yields:

regression=# select * from t1 left join t2 using(f1) order by t2.f2 ;
 f1 | f2
----+----
  1 | 11
  2 | 22
  3 |
(3 rows)

regression=# select * from t1 left join t2 using(f1) order by t2.f2 desc;
 f1 | f2
----+----
  3 |
  2 | 22
  1 | 11
(3 rows)

Both behaviors are legal per spec (it's "implementation defined"
which is the ordering, according to the SQL standard).

As of PG 8.3 there will be NULLS FIRST and NULLS LAST options so that
you can get either ordering, but no released version has these:

regression=# select * from t1 left join t2 using(f1) order by t2.f2 nulls first;
 f1 | f2
----+----
  3 |
  1 | 11
  2 | 22
(3 rows)

            regards, tom lane

pgsql-general by date:

Previous
From: Alban Hertroys
Date:
Subject: Re: JOIN issues (Left vs Right for sorting), and "Nested Loop" problem
Next
From: Richard Broersma Jr
Date:
Subject: Re: Export data to MS Excel