Re: Controlling complexity in queries - Mailing list pgsql-general

From Jay Levitt
Subject Re: Controlling complexity in queries
Date
Msg-id 4EE97375.4080203@gmail.com
Whole thread Raw
In response to Re: Controlling complexity in queries  (Alban Hertroys <haramrae@gmail.com>)
List pgsql-general
Alban Hertroys wrote:
>> select questions.id from questions
>> join (
>>  select u.id from users as u
>>  group by u.id
>> ) as s
>> on s.id = questions.user_id
>> where questions.id = 1;

> You could write that as:
>
> select questions.id
> from questions as q
> where exists (select 1 from users as u where u.id = q.user_id)
> and questions.id = 1;
>
> That's basically what you are doing, checking that a user with a given id
 > from the questions table exists in the users table.
 >
 > That said, wouldn't a foreign key constraint help you even better? If
 > questions.user_id is required to refer to an existing users.id (by an FK
 > constraint), than the check in the query becomes moot.

Ahh, I see.. yes, this query is just the smallest possible query that
exhibits the same not-using-the-index behavior as the real query, which
needs columns from both questions and users, and thus needs the join.  (And
it has aggregates, and needs the GROUP BY too.) There already is a
constraint, questions.user_id always refers to a real users.id, etc.

This is actually a great case where relational thinking does NOT map well to
functional composability; as Tom Lane pointed out, the solution is just "add
the WHERE clause to the subquery too."  But the subquery is in a function
that doesn't *know* it's being restricted, and (to me) shouldn't have to
know; that's what the optimizer does for a living.

FWIW, and this may help the OP, my plan for tackling the "but I want
readability AND performance" issue is to

1. write a monolithic, optimized, incomprehensible version of the query
2. maintain the pretty functions alongside it
3. Write unit tests that confirm that the output of #1 and #2 is identical.

Kinda like how gcc builds gcc and verifies that the output is the same as
gcc building gcc building gcc.

Jay

pgsql-general by date:

Previous
From: Edson Richter
Date:
Subject: Re: Philosophical question
Next
From: Chris Angelico
Date:
Subject: Re: Philosophical question