RE: SQL schema and query optimisation for fast cross-table query execution - Mailing list pgsql-sql

From Mike Sofen
Subject RE: SQL schema and query optimisation for fast cross-table query execution
Date
Msg-id 0f9f01d5d381$f8216e00$e8644a00$@runbox.com
Whole thread Raw
In response to SQL schema and query optimisation for fast cross-table query execution  (Brice André <brice@famille-andre.be>)
List pgsql-sql

From: Brice André <brice@famille-andre.be>  Sent: Saturday, January 25, 2020 12:49 AM
wrote:

 

The current DB schema is as follows:
user table : this table records all users registered on the web-site
   - user_id
   - ...
challenge table: this table records all ongoing challenges
   - challenge_id
   - ...
user_challenge_association table : this table records which user is registered to which challenge
   - user_id
   - challenge_id
activity table : this table records all activities of the users
   - user_id
   - date
   - ...
   
a naive implementation of my query could be something like this:

SELECT * FROM activity WHERE user_id IN (SELECT user_id FROM user_challenge_association WHERE challenge_id IN (SELECT challenge_id FROM user_challenge_association WHERE user_id = ||current_user||)) ORDER BY date

Brice

---------------------------------------------------------------

 

Brice, you are correct, “IN” clauses are horrific performers because they turn every member of the IN into an OR, so the larger your IN list, the slower your query runs. 

 

But every IN can be replaced with a join, and that leverages the power of the relational engine.  With standard indexes on the primary and foreign keys, this will be extremely fast and scalable to many hundreds of millions of rows.

 

Taking your query and rewriting it with joins:

SELECT u.user_name, c.challenge_name, a.activity_name, a.activity_date

FROM activity a

Join  user_challenge_association uca on (user_id)

Join challenge c on (challenge_id)

Join user u on (user_id)

Where u.user_id = ||current_user||)) ORDER BY date

 

Mike Sofen

pgsql-sql by date:

Previous
From: Brice André
Date:
Subject: SQL schema and query optimisation for fast cross-table query execution
Next
From: Viral Shah
Date:
Subject: pg_dump fails when a table is in ACCESS SHARE MODE