Re: Why doesn't Postgres apply limit on groups when retrieving N results per group using WHERE IN + ORDER BY - Mailing list pgsql-general

From Olivier Gautherot
Subject Re: Why doesn't Postgres apply limit on groups when retrieving N results per group using WHERE IN + ORDER BY
Date
Msg-id CAJ7S9TV2=_wBe1k4Pn8Pmadw7WBD-SE9u5+iwT3OsXCRVri1hg@mail.gmail.com
Whole thread Raw
In response to Re: Why doesn't Postgres apply limit on groups when retrieving N results per group using WHERE IN + ORDER BY  (Sean v <sean@vanmulligen.ca>)
List pgsql-general


El mié, 7 feb 2024 8:07, Sean v <sean@vanmulligen.ca> escribió:

Exactly. I'm really just trying to understand if there's some functional limitation to it being able to do that with how it executes these types of queries, or if its just an optimization that hasn't been built into the query planner yet.

I know I can get it to do precisely this if I use a CROSS JOIN LATERAL:

SELECT o.*
FROM company_users cu
CROSS JOIN LATERAL (   SELECT *   FROM orders o   WHERE o.user_id = company_users.user_id   ORDER  BY created_at DESC LIMIT 50   ) cu
WHERE  cu.company_id = ? 
ORDER BY created_at DESC LIMIT 50

That makes sense to me, it forces a nested loop and executes for each user. But doing a nested select like the query below doesn't use the index or limit the results to 50 per user - even though it does a nested loop just like the lateral join does:

SELECT "orders".* 
FROM "orders" 
WHERE user_id IN (SELECT user_id FROM company_users WHERE company_id = ?)
ORDER BY "orders"."created_at" LIMIT 50

Joins will generally query the whole tables, leading to long run times. Have you tried to preselect the rows of interest with a "WITH ... SELECT ..." query to reduce the amount of data processed?


On 2024-02-05 7:58 a.m., David G. Johnston wrote:
On Mon, Feb 5, 2024 at 8:55 AM Ron Johnson <ronljohnsonjr@gmail.com> wrote:

Who knows which users are going to be in that list???


It doesn't matter.  Worse case scenario there is only one user in the result and so all 50 rows are their earliest 50 rows.  The system will thus never need more than the earliest 50 rows per user to answer this question.

David J.

Cheers
Olivier

pgsql-general by date:

Previous
From: Sean v
Date:
Subject: Re: Why doesn't Postgres apply limit on groups when retrieving N results per group using WHERE IN + ORDER BY
Next
From: Carsten Klein
Date:
Subject: Generic File Access Function to read program output