Re: Functionally dependent columns in SELECT DISTINCT - Mailing list pgsql-general

From	Willow Chargin
Subject	Re: Functionally dependent columns in SELECT DISTINCT
Date	September 13 21:26:53
Msg-id	CAALRJs5QkE-yWVdxQxOLXbM8DyPVV3M_=zfsPK17SV-zHQsASQ@mail.gmail.com
In response to	Re: Functionally dependent columns in SELECT DISTINCT (shammat@gmx.net)
Responses	Re: Functionally dependent columns in SELECT DISTINCT
List	pgsql-general

On Thu, Sep 12, 2024 at 11:13 PM <shammat@gmx.net> wrote:
>
> What about using DISTINCT ON () ?
>     SELECT DISTINCT ON (items.id) items.*
>     FROM items
>       JOIN parts ON items.id = parts.item_id
>     WHERE part_id % 3 = 0
>     ORDER BY items.id,items.create_time DESC
>     LIMIT 5;
>
> This gives me this plan: https://explain.depesz.com/s/QHr6 on 16.2  (Windows, i7-1260P)

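Ordering by items.id changes the answer, though. DISTINCT ON requires
items.id to lead the ORDER BY, so LIMIT 5 returns the matching items
with the five lowest IDs rather than the five newest ones. In the
example I gave, items.id and items.create_time happened to be in the
same order, but that needn't hold. I really do want the ID columns of
the *most recent* items.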

You can see the difference if you build the test dataset a bit
differently:

    INSERT INTO items(id, create_time)
        SELECT i, now() - make_interval(secs => random() * 1e6)
        FROM generate_series(1, 1000000) s(i);

We want the returned create_times to be all recent, and the IDs now
should look roughly random.
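
As a rough sketch of the intended semantics only (not a claim about
which plan is fastest), one way to keep DISTINCT ON while still ranking
by recency is to deduplicate in a subquery and sort afterwards. The
schema is assumed from the quoted query: items(id, create_time) and
parts(part_id, item_id), with part_id assumed to be a column of parts.

```sql
-- Sketch under the assumptions above; table and column names come from
-- the quoted query, and the alias "matching" is illustrative.
SELECT matching.id, matching.create_time
FROM (
    -- One row per item that has at least one matching part.
    -- DISTINCT ON needs items.id to lead the ORDER BY here.
    SELECT DISTINCT ON (items.id) items.id, items.create_time
    FROM items
      JOIN parts ON items.id = parts.item_id
    WHERE parts.part_id % 3 = 0
    ORDER BY items.id
) AS matching
-- Rank the deduplicated items by recency, then keep the newest five.
ORDER BY matching.create_time DESC
LIMIT 5;
```

With the randomized create_times above, this should return five recent
timestamps with IDs scattered across the range, whereas the quoted
query returns the lowest matching IDs regardless of age.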
