Re: Performance of "distinct with limit" - Mailing list pgsql-general

From Klaudie Willis
Subject Re: Performance of "distinct with limit"
Date
Msg-id KK_HX4z5BF8lq32xzIFQnnBgcFgV9m7egEHQc0EHzfhhBw4Ir5TTqlE-MLbD7-C_WXY6wwzx2dKRU6LG6TY2r80zgWjhe5iyeclfMIRXdls=@protonmail.com
In response to Re: Performance of "distinct with limit"  (luis.roberto@siscobra.com.br)
Responses Re: Performance of "distinct with limit"
List pgsql-general
No, there is no index on n. An index might solve it, yes, but this seems to me like a trivial optimization even without one. Evidently it is not.

QUERY PLAN                                                                        |
----------------------------------------------------------------------------------|
Limit  (cost=1911272.10..1911272.12 rows=2 width=7)                               |
  ->  HashAggregate  (cost=1911272.10..1911282.45 rows=1035 width=7)              |
        Group Key: cfi                                                            |
        ->  Seq Scan on bigtable  (cost=0.00..1817446.08 rows=37530408 width=7)   |
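
The plan above puts a HashAggregate between the Limit and the Seq Scan, and a hash aggregate reads its whole input before it returns the first group, so the Limit cannot stop the scan early. As a rough illustration of the difference being asked about, here is a minimal Python sketch. It is not PostgreSQL executor code; `CountingScan`, `distinct_limit_full_aggregate`, and `distinct_limit_early_stop` are hypothetical names invented for this example.

```python
from typing import Hashable, Iterable, Iterator, List


class CountingScan:
    """Hypothetical stand-in for a sequential scan that counts rows read."""

    def __init__(self, rows: List[Hashable]) -> None:
        self.rows = rows
        self.rows_read = 0

    def __iter__(self) -> Iterator[Hashable]:
        for row in self.rows:
            self.rows_read += 1
            yield row


def distinct_limit_full_aggregate(rows: Iterable[Hashable], limit: int) -> List[Hashable]:
    """Build every group first, then apply the limit (blocking behaviour)."""
    groups = dict.fromkeys(rows)  # consumes the entire input
    return list(groups)[:limit]


def distinct_limit_early_stop(rows: Iterable[Hashable], limit: int) -> List[Hashable]:
    """Emit each new value as soon as it is seen and stop at the limit."""
    result: List[Hashable] = []
    if limit <= 0:
        return result
    seen = set()
    for value in rows:
        if value not in seen:
            seen.add(value)
            result.append(value)
            if len(result) >= limit:
                break  # no need to read the remaining rows
    return result


data = ["a", "a", "b", "c"] * 1000

blocking_scan = CountingScan(data)
blocking_result = distinct_limit_full_aggregate(blocking_scan, 2)

streaming_scan = CountingScan(data)
streaming_result = distinct_limit_early_stop(streaming_scan, 2)

print(blocking_result, blocking_scan.rows_read)
print(streaming_result, streaming_scan.rows_read)
```

Both functions return the same two values here, but the blocking version reads all 4000 rows while the early-stop version reads only 3. How much the early stop saves on a real table depends on how soon distinct values show up in scan order.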



Klaudie

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Friday, August 28, 2020 1:59 PM, <luis.roberto@siscobra.com.br> wrote:

Hi, 

If "n" is indexed, it should run quickly. Can you share the execution plan for your query?




From: "Klaudie Willis" <Klaudie.Willis@protonmail.com>
To: "pgsql-general" <pgsql-general@lists.postgresql.org>
Sent: Friday, August 28, 2020 8:29:58
Subject: Performance of "distinct with limit"

Hi,

Ran into this under-optimized query execution.

select distinct n from bigtable;          -- Let's say this takes 2 minutes
select distinct n from bigtable limit 2;  -- This takes approximately the same time

However, the latter should have the potential to be so much quicker.  I checked the same query on MSSQL (with 'top 2'), and it seems to do exactly the optimization I would expect. 

Is there any way to achieve a similar speedup in PostgreSQL?

Klaudie


