Home > mailing lists

SELECT DISTINCT scans the table? - Mailing list pgsql-general

From	Markus Demleitner
Subject	SELECT DISTINCT scans the table?
Date	December 16, 2021 11:00:14
Msg-id	20211216110014.27xcbjfs2rkpjz7t@victor Whole thread
Responses	Re: SELECT DISTINCT scans the table?
List	pgsql-general

Tree view

Dear list,

This feels like a FAQ, but neither the postgres docs nor web searches
got me a single step towards some sort of insight.

Maximally stripped down, my problem is that

  select distinct 300 from <bigtable>

seqscans <bigtable> (at least in PostgreSQL 11.14).  To me, it seems
obvious that this ought be be just one row containing 300 once
Postgres has established that <bigtable> is nonempty.

Why do things not work like this?  Am I missing something major?

The reason I'm interested in this is of course a bit more involved.
I have a view that looks somewhat like this:

  CREATE VIEW a_view AS (
    SELECT 'abc' as coll, ...
    FROM table1
  UNION
    SELECT 'def' as coll, ...
    FROM table2
  UNION
    SELECT coll, ...
    FROM table3
  ...)

and so on for perhaps 50 tables; where, as for table3 in this
example, the coll column is not simply a constant, there is an index
on the source column (these then are actually fast).

I'd now like to be able to say

  SELECT DISTINCT coll FROM a_view

-- which takes forever once some of the tables involved are
sufficiently large.

I'd assume in an ideal world the query would be essentally
instantaneous -- merging the constants and doing a few index scans
for the tables that have non-constant coll.  In reality, it's not.
The tables with constant coll are all completely scanned, either
sequentially or index-only.

Assuming this is the expected behaviour: Is there perhaps some trick
I could use to make postgres use the information that there are
constants in the select clauses?

Thanks,

         Markus

pgsql-general by date:

From: Matt Magoffin
Date: 16 December 2021, 02:11:12
Subject: Re: Properly handling aggregate in nested function call

From: Bryn Llewellyn
Date: 16 December 2021, 19:21:56
Subject: Re: Why can't I have a "language sql" anonymous block?

SELECT DISTINCT scans the table? - Mailing list pgsql-general

Previous

Next