Re: SELECT DISTINCT scans the table? - Mailing list pgsql-general

From Markus Demleitner
Subject Re: SELECT DISTINCT scans the table?
Date
Msg-id 20211217091937.mmrvld6cgnteybdb@victor
Whole thread Raw
In response to Re: SELECT DISTINCT scans the table?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Dear Tom,

On Thu, Dec 16, 2021 at 03:47:57PM -0500, Tom Lane wrote:
> Markus Demleitner <msdemlei@ari.uni-heidelberg.de> writes:
> > Maximally stripped down, my problem is that
> 
> >   select distinct 300 from <bigtable>
> 
> > seqscans <bigtable> (at least in PostgreSQL 11.14).  To me, it seems
> > obvious that this ought be be just one row containing 300 once
> > Postgres has established that <bigtable> is nonempty.
> 
> That seems like the sort of optimization that we absolutely should
> not spend cycles looking for.  If it were a trivial change consuming
> no detectable number of planning cycles, maybe it would be worth the
> development and maintenance effort; though I'd be dubious about the
> value.  But the fact that it'd have to be transformed into something
> testing whether the table is nonempty makes it fairly nontrivial.
> I doubt it's worth the development cost plus the cycles imposed
> on every other query.

I certainly understand that reasoning for this particular example.
However, in my actual use case, the one with the view consisting of a
large union containing constants from the original mail, 

  CREATE VIEW a_view AS (
    SELECT 'abc' as coll, ...
    FROM table1   -- with perhaps 1e6 rows
  UNION
    SELECT 'def' as coll, ...
    FROM table2   -- with perhaps another 1e7 rows
  UNION
    SELECT coll, ... -- with an index on table3.coll
    FROM table3
  ...)

being able to factor out constants would make a difference of
milliseconds versus a long time (~ a minute in my case, with about
1e8 total rows) when running SELECT DISTINCT coll FROM a_view.

Is there, perhaps, a non-obvious way to give the planner a nudge to
exploit the constant-ness of coll in table1 and table2?

Thanks,

           Markus



pgsql-general by date:

Previous
From: Laurenz Albe
Date:
Subject: Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
Next
From: "holistic.dev"
Date:
Subject: Reg. static SQL code analysis tool for PostgreSQL