Home > mailing lists

Re: BUG #16993: Query Planner does not use index for EXISTS-query on a large table with all equal foreign key values - Mailing list pgsql-bugs

From	David Rowley
Subject	Re: BUG #16993: Query Planner does not use index for EXISTS-query on a large table with all equal foreign key values
Date	May 6, 2021 02:13:04
Msg-id	CAApHDvp89MJGczOeNNypYFJZ2WCRdR7LN=amDzCpeSZk=N1Qtg@mail.gmail.com Whole thread
In response to	Re: BUG #16993: Query Planner does not use index for EXISTS-query on a large table with all equal foreign key values (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: BUG #16993: Query Planner does not use index for EXISTS-query on a large table with all equal foreign key values
List	pgsql-bugs

Tree view

On Thu, 6 May 2021 at 05:47, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> A narrower fix would be to hack var_eq_non_const so that it doesn't
> assume that the comparison value must be one of the entries in the
> column.  But it seems like whatever change we made in that line would
> be a very unprincipled hack, because what are you going to assume
> instead?

Yeah, this is the same problem as I was mentioning in [1]

My solution was to go for that "unprincipled hack" in var_eq_non_const().

I'm not sure I 100% agree that it's a complete hack, you don't really
have to change the n_distinct by much to get the good plan.  It's a
massive risk to assume that the given value will *always* be the
single distinct value that's indexed.

# SELECT * FROM data_node WHERE EXISTS (SELECT 1 FROM data_entry WHERE
node_fk = data_node.id);
Time: 4807.956 ms (00:04.808)
# alter table data_entry alter column node_fk set (n_distinct = 2);
# analyze data_entry;
# SELECT * FROM data_node WHERE EXISTS (SELECT 1 FROM data_entry WHERE
node_fk = data_node.id);
Time: 3.930 ms

I just feel like it's a huge risk to reject an index path of a column
with 1 distinct value with the assumption that the value that's going
to be looked up *is* that 1 distinct value.  If the index lookup is
done on any of the other 2^64-1 values (in this case) then the index
path would be a *major* win when compared to a seqscan path. The risk
to reward ratio of what we do now is outrageous.

David

[1] https://www.postgresql.org/message-id/CAApHDvpbJHwMZ1U-nzU0kBxu0kwMpBvyL+AFWvFAmurypSo1SQ@mail.gmail.com

pgsql-bugs by date:

From: PG Bug reporting form
Date: 06 May 2021, 00:40:46
Subject: BUG #16995: Need repository key to access old distributions from https://apt-archive.postgresql.org/

From: Tom Lane
Date: 06 May 2021, 03:28:35
Subject: Re: BUG #16993: Query Planner does not use index for EXISTS-query on a large table with all equal foreign key values

Re: BUG #16993: Query Planner does not use index for EXISTS-query on a large table with all equal foreign key values - Mailing list pgsql-bugs

Previous

Next