Thread: BUG #16993: Query Planner does not use index for EXISTS-query on a large table with all equal foreign key values
BUG #16993: Query Planner does not use index for EXISTS-query on a large table with all equal foreign key values
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 16993 Logged by: A Schnabl Email address: a.schnabl@synedra.com PostgreSQL version: 13.2 Operating system: Windows 10 Description: Hello, I can reproduce a query planner problem on a fresh default installation of postgres-13.2 on my Windows 10 machine. I also saw it occur on on postgres-12.6 installations on Windows 10 and CentOS 8. I used the following table setup: CREATE TABLE data_node (id BIGINT PRIMARY KEY); CREATE TABLE data_entry (id BIGINT PRIMARY KEY, node_fk BIGINT NOT NULL, FOREIGN KEY (node_fk) REFERENCES data_node (id)); CREATE INDEX node_ix ON data_entry USING BTREE (node_fk); INSERT INTO data_node (id) SELECT generate_series(1,10); INSERT INTO data_entry (id, node_fk) SELECT s, 2 FROM generate_series(1,10000000) s; VACUUM ANALYZE data_node; VACUUM ANALYZE data_entry; With the following query, I filter all entries of the data_node table (only a single with the data given here) which are referenced by the data_entry table: SELECT * FROM data_node WHERE EXISTS ( SELECT 1 FROM data_entry WHERE node_fk = data_node.id ); The query takes a few seconds to complete, although I expect it to be much faster (as shown below, it can be performed in less than a millisecond) due to the node_ix index. EXPLAIN ANALYZE shows the following query plan: Merge Join (cost=179053.41..179054.24 rows=1 width=8) (actual time=2140.820..2140.822 rows=1 loops=1) Merge Cond: (data_node.id = data_entry.node_fk) -> Index Only Scan using data_node_pkey on data_node (cost=0.14..8.29 rows=10 width=8) (actual time=0.006..0.010 rows=3 loops=1) Heap Fetches: 0 -> Sort (cost=179053.27..179053.28 rows=1 width=8) (actual time=2140.805..2140.806 rows=1 loops=1) Sort Key: data_entry.node_fk Sort Method: quicksort Memory: 25kB -> HashAggregate (cost=179053.25..179053.26 rows=1 width=8) (actual time=2140.796..2140.796 rows=1 loops=1) Group Key: data_entry.node_fk Batches: 1 Memory Usage: 24kB -> Seq Scan on data_entry (cost=0.00..154053.60 rows=9999860 width=8) (actual time=0.036..973.376 rows=10000000 loops=1) Planning Time: 0.192 ms Execution Time: 2140.873 ms Note that a sequential scan scan on data_entry is used here rather than some index (only) scans on node_ix. If I SET enable_seqscan = false; and EXPLAIN ANALYZE the query again, I get the following result: Nested Loop Semi Join (cost=0.57..12.85 rows=1 width=8) (actual time=0.877..0.920 rows=1 loops=1) -> Index Only Scan using data_node_pkey on data_node (cost=0.14..8.29 rows=10 width=8) (actual time=0.006..0.011 rows=10 loops=1) Heap Fetches: 0 -> Index Only Scan using node_ix on data_entry (cost=0.43..178143.98 rows=9999860 width=8) (actual time=0.089..0.089 rows=0 loops=10) Index Cond: (node_fk = data_node.id) Heap Fetches: 0 Planning Time: 0.328 ms Execution Time: 0.959 ms Due to the huge performance gap between the plans, I think that the query planner should use the node_ix index here even without enable_seqscan being set to false.
Re: BUG #16993: Query Planner does not use index for EXISTS-query on a large table with all equal foreign key values
From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes: > ... EXPLAIN ANALYZE shows the following query plan: > Merge Join (cost=179053.41..179054.24 rows=1 width=8) (actual > time=2140.820..2140.822 rows=1 loops=1) > [ but with ] SET enable_seqscan = false: > Nested Loop Semi Join (cost=0.57..12.85 rows=1 width=8) (actual > time=0.877..0.920 rows=1 loops=1) Yeah, that seems *extremely* broken. The planner correctly estimates the second plan as much cheaper, so why did it fail to choose that one without help? Will look into this. regards, tom lane
Re: BUG #16993: Query Planner does not use index for EXISTS-query on a large table with all equal foreign key values
From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes: > CREATE TABLE data_node (id BIGINT PRIMARY KEY); > CREATE TABLE data_entry (id BIGINT PRIMARY KEY, node_fk BIGINT NOT NULL, > FOREIGN KEY (node_fk) REFERENCES data_node (id)); > CREATE INDEX node_ix ON data_entry USING BTREE (node_fk); > INSERT INTO data_node (id) SELECT generate_series(1,10); > INSERT INTO data_entry (id, node_fk) SELECT s, 2 FROM > generate_series(1,10000000) s; > VACUUM ANALYZE data_node; > VACUUM ANALYZE data_entry; > ... > SELECT * FROM data_node > WHERE EXISTS ( > SELECT 1 FROM data_entry WHERE node_fk = data_node.id > ); I looked into this. The reason for the strange behavior is that, thanks to the extremely skewed data distribution in data_entry.node_fk, the indexscan you're hoping it would use is actually costed as being more expensive than a seqscan. That causes it to get discarded at the relation planning level. Later on, when we consider join plans, there's a special exception that understands that a semijoin with an inner indexscan that uses all the join quals is particularly efficient; which is why the desirable plan is correctly given a small cost, *if* we consider it at all. But if the indexscan was already thrown away, we won't. Of course, disabling the seqscan allows the indexscan to survive the scan-level path tournament. I'm not sure what a good fix for this would look like. We could imagine trying to mark indexscan paths that satisfy has_indexed_join_quals() at time of generation, and then giving them a free pass in add_path(). However (1) as written, has_indexed_join_quals() depends on the particular join being considered. That might be just an artifact of coding convenience, but I'm not sure. (2) add_path() is enough of a hot spot that giving it yet more things to worry about is fairly unattractive. (3) allowing more paths to survive the tournament slows things down overall. Tweaking the example shows that you need a *very* skewed data distribution to make this happen, so maybe we should write it off as a corner case that's not worth fixing. I'm for sure not eager to slow planning down overall to improve this case, so I don't much like the above proposal even if it could be made to work. A different idea is to reconsider the selectivity estimation rules with an eye to cutting the estimated cost of the indexscan. The thing that I'm wondering about here is that we're estimating that using eqsel() rather than eqjoinsel(), and it seems like the latter might give us a more useful value. AFAIR we never thought hard about that aspect when we invented parameterized plans. But for sure, changing that would be a research project with a lot of potential impact. A narrower fix would be to hack var_eq_non_const so that it doesn't assume that the comparison value must be one of the entries in the column. But it seems like whatever change we made in that line would be a very unprincipled hack, because what are you going to assume instead? Anyway, short answer is that this is more complicated than it looks, and there isn't any simple improvement we can make. regards, tom lane
Re: BUG #16993: Query Planner does not use index for EXISTS-query on a large table with all equal foreign key values
From
David Rowley
Date:
On Thu, 6 May 2021 at 05:47, Tom Lane <tgl@sss.pgh.pa.us> wrote: > A narrower fix would be to hack var_eq_non_const so that it doesn't > assume that the comparison value must be one of the entries in the > column. But it seems like whatever change we made in that line would > be a very unprincipled hack, because what are you going to assume > instead? Yeah, this is the same problem as I was mentioning in [1] My solution was to go for that "unprincipled hack" in var_eq_non_const(). I'm not sure I 100% agree that it's a complete hack, you don't really have to change the n_distinct by much to get the good plan. It's a massive risk to assume that the given value will *always* be the single distinct value that's indexed. # SELECT * FROM data_node WHERE EXISTS (SELECT 1 FROM data_entry WHERE node_fk = data_node.id); Time: 4807.956 ms (00:04.808) # alter table data_entry alter column node_fk set (n_distinct = 2); # analyze data_entry; # SELECT * FROM data_node WHERE EXISTS (SELECT 1 FROM data_entry WHERE node_fk = data_node.id); Time: 3.930 ms I just feel like it's a huge risk to reject an index path of a column with 1 distinct value with the assumption that the value that's going to be looked up *is* that 1 distinct value. If the index lookup is done on any of the other 2^64-1 values (in this case) then the index path would be a *major* win when compared to a seqscan path. The risk to reward ratio of what we do now is outrageous. David [1] https://www.postgresql.org/message-id/CAApHDvpbJHwMZ1U-nzU0kBxu0kwMpBvyL+AFWvFAmurypSo1SQ@mail.gmail.com
Re: BUG #16993: Query Planner does not use index for EXISTS-query on a large table with all equal foreign key values
From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes: > On Thu, 6 May 2021 at 05:47, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> A narrower fix would be to hack var_eq_non_const so that it doesn't >> assume that the comparison value must be one of the entries in the >> column. But it seems like whatever change we made in that line would >> be a very unprincipled hack, because what are you going to assume >> instead? > Yeah, this is the same problem as I was mentioning in [1] > My solution was to go for that "unprincipled hack" in var_eq_non_const(). > I'm not sure I 100% agree that it's a complete hack, you don't really > have to change the n_distinct by much to get the good plan. It's a > massive risk to assume that the given value will *always* be the > single distinct value that's indexed. Yeah, I agree that it doesn't seem great to let var_eq_non_const return 1.0 when it has no idea what the comparison value is. However, that doesn't translate to having much confidence in any other value either. The actual number-of-rows-fetched, given that we know the column contents are all the same value, is either zero or the whole table. It's hard to do much with that; and biasing it towards believing the optimistic value over the pessimistic value seems dangerous. In any case, I think the real issue here is that we know that *in use*, the indexscan will fetch either zero or one row, and be pretty quick either way. The problem is that that knowledge is being applied at the join level, which is too late to save the path from losing. How could we move that knowledge down to the scan path costs? regards, tom lane
Re: BUG #16993: Query Planner does not use index for EXISTS-query on a large table with all equal foreign key values
From
David Rowley
Date:
On Thu, 6 May 2021 at 15:28, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Yeah, I agree that it doesn't seem great to let var_eq_non_const return > 1.0 when it has no idea what the comparison value is. However, that > doesn't translate to having much confidence in any other value either. > The actual number-of-rows-fetched, given that we know the column > contents are all the same value, is either zero or the whole table. > It's hard to do much with that; and biasing it towards believing the > optimistic value over the pessimistic value seems dangerous. Maybe we should be also looking at var_eq_non_const()'s 'other' field and seeing if that's a Var with stats and dividing the results by the number of distinct values in that, or at least doing something closer to what eqjoinsel_inner() does when it's prompted with the same problem. David
Re: BUG #16993: Query Planner does not use index for EXISTS-query on a large table with all equal foreign key values
From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes: > On Thu, 6 May 2021 at 15:28, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Yeah, I agree that it doesn't seem great to let var_eq_non_const return >> 1.0 when it has no idea what the comparison value is. However, that >> doesn't translate to having much confidence in any other value either. >> The actual number-of-rows-fetched, given that we know the column >> contents are all the same value, is either zero or the whole table. >> It's hard to do much with that; and biasing it towards believing the >> optimistic value over the pessimistic value seems dangerous. > Maybe we should be also looking at var_eq_non_const()'s 'other' field > and seeing if that's a Var with stats and dividing the results by the > number of distinct values in that, or at least doing something closer > to what eqjoinsel_inner() does when it's prompted with the same > problem. Yeah, as I mentioned upthread, it's tempting to think about applying join selectivity rules when dealing with a parameterized path's restriction-qual-that's-really-a-join-qual. I'm not sure if that selectivity is expressed the right way as things stand (i.e., fraction of the Cartesian product size isn't the same thing as fraction of the inner relation). In any case that seems like a bit of a research problem. regards, tom lane