Thread: strange nested loop row count estimates
Hi, I'm currently trying to understand the expected row counts for a query involving a nested loop join and bitmap index scan on the functional index and a custom operator. And the numbers that I see don't make sense to me currently. Hopefully somebody here can shed some light on it, or confirm this is some kind of issue. Here is the query and explain analyze explain analyze select * from twomass.psc as t , gaia_dr2.gaia_source as g where ( (q3c_ang2ipix(g.ra,g.dec) between q3c_nearby_it(t.ra, t.decl, 0.0003, 0) and q3c_nearby_it(t.ra, t.decl, 0.0003, 1)) or (q3c_ang2ipix(g.ra,g.dec) between q3c_nearby_it(t.ra, t.decl, 0.0003, 1) and q3c_nearby_it(t.ra, t.decl, 0.0003, 3)) ) and 0.0003 ==<<>>== (g.ra,g.dec,t.ra,t.decl)::q3c_type limit 10; https://explain.depesz.com/s/vcNd What I can't understand at all is how the estimate of 3E15 rows!!!! is obtained by the nested loop given that the bitmap heap scan is expected to return *one* single row for each row of the 'left' table. So in my mind the estimate of the total number of rows should be ~ 1e9 rows after the nested loop. Because of this crazy overestimate, I actually have to force the nested loop in this query by disabling seqscan. (if I don't disable the seqscan -- this is the plan I get which ignores the indices: https://explain.depesz.com/s/EIiG Some more details about the query: q3c_ang2ipix(ra,dec) is the function mapping (double,double) -> bigint and the tables have a functional index on that. Like this: Table "gaia_dr2.gaia_source" Column | Type | Modifiers ----------------------------------+-------------------+----------- ra | double precision | dec | double precision | ....... Indexes: "gaia_source2_q3c_ang2ipix_idx" btree (q3c_ang2ipix(ra, "dec")) q3c_nearby_() function just returns bigint. The ==<<>== is the custom operator with custom low selectivity (1e-12 in this case) The tables in the join in question have 450 mill and 1.5 billion rows. I hope somebody can help me understand what's going on. Thank you in advance. Sergey PS the kind of query that I show comes from the q3c module ( https://github.com/segasai/q3c ) that is used for spatial queries of large astronomical catalogues.
Sergey Koposov <skoposov@cmu.edu> writes: > I'm currently trying to understand the expected row counts for a query involving a nested loop join and bitmap index scan > on the functional index and a custom operator. And the numbers that I see don't make sense to me currently. What sort of selectivity estimator have you got attached to that custom operator? regards, tom lane
On Thu, 2019-05-02 at 00:36 -0400, Tom Lane wrote: > Sergey Koposov <skoposov@cmu.edu> writes: > > > > I'm currently trying to understand the expected row counts for a query involving a nested loop join and bitmap indexscan > > on the functional index and a custom operator. And the numbers that I see don't make sense to me currently. > What sort of selectivity estimator have you got attached to that custom > operator? This is the code, but basically it is just a constant based on the search radius (which is the leftmost float argument ofthe operator) https://github.com/segasai/q3c/blob/361140d4f1f36bf16c9c53721d1c4f03cb4de930/q3c.c#L89 For the query in question it should be ~ 1e-12 The whole idea of the operator was to specifically inform PG that this query returns a small number of rows. (the underlying idea of the query is that it does positional crossmatch between datasets on the sphere with a certain smallradius). And obviously the selectivity of this is is extremely tiny).
Sergey Koposov <skoposov@cmu.edu> writes: > On Thu, 2019-05-02 at 00:36 -0400, Tom Lane wrote: >> What sort of selectivity estimator have you got attached to that custom >> operator? > This is the code, but basically it is just a constant based on the search radius (which is the leftmost float argumentof the operator) > https://github.com/segasai/q3c/blob/361140d4f1f36bf16c9c53721d1c4f03cb4de930/q3c.c#L89 Hm, that query should be paying attention to join selectivity, and you don't have a join selectivity function. I think that it applies the restriction selectivity while estimating the size of the bitmap scan's output. But that's not what's going to determine the estimated size of the join output. Too tired to look at this really closely, but I think basically the inconsistency boils down to the lack of consistency between your restriction estimator (1e-12) and your join estimator (which, since you haven't got one, is going to default to something way larger, possibly 0.5). regards, tom lane
On Thu, 2019-05-02 at 01:05 -0400, Tom Lane wrote: > Sergey Koposov <skoposov@cmu.edu> writes: > > > > On Thu, 2019-05-02 at 00:36 -0400, Tom Lane wrote: > > > > > > What sort of selectivity estimator have you got attached to that custom > > > operator? > > > > This is the code, but basically it is just a constant based on the search radius (which is the leftmost float argumentof the operator) > > https://github.com/segasai/q3c/blob/361140d4f1f36bf16c9c53721d1c4f03cb4de930/q3c.c#L89 > Hm, that query should be paying attention to join selectivity, and > you don't have a join selectivity function. > > I think that it applies the restriction selectivity while > estimating the size of the bitmap scan's output. But that's not > what's going to determine the estimated size of the join output. > > Too tired to look at this really closely, but I think basically > the inconsistency boils down to the lack of consistency between > your restriction estimator (1e-12) and your join estimator > (which, since you haven't got one, is going to default to > something way larger, possibly 0.5). Thanks very much checking, Tom! Adding the join selectivity estimator fixed the problem. I think I initially tried it, but it wasn't clear whether it was called at all or not. Plus I was confused by the fact that the bitmap scan prediction showed 1 row, so it looked like the selectivity worked. Sergey