Thread: why can't the optimizer pull up where a > ( ... )
Hi Hackers:
First, I found the following query running badly on pg.
select count(*) from part2 p1
where p_size > 40
  and p_retailprice > (select avg(p_retailprice) from part2 p2 where p2.p_brand = p1.p_brand);
the plan is
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Aggregate  (cost=1899310537.28..1899310537.29 rows=1 width=8)
   ->  Seq Scan on part2 p1  (cost=0.00..1899310456.00 rows=32513 width=0)
         Filter: ((p_size > 40) AND (p_retailprice > (SubPlan 1)))
         SubPlan 1
           ->  Aggregate  (cost=6331.00..6331.01 rows=1 width=32)
                 ->  Seq Scan on part2 p2  (cost=0.00..5956.00 rows=150000 width=4)
                       Filter: (p_brand = p1.p_brand)
However, if we rewrite it in the following form, it runs pretty quickly.
select count(*) from part2,
(select p_brand, avg(p_retailprice) as avg_price from part2 where p_size > 40 group by p_brand) p2
where p_retailprice > p2.avg_price
and p_size > 40
and part2.p_brand = p2.p_brand;
The above example comes from https://community.pivotal.io/s/article/Pivotal-Query-Optimizer-Explained with a little modification.
1. Why can't pg translate query 1 into query 2? After some checking of pull_up_sublinks_qual_recurse, I still don't get the idea.
2. Why can't pg do it, while Greenplum can?
Thanks
On Wed, Nov 20, 2019 at 8:15 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
Adding the SQL I used for testing, for reference:
create table part2 (
p_partkey integer NOT NULL,
p_brand character(10) NOT NULL,
p_size integer NOT NULL,
p_retailprice numeric(15,2) NOT NULL
);
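random_between is not a built-in; any small helper along these lines works (just a sketch, the exact definition is an assumption on my side):

-- assumed helper: uniform random integer in [low, high]
create or replace function random_between(low int, high int)
returns int language sql as
$$ select floor(random() * (high - low + 1) + low)::int $$;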
insert into part2 select 1, 'brand1', random_between(0, 40), random_between(0, 40) from generate_series(1, 100000);
insert into part2 select 2, 'brand2', random_between(40, 80), random_between(0, 40) from generate_series(1, 100000);
insert into part2 select 3, 'brand1', random_between(0, 40), random_between(0, 40) from generate_series(1, 100000);
> On 20 Nov 2019, at 13:15, Andy Fan <zhihui.fan1213@gmail.com> wrote:
> 2. why pg can't do it, while greenplum can?

It's worth noting that Greenplum, the example you're referring to, is using a
completely different query planner, and different planners have different
characteristics and capabilities.

cheers ./daniel
Daniel Gustafsson <daniel@yesql.se> writes:
>> On 20 Nov 2019, at 13:15, Andy Fan <zhihui.fan1213@gmail.com> wrote:
>> 2. why pg can't do it, while greenplum can?

> It's worth noting that Greenplum, the example you're referring to, is using a
> completely different query planner, and different planners have different
> characteristics and capabilities.

Yeah.  TBH, I think the described transformation is well out of scope
for what PG's planner tries to do.  Greenplum is oriented to use-cases
where it might be worth spending lots of planner cycles looking for
optimizations like this one, but in a wider environment it's much
harder to make the argument that this would be a profitable use of
planner effort.

I'm content to say that the application should have written the query
with a GROUP BY to begin with.

Having said that, the best form of criticism is a patch.  If somebody
actually wrote the code to do something like this, we could look at how
much time it wasted in which unsuccessful cases and then have an
informed discussion about whether it was worth adopting.

(BTW, I do not think the transformation as described is even formally
correct, at least not without some unstated assumptions.  How is it
okay to push down the "p_size > 40" condition into the subquery?  The
aggregation in the original query will include rows where that isn't
true.)

regards, tom lane
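To make that last point concrete: a rewrite that preserves the original semantics keeps the per-brand average over all rows and applies p_size > 40 only in the outer query, along these lines (a hand-written sketch, not output of any planner):

select count(*)
from part2 p1
join (select p_brand, avg(p_retailprice) as avg_price
      from part2
      group by p_brand) p2   -- no p_size filter here, so the average matches query 1
  on p1.p_brand = p2.p_brand
where p1.p_size > 40
  and p1.p_retailprice > p2.avg_price;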
On Wed, Nov 20, 2019 at 08:15:19PM +0800, Andy Fan wrote:
>Hi Hackers:
>
> First I found the following queries running bad on pg.
>
> select count(*) from part2 p1 where p_size > 40 and p_retailprice >
>(select avg(p_retailprice) from part2 p2 where p2.p_brand=p1.p_brand);
>
>the plan is
>                                     QUERY PLAN
>------------------------------------------------------------------------------------
> Aggregate  (cost=1899310537.28..1899310537.29 rows=1 width=8)
>   ->  Seq Scan on part2 p1  (cost=0.00..1899310456.00 rows=32513 width=0)
>         Filter: ((p_size > 40) AND (p_retailprice > (SubPlan 1)))
>         SubPlan 1
>           ->  Aggregate  (cost=6331.00..6331.01 rows=1 width=32)
>                 ->  Seq Scan on part2 p2  (cost=0.00..5956.00 rows=150000 width=4)
>                       Filter: (p_brand = p1.p_brand)
>
>however if we change it to the following format, it runs pretty quick.
>
>select count(*) from part2,
>(select p_brand, avg(p_retailprice) as avg_price from part2 where p_size >
>40 group by p_brand) p2
>where p_retailprice > p2.avg_price
>and p_size > 40
>and part2.p_brand = p2.p_brand;
>
>The above example comes from
>https://community.pivotal.io/s/article/Pivotal-Query-Optimizer-Explained with
>a litter modification.
>
>1. why pg can't translate the query 1 to query 2. after some checking
>on pull_up_sublinks_qual_recurse, I still doesn't get the idea.
>2. why pg can't do it, while greenplum can?
>

I don't know the exact place(s) in the optimizer that would need to
consider this optimization, but the primary difference between the two
queries is that the first one is a correlated subquery, while the second
one is not. So I guess our optimizer is not smart enough to recognize
this pattern, and can't do the transformation from

    ... FROM p WHERE x > (SELECT avg(x) FROM q WHERE p.id = q.id)

to

    ... FROM p, (SELECT q.id, avg(x) AS x FROM q GROUP BY q.id) q2
    WHERE q2.id = p.id AND q2.x < p.x

I.e. we don't have the code to consider this optimization, because no
one considered it interesting enough to submit a patch.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Nov 20, 2019 at 11:12:56AM -0500, Tom Lane wrote:
>Daniel Gustafsson <daniel@yesql.se> writes:
>>> On 20 Nov 2019, at 13:15, Andy Fan <zhihui.fan1213@gmail.com> wrote:
>>> 2. why pg can't do it, while greenplum can?
>
>> It's worth noting that Greenplum, the example you're referring to, is using a
>> completely different query planner, and different planners have different
>> characteristics and capabilities.
>
>Yeah.  TBH, I think the described transformation is well out of scope
>for what PG's planner tries to do.  Greenplum is oriented to use-cases
>where it might be worth spending lots of planner cycles looking for
>optimizations like this one, but in a wider environment it's much
>harder to make the argument that this would be a profitable use of
>planner effort.

True.

>I'm content to say that the application should have written the query
>with a GROUP BY to begin with.
>

I'm not sure I agree with that. The problem is this really depends on
the number of rows that will need the subquery result (i.e. based on
selectivity of conditions in the outer query). For small number of rows
it's fine to execute the subplan repeatedly, for large number of rows
it's better to rewrite it to the GROUP BY form. It's hard to make those
judgements in the application, I think.

>Having said that, the best form of criticism is a patch.  If somebody
>actually wrote the code to do something like this, we could look at how
>much time it wasted in which unsuccessful cases and then have an
>informed discussion about whether it was worth adopting.
>

Right.

>(BTW, I do not think the transformation as described is even formally
>correct, at least not without some unstated assumptions.  How is it
>okay to push down the "p_size > 40" condition into the subquery?  The
>aggregation in the original query will include rows where that isn't
>true.)

Yeah. I think the examples are a bit messed up, and surely there are
other restrictions on applicability of this optimization.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On Wed, Nov 20, 2019 at 11:12:56AM -0500, Tom Lane wrote:
>> I'm content to say that the application should have written the query
>> with a GROUP BY to begin with.

> I'm not sure I agree with that. The problem is this really depends on
> the number of rows that will need the subquery result (i.e. based on
> selectivity of conditions in the outer query). For small number of rows
> it's fine to execute the subplan repeatedly, for large number of rows
> it's better to rewrite it to the GROUP BY form. It's hard to make those
> judgements in the application, I think.

Hm.  That actually raises the stakes a great deal, because if that's
what you're expecting, it would require planning out both the transformed
and untransformed versions of the query before you could make a cost
comparison.  That's a *lot* harder to do in the context of our
optimizer's structure, and it also means that the feature would consume
even more planner cycles, than what I was envisioning (namely, a fixed
jointree-prep-stage transformation similar to subquery pullup).

I have no idea whether Greenplum really does it like that.

regards, tom lane
On Wed, Nov 20, 2019 at 12:36:50PM -0500, Tom Lane wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On Wed, Nov 20, 2019 at 11:12:56AM -0500, Tom Lane wrote:
>>> I'm content to say that the application should have written the query
>>> with a GROUP BY to begin with.
>
>> I'm not sure I agree with that. The problem is this really depends on
>> the number of rows that will need the subquery result (i.e. based on
>> selectivity of conditions in the outer query). For small number of rows
>> it's fine to execute the subplan repeatedly, for large number of rows
>> it's better to rewrite it to the GROUP BY form. It's hard to make those
>> judgements in the application, I think.
>
>Hm.  That actually raises the stakes a great deal, because if that's
>what you're expecting, it would require planning out both the transformed
>and untransformed versions of the query before you could make a cost
>comparison.  That's a *lot* harder to do in the context of our
>optimizer's structure, and it also means that the feature would consume
>even more planner cycles, than what I was envisioning (namely, a fixed
>jointree-prep-stage transformation similar to subquery pullup).
>
>I have no idea whether Greenplum really does it like that.
>

True. I'm not really sure how exactly would the planning logic work or
how Greenplum does it. It might be the case that based on the use cases
they target they simply assume the rewritten query is the right one in
99% of the cases, so they do the transformation always. Not sure.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Nov 20, 2019 at 11:18 AM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On Wed, Nov 20, 2019 at 12:36:50PM -0500, Tom Lane wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On Wed, Nov 20, 2019 at 11:12:56AM -0500, Tom Lane wrote:
>>> I'm content to say that the application should have written the query
>>> with a GROUP BY to begin with.
>
>> I'm not sure I agree with that. The problem is this really depends on
>> the number of rows that will need the subquery result (i.e. based on
>> selectivity of conditions in the outer query). For small number of rows
>> it's fine to execute the subplan repeatedly, for large number of rows
>> it's better to rewrite it to the GROUP BY form. It's hard to make those
>> judgements in the application, I think.
>
>Hm. That actually raises the stakes a great deal, because if that's
>what you're expecting, it would require planning out both the transformed
>and untransformed versions of the query before you could make a cost
>comparison. That's a *lot* harder to do in the context of our
>optimizer's structure, and it also means that the feature would consume
>even more planner cycles, than what I was envisioning (namely, a fixed
>jointree-prep-stage transformation similar to subquery pullup).
>
>I have no idea whether Greenplum really does it like that.
>
True. I'm not really sure how exactly would the planning logic work or
how Greenplum does it. It might be the case that based on the use cases
they target they simply assume the rewritten query is the right one in
99% of the cases, so they do the transformation always. Not sure.
The Greenplum page mentions they also added "join-aggregates reordering", in addition to subquery unnesting.
Costing pushing joins below aggregates could probably help.
It does increase plan search space quite a bit.
Regards,
Xun
On Wed, Nov 20, 2019 at 12:34:25PM -0800, Xun Cheng wrote:
>On Wed, Nov 20, 2019 at 11:18 AM Tomas Vondra <tomas.vondra@2ndquadrant.com>
>wrote:
>
>> On Wed, Nov 20, 2019 at 12:36:50PM -0500, Tom Lane wrote:
>> >Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> >> On Wed, Nov 20, 2019 at 11:12:56AM -0500, Tom Lane wrote:
>> >>> I'm content to say that the application should have written the query
>> >>> with a GROUP BY to begin with.
>> >
>> >> I'm not sure I agree with that. The problem is this really depends on
>> >> the number of rows that will need the subquery result (i.e. based on
>> >> selectivity of conditions in the outer query). For small number of rows
>> >> it's fine to execute the subplan repeatedly, for large number of rows
>> >> it's better to rewrite it to the GROUP BY form. It's hard to make those
>> >> judgements in the application, I think.
>> >
>> >Hm.  That actually raises the stakes a great deal, because if that's
>> >what you're expecting, it would require planning out both the transformed
>> >and untransformed versions of the query before you could make a cost
>> >comparison.  That's a *lot* harder to do in the context of our
>> >optimizer's structure, and it also means that the feature would consume
>> >even more planner cycles, than what I was envisioning (namely, a fixed
>> >jointree-prep-stage transformation similar to subquery pullup).
>> >
>> >I have no idea whether Greenplum really does it like that.
>> >
>>
>> True. I'm not really sure how exactly would the planning logic work or
>> how Greenplum does it. It might be the case that based on the use cases
>> they target they simply assume the rewritten query is the right one in
>> 99% of the cases, so they do the transformation always. Not sure.
>>
>
>The Greenplum page mentions they also added "join-aggregates reordering",
>in addition to subquery unnesting.
>Costing pushing joins below aggregates could probably help.
>It does increase plan search space quite a bit.
>

We actually do have a patch for aggregate push-down [1]. But I don't
think it's directly relevant to this thread - the main trick here is
transforming the correlated subquery to aggregation, not moving the
aggregation down. That seems like a separate optimization.

[1] https://commitfest.postgresql.org/25/1247/

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> Hm.  That actually raises the stakes a great deal, because if that's
> what you're expecting, it would require planning out both the transformed
> and untransformed versions of the query before you could make a cost
> comparison.
I don't know an official name for it; let's call it "bloom filter push down (BFPD)" for reference. This algorithm may be helpful in this case with some extra effort.
First, take "select ... from t1, t2 where t1.a = t2.a and t1.b = 100" as an example, and assume t1 is scanned before t2 (as in a hash join or sort-merge join), with t1's result as the inner table.
1. First scan t1 with the filter t1.b = 100.
2. During that scan, build a bloom filter on the join key (t1.a) of the "selected" rows.
3. While scanning t2, filter t2.a through the bloom filter.
4. Probe the hash table with the rows that survive the filter.
Back to this problem: if we have a chance to know which p_brand values we are interested in, we can use the same logic to group by only those brands.
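For example (hand-written, just to illustrate the idea; the planner does not do this today), the aggregation could be restricted to the brands that can actually match, while each brand's average is still computed over all of its rows:

select count(*)
from part2 p1
join (select p_brand, avg(p_retailprice) as avg_price
      from part2
      where p_brand in (select p_brand from part2 where p_size > 40)  -- only brands that can match
      group by p_brand) p2
  on p1.p_brand = p2.p_brand
where p1.p_size > 40
  and p1.p_retailprice > p2.avg_price;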
Another option may be to keep the N versions, plan them separately, and compare their costs at the end.
> The Greenplum page mentions they also added "join-aggregates reordering", in addition to subquery unnesting.
Thanks, I will search more about this.

>Having said that, the best form of criticism is a patch. If somebody
>actually wrote the code to do something like this, we could look at how
>much time it wasted in which unsuccessful cases and then have an
>informed discussion about whether it was worth adopting.
>
I would try to see how far I can get.
On Thu, Nov 21, 2019 at 08:30:51AM +0800, Andy Fan wrote: >> >> >> Hm. That actually raises the stakes a great deal, because if that's >> what you're expecting, it would require planning out both the transformed >> and untransformed versions of the query before you could make a cost >> comparison. > > >I don't know an official name, let's call it as "bloom filter push down >(BFPD)" for reference. this algorithm may be helpful on this case with >some extra effort. > >First, Take . "select ... from t1, t2 where t1.a = t2.a and t1.b = 100" >for example, and assume t1 is scanned before t2 scanning, like hash >join/sort merge and take t1's result as inner table. > >1. it first scan t1 with filter t1.b = 100; >2. during the above scan, it build a bloom filter *based on the join key >(t1.a) for the "selected" rows.* >3. during scan t2.a, it filters t2.a with the bloom filter. >4. probe the the hash table with the filtered rows from the above step. > So essentially just a hash join with a bloom filter? That doesn't seem very relevant to this thread (at least I don't see any obvious link), but note that this has been discussed in the past - see [1]. And in some cases building a bloom filter did result in nice speedups, but in other cases it was just an extra overhead. But it does not require change of plan shape, unlike the optimization discussed here. [1] https://www.postgresql.org/message-id/flat/5670946E.8070705%402ndquadrant.com Ultimately there were discussions about pushing the bloom filter much deeper on the non-hash side, but that was never implemented. >Back to this problem, if we have a chance to get the p_brand we are >interested, we can use the same logic to only group by the p_brand. > >Another option may be we just keep the N versions, and search them >differently and compare their cost at last. > Maybe. I think the problem is going to be that with multiple such correlated queries you may significantly increase the number of plan variants to consider - each subquery may be transformed or not, so the space splits into 2. With 6 such subqueries you suddenly have 64x the number of plan variants you have to consider (I don't think you can just elimiate those early on). >> The Greenplum page mentions they also added "join-aggregates >reordering", in addition to subquery unnesting. >Thanks, I will search more about this. > >>Having said that, the best form of criticism is a patch. If somebody >>actually wrote the code to do something like this, we could look at how >>much time it wasted in which unsuccessful cases and then have an >>informed discussion about whether it was worth adopting. >> > >I would try to see how far I can get. +1 regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Nov 21, 2019 at 6:12 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On Thu, Nov 21, 2019 at 08:30:51AM +0800, Andy Fan wrote:
>>
>>
>> Hm. That actually raises the stakes a great deal, because if that's
>> what you're expecting, it would require planning out both the transformed
>> and untransformed versions of the query before you could make a cost
>> comparison.
>
>
>I don't know an official name, let's call it as "bloom filter push down
>(BFPD)" for reference. this algorithm may be helpful on this case with
>some extra effort.
>
>First, Take . "select ... from t1, t2 where t1.a = t2.a and t1.b = 100"
>for example, and assume t1 is scanned before t2 scanning, like hash
>join/sort merge and take t1's result as inner table.
>
>1. it first scan t1 with filter t1.b = 100;
>2. during the above scan, it build a bloom filter *based on the join key
>(t1.a) for the "selected" rows.*
>3. during scan t2.a, it filters t2.a with the bloom filter.
>4. probe the the hash table with the filtered rows from the above step.
>
So essentially just a hash join with a bloom filter?
Yes, the idea is exactly the same, but we treat the value differently (both are valid, and your point is the more common one). In my opinion, in some environments like Oracle Exadata it is much more powerful, since it transfers much less data from the data nodes to the compute nodes.
Of course, the benefit is not always there, but it is a good beginning to make it smarter.
That doesn't seem very relevant to this thread (at least I don't see any obvious link),
The original approach, "group by p_brand" over "all the rows", may not be a good idea all the time, and if we can do some filtering before the group by, the result would be better.
And in some
cases building a bloom filter did result in nice speedups, but in other
cases it was just an extra overhead. But it does not require change of
plan shape, unlike the optimization discussed here.
I thought we could add a step named "build the filter" and another step named "apply the filter". If so, the plan shape is changed. Anyway, I don't think this is a key point.
Ultimately there were discussions about pushing the bloom filter much
deeper on the non-hash side, but that was never implemented.
Do you still have any plans for this feature? I see you raised the idea and it was very well received.
>Back to this problem, if we have a chance to get the p_brand we are
>interested, we can use the same logic to only group by the p_brand.
>
>Another option may be we just keep the N versions, and search them
>differently and compare their cost at last.
>
Maybe. I think the problem is going to be that with multiple such
correlated queries you may significantly increase the number of plan
variants to consider - each subquery may be transformed or not, so the
space splits into 2. With 6 such subqueries you suddenly have 64x the
number of plan variants you have to consider (I don't think you can just
eliminate those early on).
>> The Greenplum page mentions they also added "join-aggregates
>reordering", in addition to subquery unnesting.
>Thanks, I will search more about this.
>
>>Having said that, the best form of criticism is a patch. If somebody
>>actually wrote the code to do something like this, we could look at how
>>much time it wasted in which unsuccessful cases and then have an
>>informed discussion about whether it was worth adopting.
>>
>
>I would try to see how far I can get.
+1
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Nov 21, 2019 at 11:57:22PM +0800, Andy Fan wrote:
>On Thu, Nov 21, 2019 at 6:12 PM Tomas Vondra <tomas.vondra@2ndquadrant.com>
>wrote:
>
>> On Thu, Nov 21, 2019 at 08:30:51AM +0800, Andy Fan wrote:
>> >>
>> >> Hm.  That actually raises the stakes a great deal, because if that's
>> >> what you're expecting, it would require planning out both the transformed
>> >> and untransformed versions of the query before you could make a cost
>> >> comparison.
>> >
>> >I don't know an official name, let's call it as "bloom filter push down
>> >(BFPD)" for reference. this algorithm may be helpful on this case with
>> >some extra effort.
>> >
>> >First, Take . "select ... from t1, t2 where t1.a = t2.a and t1.b = 100"
>> >for example, and assume t1 is scanned before t2 scanning, like hash
>> >join/sort merge and take t1's result as inner table.
>> >
>> >1. it first scan t1 with filter t1.b = 100;
>> >2. during the above scan, it build a bloom filter *based on the join key
>> >(t1.a) for the "selected" rows.*
>> >3. during scan t2.a, it filters t2.a with the bloom filter.
>> >4. probe the the hash table with the filtered rows from the above step.
>> >
>>
>> So essentially just a hash join with a bloom filter?
>
>Yes, the idea is exactly same but we treat the value differently (both are
>valid, and your point is more common). In my opinion in some
>environment like oracle exadata, it is much more powerful since it
>transfers much less data from data node to compute node.
>
>Of course, the benefit is not always, but it is a good beginning to make
>it smarter.
>

Yes, it certainly depends on the workload. As was discussed in the other
thread, to get the most benefit we'd have to push the bloom filter down
the other side of the join as far as possible, ideally to the scan
nodes. But no one tried to do that.

>> That doesn't seem very relevant to this thread (at least I don't see any
>> obvious link),
>
>The original problem "group by p_brand" for "all the rows" maybe not a
>good idea all the time, and if we can do some filter before the group
>by, the result would be better.
>

Well, I think vast majority of optimizations depend on the data. The
reason why I think these two optimizations are quite different is that
one (bloom filter with hash joins) is kinda localized and does not change
the general plan shape - you simply make the decision at the hash join
level, and that's it (although it's true it does affect row counts on
one side of the join).

The optimization discussed here is very different because it requires
transformation of the query very early, before we actually can judge if
it's a good idea or not.

>> And in some
>> cases building a bloom filter did result in nice speedups, but in other
>> cases it was just an extra overhead. But it does not require change of
>> plan shape, unlike the optimization discussed here.
>
>I thought we could add a step named "build the filter" and another step as
>"apply the filter". If so, the plan shape is changed. anyway I don't
>think this is a key point.
>

Not sure. Perhaps there are similarities, but I don't see them.

>> Ultimately there were discussions about pushing the bloom filter much
>> deeper on the non-hash side, but that was never implemented.
>
>Do you still have any plan about this feature since I see you raised the
>idea and the idea was very welcomed also?
>

I'm not working on it, and I don't think I'll get to do that any time
soon. So feel free to look into the problem if you wish.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services