Re: Limit Heap Fetches / Rows Removed by Filter in Index Scans - Mailing list pgsql-general

From Francisco Olarte
Subject Re: Limit Heap Fetches / Rows Removed by Filter in Index Scans
Date
Msg-id CA+bJJbx-nbJP3UQy1YbSOBK9iBgw6CAEchw58yzv_RS6c2rKVA@mail.gmail.com
Whole thread Raw
In response to Limit Heap Fetches / Rows Removed by Filter in Index Scans  (Victor Blomqvist <vb@viblo.se>)
Responses Re: Limit Heap Fetches / Rows Removed by Filter in Index Scans  (Victor Blomqvist <vb@viblo.se>)
List pgsql-general
Hi Victor:

On Fri, Aug 19, 2016 at 7:06 AM, Victor Blomqvist <vb@viblo.se> wrote:
> Is it possible to break/limit a query so that it returns whatever results
> found after having checked X amount of rows in a index scan?
>
> For example:
> create table a(id int primary key);
> insert into a select * from generate_series(1,100000);
>
> select * from a
> where id%2 = 0
> order by id limit 10
>
> In this case the query will "visit" 20 rows and filter out 10 of them. We
> can see that in the query plan:
> "Rows Removed by Filter: 10"
> "Heap Fetches: 20"
>
> Is it somehow possible to limit this query so that it only fetches X amount,
> in my example if we limited it to 10 Heap Fetches the query could return the
> first 5 rows?
>
>
> My use case is I have a table with 35 million rows with a geo index, and I
> want to do a KNN search but also limit the query on some other parameters.
> In some cases the other parameters restrict the query so much that Heap
> Fetches becomes several 100k or more, and in those cases I would like to
> have a limit to my query.

Well, if you accept more abstract limits (i.e. you do not depend on
heap fetches, you just want up to 5 even IDs from the first 10 IDs )
you could try:

with base as (select * from a order by id limit 10)
select * from base where id %2 = 0 order by id limit 5;

( Or do it with a subquery instead of a CTE ).

In general

select * from table where common_condition and filter_condition order
by xx limit N

becomes

with base as (select * from table where common_condition order by xx
limit base_fecthes)
select * from base where filter_condition order by XX limit N;

In the example common_condition is non existent, put it as true,
optimize after transforming.

Francisco Olarte.


pgsql-general by date:

Previous
From: Sameer Kumar
Date:
Subject: Re: PG vs ElasticSearch for Logs
Next
From: Thomas Güttler
Date:
Subject: Re: PG vs ElasticSearch for Logs