Re: Limit Heap Fetches / Rows Removed by Filter in Index Scans - Mailing list pgsql-general

From Francisco Olarte
Subject Re: Limit Heap Fetches / Rows Removed by Filter in Index Scans
Date
Msg-id CA+bJJbyaS0UqHiGm5crJ0Qv=YuL1REBw=mCot6PSuA+r+3+0Tw@mail.gmail.com
In response to Re: Limit Heap Fetches / Rows Removed by Filter in Index Scans  (Victor Blomqvist <vb@viblo.se>)
Responses Re: Limit Heap Fetches / Rows Removed by Filter in Index Scans  (Victor Blomqvist <vb@viblo.se>)
List pgsql-general
Hi Victor:


On Fri, Aug 19, 2016 at 7:02 PM, Victor Blomqvist <vb@viblo.se> wrote:
> What I want to avoid is my query visiting the whole 1m rows to get a result,
> because in my real table that can take 100sec. At the same time I want the
> queries that only need to visit 1k rows finish quickly, and the queries that
> visit 100k rows at least get some result back.

You are going to have problems with that. If you just want to limit it
to a maximum of 100k rows visited and a maximum of 10 results, my
solution works, probably better as nested selects than as CTEs, but
someone more knowledgeable about the optimizer will need to say
something ( or tests will be needed ). The problem is that "the queries
that visit 100k rows at least get some result back" may be false: you
may need to visit the whole 1M rows to get the first result if you are
unlucky. Just set ap=999 where id=1M and ask for ap>=999 and you've got
that degenerate case, which can only be saved if you have an index on
ap ( even with statistics, you would need a full table scan to find it ).
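
For reference, the nested-select form I have in mind looks roughly like
this ( only a sketch: the table t, its id and ap columns, and the
100k / 10 numbers are placeholders standing in for your real table and
limits ):

    SELECT *
      FROM (
            -- inner query caps how many rows the scan may visit
            SELECT * FROM t ORDER BY id LIMIT 100000
           ) capped
     WHERE ap >= 999
     LIMIT 10;  -- at most 10 results out of the capped set

The inner LIMIT bounds the rows visited no matter how selective the
outer filter turns out to be, which is exactly why you can get zero
results back even when matches exist beyond the cap.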

If you are positive some results are in the first 100k rows, then my
method works fine; how fast it is will need to be tested with the real
data. You can even try using 10x, 100x, 1000x the real limit until you
have enough results if you want to time-limit your queries, as in the
sketch below.
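
If you wanted to script that escalation, a rough PL/pgSQL sketch
( again with the placeholder names t, id, ap; a real version would
return the rows instead of counting them ) could look like:

    DO $$
    DECLARE
        cap        bigint := 1000;  -- rows the inner scan may visit
        found_rows bigint;
    BEGIN
        LOOP
            SELECT count(*) INTO found_rows
              FROM ( SELECT *
                       FROM ( SELECT * FROM t ORDER BY id LIMIT cap ) capped
                      WHERE ap >= 999
                      LIMIT 10 ) wanted;
            -- stop when we have enough results or hit the table size
            EXIT WHEN found_rows >= 10 OR cap >= 1000000;
            cap := cap * 10;  -- escalate the cap: *10, *100, *1k ...
        END LOOP;
        RAISE NOTICE 'needed a cap of % rows, found % matches', cap, found_rows;
    END $$;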


Francisco Olarte.

