Re: Recheck condition - Mailing list pgsql-general

From Martijn van Oosterhout
Subject Re: Recheck condition
Date
Msg-id 20071129105520.GB30574@svana.org
Whole thread Raw
In response to Re: Recheck condition  ("Josh Harrison" <joshques@gmail.com>)
Responses Re: Recheck condition  ("Josh Harrison" <joshques@gmail.com>)
List pgsql-general
Please always CC the list so other people can respond.

On Wed, Nov 28, 2007 at 10:21:39PM -0500, Josh Harrison wrote:
> > It isn't the recheck that's costing it, it's probably just that you're
> > matching a lot of rows. A bitmap scan classically needs a recheck
> > because if a lot of rows need to be stored it might remember only
> > blocks 2044-2060. It then needs to recheck each row as it comes through
> > to make sure it really matches the conditions.
> >
> What is this number 2044-2060? Is this a fixed number in postgres?

Ofcourse not. Have you read the documentation on explain yet?
http://www.postgresql.org/docs/8.2/static/using-explain.html

The point is that the bitmap may have an inexact representation of the
tuples that match. If your scan indicates you'll match 10 million
entries and you only want to use 16KB for your bitmap, obviously you
can't store all the locations exactly.

> For example if I have a table Person with 3 fields (name,city_id,age). And
> the table contains 1000 rows. The table has 2 indexes city_id and age
> If I have a query :
> SELECT * FROM PERSON WHERE city_id=5 AND AGE=30

The answer is "it depends". Postgres has a cost based planner, it will
estimate the costs of each different way of getting the result and use
the cheapest. The factors that are important is how many rows each
condition will match.

Given your table is only 8MB, the system may decide that it's all in
memory and just do a scan.

Or it maybe see that city_id is almost unique and use that index and
check the matches for the second condition. Or vice-versa.

Or maybe it will scan both indexes, calculate the intersection and then
looks up the matches in the heap (with a recheck).

> In other  words, Will this query cause 1000 random heap access or 10 random
> heap access ?

I don't know, run it and see.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Those who make peaceful revolution impossible will make violent revolution inevitable.
>  -- John F Kennedy

Attachment

pgsql-general by date:

Previous
From: Howard Cole
Date:
Subject: Killing a session on windows
Next
From: Richard Huxton
Date:
Subject: Re: [Re] Re: [Re] Re: [Re] Re: [Re] Re: Unknown winsock error 10061while dumping a big database