Re: Recheck condition - Mailing list pgsql-general

From Gregory Stark
Subject Re: Recheck condition
Date
Msg-id 87wsryk9ih.fsf@oxford.xeocode.com
Whole thread Raw
In response to Re: Recheck condition  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-general
"Martijn van Oosterhout" <kleptog@svana.org> writes:

> On Fri, Nov 30, 2007 at 11:27:24AM -0500, Josh Harrison wrote:
>> Thanks for your reply
>> Is there a way to get them not to use the
>> heap for intermediate result and go to heap only for final data? This will
>> drastically improve the performance but Im not sure if postgres can do that?
>> Will creating the index in a different way and/or rewriting the query in a
>> different way achieve this result?
>
> I'm trying to imagine what it would take to avoid the heap access after
> the index scan I don't think it's possible. It would require that the
> bitmaps generated by the bitmap scan have the person_id attached and
> then have the bitmap AND operation only happen if the person IDs match.
> No such machinary currently exists.

I think you're describing a star schema join. This is a common checklist item
for data warehousing databases.

The classic data warehouse has a table like "person" which has the info you're
looking for, and dozens of tables with person_id and possibly some associated
data. In some cases those tables don't even have any other data, the mere
existence of the person_id in that table is enough.

So a typical query could look like something like:

select *
  from person
 where person_id in (select person_id from people_who_used_service_in_the_past)
   and person_id in (select person_id from people_with_big_balances)
   and person_id in (select person_id from people_...)
   and person_id not in (select person_id from people_who_unsubscribed)
   and person_id not in (select person_id from people_who_we_mailed_last_week)

The best plan for this is to gather up the person_ids in a kind of bitmap scan
with a bitmap of ids. And once the bitmap is done scan an index on person for
just the matching records. Postgres doesn't support anything like this (yet:).

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

pgsql-general by date:

Previous
From: "Dave Page"
Date:
Subject: Re: libeay32.dll and libpq.dll
Next
From: Gregory Stark
Date:
Subject: Re: full_page_writes = off?