Question with hashed IN - Mailing list pgsql-hackers

From Stephan Szabo
Subject Question with hashed IN
Date
Msg-id 20030816201948.Y75834-100000@megazone.bigpanda.com
Whole thread Raw
Responses Re: Question with hashed IN
List pgsql-hackers
I've noticed that when the stats are wrong (like
in cases where you've loaded data but reltuples
hasn't been updated yet) that a hashed NOT IN
seems to take a significant time penalty.  Is
this to be expected?

On a pktest table with 1 million integers and a dual table with a single
integer and sort_mem set high enough to give a hashed subplan for the
various reltuples values, I saw the following behavior for

explain analyze select * from dual where a not in (select a from pktest);

with reltuples=1000 for pktest, query takes about 96 seconds
reltuples=10000, query takes about 15 seconds
reltuples=100000, query takes about 8 seconds

And the memory usage seemed to be the same even if I set sort_mem back
to 1024.



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: "is_superuser" parameter creates inconsistencies
Next
From: Stephan Szabo
Date:
Subject: Re: Question with hashed IN