Dave Tenny <tenny@attbi.com> writes:
> My application relies heavily on IN lists. The lists are primarily
> constant integers, so queries look like:
> SELECT val FROM table WHERE id IN (43, 49, 1001, 100002, ...)
> 1) PostgreSQL exhibits worse-than-linear performance behavior with
> respect to IN list size.
Yeah. There are a couple of places in the planner that have O(N^2)
behavior on sufficiently large WHERE clauses, due to building lists
in a naive way (repeated lappend() operations). The inner loop is
very tight, but nonetheless when you do it tens of millions of times
it adds up :-(
I have just committed some fixes into CVS tip for this --- I see about
a 10x speedup in planning time on test cases involving 10000-OR-item
WHERE clauses. We looked at this once before; the test cases I'm using
actually date back to Jan 2000. But it seems some slowness has crept
in due to subsequent planning improvements.
> 4) God help you if you haven't vacuum/analyzed the newly loaded
> table.
Without knowledge that the id field is unique, the planner is likely to
tilt away from an indexscan once the IN list contains more than a few
items. I don't consider this a bug.
> PostgreSQL craps out trying to process 8000 elements with the error:
> out of free buffers: time to abort!
This is a known bug in 7.3.2; it's fixed in 7.3.3.
regards, tom lane