I've been benchmarking some very simple databases. By simple, I mean a table
like this:
CREATE TABLE bench (
id SERIAL,
data TEXT )
CREATE INDEX bench_data_index ON bench (data)
which is filled with 100k records, where the data values for each record are
distinct (the data for record N is "text_item_N").
I generate a random data value known to be in the table and lookup its id,
with "SELECT id FROM bench WHERE data = 'whatever'".
I run this for three seconds. PostgreSQL does OK: 4300 selects in 2999
msec. MySQL does a little better, but not much: 5500 selects in 2999 msec.
In my application, this kind of lookup turns out to be the bottleneck on
performance (which is why I'm benchmarking). On MySQL, I can speed it up
quite a bit by looking up more than one thing at a time:
SELECT id FROM bench WHERE data IN ('X', 'Y')
for example, gets about 4050 selects in 3 seconds, so that's 8100 records
looked up, compared to 5300 when they were done one at a time. It continues
to improve selecting more at a time. To my surprise, when I tried this
trick with PostgreSQL, it did not speed things up. In fact, it *massively*
slowed down--it only is getting 13 selects in 3 seconds, searching for two
at a time.
What's going on here?
PostgreSQL 7.4.2 on SuSE 9.1 Linux. (The version that comes with SuSE).
--
--Tim Smith