On 29 Apr 2004 at 19:03, Manfred Koizar wrote:
> While the storage overhead could be reduced to 1 bit (not a joke) we'd
> still have the I/O overhead of locating and updating index tuples for
> every heap tuple deleted/updated.
But this is what a lot of DBMSs do and seem to do well enough. I can see that the
MVCC system gives additional problems, but maybe it shouldn't be dismissed so lightly.
Coming from a MS SQLServer platform I have spent a lot of time optimising SQL in
PostgreSQL to be comparable to SQLServer. For the most part I have done this, but
some things are just slower in PostgreSQL.
Recently  I have been looking at raw performance (CPU, IO) rather than the plans. I
have some test queries that (as far as I can determine) use the same access plans on
PostgreSQL and SQLServer. Getting to the detail, an index scan of an index on a
integer column (222512 rows) takes 60ms on SQLServer and 540ms on PostgreSQL.
A full seq table scan on the same table without the index on the other hand takes 370ms
in SQLServer and 420ms in PostgreSQL.
I know that the platforms are different (windows 2000 vs Linux 2.6.3), but the statement
was executed several times to make sure the index and data was in cache (no disk io)
on both systems. Same data, Same CPU, Same disks, Same memory, Same
motherboards.
The only thing I can think of is the way that the index scan is performed on each
platform, SQLServer can use the data directly from the index. This makes the biggest
difference in multi join statements where several of the intermediate tables do not need
to be accessed at all, the data is contained in the join indexes. This results in almost an
order of magnitude performance difference for the same data.
I would be nice to get a feel for how much performance loss would be incurred in
maintaining the index flags against possible performance gains for getting the data back
out again.
Regards,
Gary.