Suggestion for optimization - Mailing list pgsql-hackers

From Dann Corbit
Subject Suggestion for optimization
Date
Msg-id D90A5A6C612A39408103E6ECDD77B82906F424@voyager.corporate.connx.com
Whole thread Raw
List pgsql-hackers
It would be nice if total table cardinality could be maintained live.
So (after the initial vacuum) we update the cardinality for each table
in the system table (or perhaps add an entry to the table itself).
There are two reasons why this is an important optimization.  Firstly,
it is a psychological benefit for both benchmarks and customers when
doing a select count(*) from <tablename>.  This is something that pops
up all the time in benchmarks and customers do it too, in order to get a
feel for speed.  By storing the current number and incrementing for
every insert and decrementing for every delete, the count(*) case with
no where clause can return the value instantly.

The far more important reason is for optimizations.  An accurate
cardinality figure can greatly enhance the optimizer's ability to
perform joins in the correct order.

An example of a SQL system that does this sort of thing is Microsoft
SQL*Server.  If you have 100 million rows in a table and do:
SELECT COUNT(*) FROM table_name
it will return the correct number instantly.  The same is true for
Oracle.

It might also be possible to keep an array in memory of:
typedef struct tag_cardinality_list {char *table_name;unsigned long cardinality;
} cardinality_list;

and keep the data updated there with simple interlocked exchange
operations.  The list would be loaded on Postmaster startup and saved on
shutdown.



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: PQescapeBytea is not multibyte aware
Next
From: Jan Wieck
Date:
Subject: Re: timeout implementation issues