
Please Help: PostgreSQL Query Optimizer - Mailing list pgsql-hackers

From	Anjan Kumar. A.
Subject	Please Help: PostgreSQL Query Optimizer
Date	December 11, 2005 09:52:13
Msg-id	Pine.LNX.4.61.0512111610320.4525@nsl-33.cse.iitb.ac.in
Responses	Re: [DOCS] Please Help: PostgreSQL Query Optimizer (Tom Lane <tgl@sss.pgh.pa.us>) Re: Please Help: PostgreSQL Query Optimizer (Josh Berkus <josh@agliodbs.com>)
List	pgsql-hackers

I'm working on a project whose implementation is based on PostgreSQL. A brief description of the project is given
below.

Project Description:
--------------------
In a Main Memory Database (MMDB), the entire database on disk is loaded into main memory during initial startup
of the system. Thereafter, all references are made to the database in main memory. When the system is about to
shut down, we write the in-memory database back to disk. For the sake of recovery, we write
log records to disk during transaction execution.

We want to implement an MMDB by modifying PostgreSQL. We implemented our own Main Memory File System to store the
primary copy of the database in main memory, and modified PostgreSQL to access the data in the Main Memory File
System.

Now, in our implementation, disk access is completely avoided during normal transaction execution. So we need to
modify the query optimizer of PostgreSQL so that it won't consider disk-related costs when calculating query costs.
The query optimizer should try to minimize the processing cost. The criteria for cost can be taken as the number of
tuples that have to be read from or written to main memory, the number of comparisons, etc.

Can anyone tell me what modifications need to be made to PostgreSQL so that it considers only processing
costs during query optimization?

In PostgreSQL, path costs are measured in units of disk accesses: one sequential page fetch has cost 1. I think the
following parameters are used in PostgreSQL to calculate the cost of a query path:

#random_page_cost = 4 # units are one sequential page fetch cost
#cpu_tuple_cost = 0.01 # (same)
#cpu_index_tuple_cost = 0.001 # (same)
#cpu_operator_cost = 0.0025 # (same)
#effective_cache_size = 1000 # typically 8KB each
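
To make the units concrete, here is a rough sketch of how a sequential scan could be costed from these parameters. It is a simplified model, not the actual PostgreSQL source: the function name seqscan_cost, the table sizes, and the assumption of one operator evaluation per tuple are all made up for illustration.

```python
# Simplified sketch of a sequential-scan cost estimate, expressed in units of
# one sequential page fetch. This is an illustration, not PostgreSQL code.

CPU_TUPLE_COST = 0.01       # default value quoted above
CPU_OPERATOR_COST = 0.0025  # default value quoted above


def seqscan_cost(pages, tuples, quals_per_tuple=1,
                 cpu_tuple_cost=CPU_TUPLE_COST,
                 cpu_operator_cost=CPU_OPERATOR_COST):
    """Return (io_cost, cpu_cost) for scanning a table once."""
    # Each sequentially fetched page costs exactly one unit.
    io_cost = pages * 1.0
    # Each tuple pays a fixed processing cost plus one operator cost per
    # qualification evaluated against it.
    cpu_cost = tuples * (cpu_tuple_cost + quals_per_tuple * cpu_operator_cost)
    return io_cost, cpu_cost


# Hypothetical table: 1,000 pages holding 100,000 tuples, one WHERE operator.
io_cost, cpu_cost = seqscan_cost(pages=1000, tuples=100000)
print("I/O cost:", io_cost, "CPU cost:", cpu_cost)
```

In this model the page term and the CPU term are of similar size under the default settings, so the page term is the part that no longer reflects reality when the pages live in main memory.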

In our case we are reading pages from the Main Memory File System, not from disk. Will it be sufficient if we change
the default values of the above parameters in "src/include/optimizer/cost.h" and
"src/backend/utils/misc/postgresql.conf.sample" as follows:

random_page_cost = 4;
cpu_tuple_cost = 2;
cpu_index_tuple_cost = 0.2;
cpu_operator_cost = 0.05;
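
A rough way to check the effect of these values is to compare what share of a sequential scan's estimated cost comes from page fetches under each setting. The sketch below uses a simplified cost model with a made-up table size, and the helper name io_share is hypothetical; it is not how PostgreSQL itself reports costs.

```python
# Compare the share of estimated cost attributed to page fetches under the
# default and the proposed CPU cost parameters. Simplified model: one unit
# per sequential page, plus per-tuple and per-operator CPU costs.

def io_share(pages, tuples, cpu_tuple_cost, cpu_operator_cost):
    """Fraction of the total estimated scan cost that is page-fetch cost."""
    io_cost = pages * 1.0
    cpu_cost = tuples * (cpu_tuple_cost + cpu_operator_cost)
    return io_cost / (io_cost + cpu_cost)


# Hypothetical table: 1,000 pages holding 100,000 tuples.
PAGES, TUPLES = 1000, 100000

default_share = io_share(PAGES, TUPLES, cpu_tuple_cost=0.01,
                         cpu_operator_cost=0.0025)
proposed_share = io_share(PAGES, TUPLES, cpu_tuple_cost=2,
                          cpu_operator_cost=0.05)

print("page-fetch share with defaults: %.3f" % default_share)
print("page-fetch share with proposed values: %.3f" % proposed_share)
```

Under these assumptions the page-fetch term does not disappear with the proposed values; it only becomes small relative to the CPU terms. The proposed values also scale cpu_tuple_cost by 200 but cpu_operator_cost by 20, so the balance between the CPU parameters changes as well.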

Please help us in this regard. I request all of you to give comments/suggestions on this. Waiting for your kind help.

--
Thanks.

Anjan Kumar A.
MTech2, Comp Sci.,
www.cse.iitb.ac.in/~anjankumar
______________________________________________________________
May's Law:
The quality of correlation is inversely proportional to the density
of control. (The fewer the data points, the smoother the curves.)
