Thread: Please Help: PostgreSQL Query Optimizer

Please Help: PostgreSQL Query Optimizer

From

"Anjan Kumar. A."

Date:

11 December 2005, 06:52:14

I'm working on a project, whose implementation deals with PostgreSQL. A brief description of the project is given
below.

Project Description:
--------------------
In Main Memory DataBase(MMDB) entire database on the disk is loaded on to the main memory during initial startup
ofthe system. There after all the references are made to database on the main memory. When the system is going to
shutdown,we will write back the database on the main memory to disk. Here, for the sake of recovery we are writing
logrecords on to the disk during the transaction execution.

We want to implement MMDB by modifying PostgreSQL. We implemented our own Main Memory File System to store the
primarycopy of the database in main memory, and Modified the PostgreSQL to access the data in the Main Memory File
System.

Now, in our implementation Disk access is completely avoided during normal transaction execution. So, we need to
modifythe Query Optimizer of PostgreSQL so that it wont consider disk related costs during calculation of Query Costs.
QueryOptimizer should try to minimize the Processing Cost. The criteria for cost can be taken as the number of tuples
thathave to read/write from main memory, number of comparisons, etc.

Can any one tell me the modifications needs to be incorporated to PostgreSQL, so that it considers only Processing
Costsduring optimization of the Query.

In PostgreSQL, Path costs are measured in units of disk accesses. One sequential page fetch has cost 1. I think, in
PostgreSQLfollowing paramters are used in calculating the cost of the Query Path :

#random_page_cost = 4 # units are one sequential page fetch cost
#cpu_tuple_cost = 0.01 # (same)
#cpu_index_tuple_cost = 0.001 # (same)
#cpu_operator_cost = 0.0025 # (same)
#effective_cache_size = 1000 # typically 8KB each

In our case we are reading pages from Main Memory File System, but not from Disk. Will it be sufficient, if we change
the default values of above paramters in "src/include/optimizer/cost.h and
src/backend/utils/misc/postgresql.conf.sample"as follows:

random_page_cost = 4;
cpu_tuple_cost = 2;
cpu_index_tuple_cost = 0.2;
cpu_operator_cost = 0.05;

Please help us in this regard. I request all of you to give comments/suggestions on this. Waiting for your kind help.

--
Thanks.

Anjan Kumar A.
MTech2, Comp Sci.,
www.cse.iitb.ac.in/~anjankumar
______________________________________________________________
May's Law:
The quality of correlation is inversly proportional to the density
of control. (The fewer the data points, the smoother the curves.)

Re: [DOCS] [HACKERS] Please Help: PostgreSQL Query Optimizer

From

Tom Lane

Date:

11 December 2005, 16:41:50

Josh Berkus <josh@agliodbs.com> writes:
> I don't see why you're increasing the various cpu_* costs.

You missed the point Josh --- these numbers are relative to the cost of
a page fetch, so if page fetch is measured in microseconds instead of
milliseconds, then you *do* want to bump the CPU costs up.

            regards, tom lane

Re: [HACKERS] Please Help: PostgreSQL Query Optimizer

From

"Anjan Kumar. A."

Date:

11 December 2005, 18:06:11

Since sequential access is not significantly faster than random access in a MMDB, random_page_cost will be
approximatelysame as sequential page fetch cost. 

As every thing is present in Main Memory, we need to give approximately same cost to read/write to Main Memory and CPU
Relatedoperations. 


But, in PostgreSQL  all costs are  scaled relative to a page fetch. If we make both sequential_page_fetch_cost and
random_page_costto "1", then  we need to increase the various cpu_* paramters by multiplying the default values with
appropriate Scaling Factor.  Now, we need to determine this Scaling Factor. 


Still, i want to confirm whether this approach is the correct one.





On Sun, 11 Dec 2005, Josh Berkus wrote:

> Anjan,
>
>> In our case we are reading pages from Main Memory File System, but not from
>> Disk. Will it be sufficient, if we change the  default values of above
>> paramters in "src/include/optimizer/cost.h and
>> src/backend/utils/misc/postgresql.conf.sample" as follows:
>>
>>          random_page_cost = 4;
>
> This should be dramatically lowered.  It's supposed to represent the ratio of
> seek-fetches to seq scans on disk.  Since there's no disk, it should be a
> flat 1.0.   However, we are aware that there are flaws in our calculations
> involving random_page_cost, such that the actual number for a system where
> there is no disk cost would be lower than 1.0.   Your research will hopefully
> help us find these flaws.
>
>>          cpu_tuple_cost = 2;
>>          cpu_index_tuple_cost = 0.2;
>>          cpu_operator_cost = 0.05;
>
> I don't see why you're increasing the various cpu_* costs.  CPU costs would be
> unaffected by the database being in memory.   In general, I lower these by a
> divisor based on the cpu speed; for example, on a dual-opteron system I lower
> the defaults by /6.   However, that's completely unrelated to using an MMDB.
>
> So, other than random_page_cost, I don't know of other existing GUCs that
> would be directly related to using a disk/not using a disk.  How are you
> handling shared memory and work memory?
>
> I look forward to hearing more about your test!
>
>

--
Regards.

Anjan Kumar A.
MTech2,  Comp Sci.,
www.cse.iitb.ac.in/~anjankumar
______________________________________________________________
Do not handicap your children by making their lives easy.
         -- Robert Heinlein

Re: [HACKERS] Please Help: PostgreSQL Query Optimizer

From

Josh Berkus

Date:

16 December 2005, 23:43:42

Anjan,

> In our case we are reading pages from Main Memory File System, but not from
> Disk. Will it be sufficient, if we change the  default values of above
> paramters in "src/include/optimizer/cost.h and
> src/backend/utils/misc/postgresql.conf.sample" as follows:
>
>          random_page_cost = 4;

This should be dramatically lowered.  It's supposed to represent the ratio of
seek-fetches to seq scans on disk.  Since there's no disk, it should be a
flat 1.0.   However, we are aware that there are flaws in our calculations
involving random_page_cost, such that the actual number for a system where
there is no disk cost would be lower than 1.0.   Your research will hopefully
help us find these flaws.

>          cpu_tuple_cost = 2;
>          cpu_index_tuple_cost = 0.2;
>          cpu_operator_cost = 0.05;

I don't see why you're increasing the various cpu_* costs.  CPU costs would be
unaffected by the database being in memory.   In general, I lower these by a
divisor based on the cpu speed; for example, on a dual-opteron system I lower
the defaults by /6.   However, that's completely unrelated to using an MMDB.

So, other than random_page_cost, I don't know of other existing GUCs that
would be directly related to using a disk/not using a disk.  How are you
handling shared memory and work memory?

I look forward to hearing more about your test!

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: [HACKERS] Please Help: PostgreSQL Query Optimizer

From

Josh Berkus

Date:

16 December 2005, 23:43:53

Anjan,

> But, in PostgreSQL  all costs are  scaled relative to a page fetch. If we
> make both sequential_page_fetch_cost and random_page_cost to "1", then  we
> need to increase the various cpu_* paramters by multiplying the default
> values with appropriate  Scaling Factor.  Now, we need to determine this
> Scaling Factor.

I see, so you're saying that because the real cost of a page fetch has
decreased, the CPU_* costs should increase proportionally because relative to
the real costs of a page fetch they should be higher?  That makes a sort of
sense.

The problem that you're going to run into is that currently we have no
particularly reason to believe that the various cpu_* costs are more than
very approximately correct as rules of thumb.  So I think you'd be a lot
better off trying to come up with some means of computing the real cpu costs
of each operation, rather than trying to calculate a multiple of numbers
which may be wrong in the first place.

I know that someone on this list was working on a tool to digest EXPLAIN
ANALYZE results and run statistics on them.   Can't remember who, though.

Also, I'm still curious on how you're handling shared_mem, work_mem and
maintenance_mem.  You didn't answer last time.

--
Josh Berkus
Aglio Database Solutions
San Francisco