Re: Questions on query planner, join types, and work_mem - Mailing list pgsql-performance

From: Hannu Krosing
Subject: Re: Questions on query planner, join types, and work_mem
Date:
Msg-id: 1280948634.3735.69.camel@hvost
In response to: Re: Questions on query planner, join types, and work_mem (Hannu Krosing <hannu@2ndquadrant.com>)
List: pgsql-performance
On Wed, 2010-08-04 at 21:41 +0300, Hannu Krosing wrote:
> On Wed, 2010-08-04 at 14:00 -0400, Tom Lane wrote:

> > regression=# select name, setting from pg_settings where name like '%cost';
> >          name         | setting
> > ----------------------+---------
> >  cpu_index_tuple_cost | 0.005
> >  cpu_operator_cost    | 0.0025
> >  cpu_tuple_cost       | 0.01
> >  random_page_cost     | 4
> >  seq_page_cost        | 1
> > (5 rows)
> >
> > To model an all-in-RAM database, you can either dial down both
> > random_page_cost and seq_page_cost to 0.1 or so, or set random_page_cost
> > to 1 and increase all the CPU costs.  The former is less effort ;-)
> >
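
(For anyone wanting to try the first option, a minimal sketch; the same two
settings can also go in postgresql.conf, and 0.1 is just the suggested
starting point:)

set seq_page_cost = 0.1;
set random_page_cost = 0.1;

-- verify:
select name, setting from pg_settings where name like '%page_cost';
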
> > It should be noted also that there's not all that much evidence backing
> > up the default values of the cpu_xxx_cost variables.  In the past those
> > didn't matter much because I/O costs always swamped CPU costs anyway.
> > But I can foresee us having to twiddle those defaults and maybe refine
> > the CPU cost model more, as all-in-RAM cases get more common.
>
> Especially the context switch + copy between shared buffers and the
> system disk cache will become noticeable at these speeds.
>
> An easy way to test it is loading a table with a few indexes, once with
> a shared_buffers value that is enough for only the main table, and once
> with one that fits both the table and indexes.

OK, just to back this up, I ran the following test with 28MB and 128MB
shared_buffers.

create table sbuf_test(f1 float, f2 float, f3 float);
create index sbuf_test1 on sbuf_test(f1);
create index sbuf_test2 on sbuf_test(f2);
create index sbuf_test3 on sbuf_test(f3);

and then ran the following three times for each shared_buffers setting:

truncate sbuf_test;
insert into sbuf_test
select random(), random(), random() from generate_series(1,600000);
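
(For the record, an easy way to capture timings like the ones below is
psql's \timing toggle:)

\timing
-- psql now prints "Time: ... ms" after each statement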

The main table was 31MB and the indexes were 18MB each, for a total size
of 85MB.
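
(Sizes like these can be read straight from the catalogs, for example:)

select pg_size_pretty(pg_relation_size('sbuf_test'));        -- main table
select pg_size_pretty(pg_relation_size('sbuf_test1'));       -- one index
select pg_size_pretty(pg_total_relation_size('sbuf_test'));  -- table + indexes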

With 128MB of shared buffers, the insert ran in 14 sec (+/- 1 sec).

With 28MB of shared buffers, the insert took between 346 and 431 sec,
that is, 20-30 _times_ slower.

There was ample space for keeping the indexes in the Linux page cache
(it currently has 1GB cached), though the system may have decided to
start writing them to disk, so I suspect that most of the time was spent
copying random index pages back and forth between shared buffers and the
disk cache.

I did not verify this, so there may be some other factors involved, but
this seems like the most obvious suspect.
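
(One way to actually verify it would be the pg_buffercache contrib module,
which shows which relation each shared buffer currently holds; a sketch,
assuming the module is installed:)

select c.relname, count(*) as buffers
  from pg_buffercache b
  join pg_class c
    on b.relfilenode = c.relfilenode
   and b.reldatabase in (0, (select oid from pg_database
                             where datname = current_database()))
 group by c.relname
 order by buffers desc
 limit 10;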

--
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
   Services, Consulting and Training


