Re: adding new pages bulky way - Mailing list pgsql-hackers

From Qingqing Zhou
Subject Re: adding new pages bulky way
Date
Msg-id d868sp$1efk$1@news.hub.org
In response to adding new pages bulky way  ("Victor Y. Yegorov" <viy@mits.lv>)
Responses Re: adding new pages bulky way
List pgsql-hackers
"Tom Lane" <tgl@sss.pgh.pa.us> writes
>
> I very seriously doubt that there would be *any* win
>

I did a quick proof-of-concept implementation to test non-concurrent batch
insertion; here are the results:

Environment:
  - Pg8.0.1
  - NTFS / IDE


-- batch 16 pages extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4167.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 8111.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 16444.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 41980.000 ms

-- batch 32 pages extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4086.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 7861.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 16403.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 41290.000 ms

-- batch 64 pages extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4236.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 8202.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 17265.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 44063.000 ms

-- batch 128 pages extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4256.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 8242.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 17375.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 43854.000 ms

-- one page extension --
test=# insert into t select * from t;
INSERT 0 131072
Time: 4496.000 ms
test=# insert into t select * from t;
INSERT 0 262144
Time: 9013.000 ms
test=# insert into t select * from t;
INSERT 0 524288
Time: 19508.000 ms
test=# insert into t select * from t;
INSERT 0 1048576
Time: 49962.000 ms

The benefit is there: roughly a 10% improvement if we select a good batch
size. The explanation is this: if a batch insertion needs 6400 new pages,
the original code does write() plus file system logging 6400 times, whereas
now it does so 6400/64 = 100 times (though each call costs more). Also,
since writes of different sizes have different costs, 32 seems to be the
optimal choice for my machine.
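
To make the arithmetic above easy to re-check, here is a small sketch in
Python. The timings are copied from the psql output above; the function
names are mine and purely illustrative, and summing the four runs per batch
size is just one reasonable way to compare them.

```python
import math

# Elapsed times in ms, copied from the psql output above, keyed by batch
# size in pages (1 = the original one-page-at-a-time extension).
TIMINGS_MS = {
    1:   [4496, 9013, 19508, 49962],
    16:  [4167, 8111, 16444, 41980],
    32:  [4086, 7861, 16403, 41290],
    64:  [4236, 8202, 17265, 44063],
    128: [4256, 8242, 17375, 43854],
}


def extension_calls(new_pages, batch_pages):
    """Number of file-extension writes needed to add new_pages pages."""
    return math.ceil(new_pages / batch_pages)


def improvement_pct(batch_pages):
    """Reduction in total elapsed time relative to one-page extension."""
    base = sum(TIMINGS_MS[1])
    return 100.0 * (base - sum(TIMINGS_MS[batch_pages])) / base


if __name__ == "__main__":
    print(extension_calls(6400, 1), "vs", extension_calls(6400, 64), "writes")
    for batch in (16, 32, 64, 128):
        print(f"batch {batch:>3}: {improvement_pct(batch):.1f}% less total time")
```

Summed over the four runs, the batched variants come out roughly 11% to 16%
faster than one-page extension, with 32 pages the best of the sizes tried.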

The changes I made:

(1) md.c
Modify function mdextend():
  - extend 64 pages each time;
  - after extension, let FSM be aware of it (change FSM a little bit so it
    could report free space also for an empty page)

(2) bufmgr.c
Make ReadPage(+empty_page) treat an empty page differently from a non-empty
one, to avoid an unnecessary read for new pages. That is:

    if (!empty_page)
        smgrread(reln->rd_smgr, blockNum, (char *) MAKE_PTR(bufHdr->data));
    else
        /* Only for heap pages, and there could be a race here ... */
        PageInit((char *) MAKE_PTR(bufHdr->data), BLCKSZ, 0);

(3) hio.c
RelationGetBufferForTuple():
  - pass the correct "empty_page" parameter to ReadPage() according to the
    query result from FSM.
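
For readers who want to see how the three changes fit together, here is a
toy model in Python. It is only a sketch under stated assumptions, not the
actual patch and not PostgreSQL code: ToyRelation, its methods, and demo()
are hypothetical names, a plain Python set stands in for the FSM, and pages
are never written back. It only counts how many extension writes and disk
reads a run of page allocations would cause.

```python
import os
import tempfile

BLCKSZ = 8192  # PostgreSQL's default page size


class ToyRelation:
    """Toy stand-in for a heap file (hypothetical, not md.c/bufmgr.c)."""

    def __init__(self, path, batch_pages=64):
        self.f = open(path, "w+b", buffering=0)
        self.batch_pages = batch_pages
        self.nblocks = 0
        self.empty_blocks = set()  # stand-in for what the FSM would know
        self.write_calls = 0
        self.read_calls = 0

    def extend(self):
        """Like change (1): append batch_pages zeroed pages in one write."""
        first = self.nblocks
        self.f.seek(first * BLCKSZ)
        self.f.write(bytes(BLCKSZ * self.batch_pages))
        self.write_calls += 1
        self.nblocks += self.batch_pages
        # Tell the "FSM" about the new, still-empty pages.
        self.empty_blocks.update(range(first, self.nblocks))

    def get_block_for_insert(self):
        """Like change (3): ask the "FSM" for an empty page, extend if none."""
        if not self.empty_blocks:
            self.extend()
        return min(self.empty_blocks), True

    def read_page(self, block, empty_page):
        """Like change (2): skip the disk read for a known-empty page."""
        if empty_page:
            self.empty_blocks.discard(block)
            return bytearray(BLCKSZ)  # the real code would call PageInit()
        self.f.seek(block * BLCKSZ)
        self.read_calls += 1
        return bytearray(self.f.read(BLCKSZ))

    def close(self):
        self.f.close()


def demo(pages_needed=640, batch_pages=64):
    """Allocate pages_needed pages; return (writes, reads, file size in pages)."""
    with tempfile.TemporaryDirectory() as tmp:
        rel = ToyRelation(os.path.join(tmp, "rel"), batch_pages)
        for _ in range(pages_needed):
            block, empty = rel.get_block_for_insert()
            rel.read_page(block, empty)
        rel.close()
        return rel.write_calls, rel.read_calls, rel.nblocks


if __name__ == "__main__":
    print("batch 64:", demo(640, 64))
    print("batch  1:", demo(640, 1))
```

The model ignores concurrency entirely, which is exactly where the race
mentioned in (2) would show up, and it says nothing about what happens to
the unused tail of a batch when fewer pages are needed than were allocated.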

Regards,
Qingqing




