Re: Relation extension scalability - Mailing list pgsql-hackers
From: Dilip Kumar <dilipbalaut@gmail.com>
Subject: Re: Relation extension scalability
Date:
Msg-id: CAFiTN-sjzasvJZuZRSgpbLtdDZ95W2HD=htyA3us+5=BQMEvnw@mail.gmail.com
In response to: Re: Relation extension scalability (Dilip Kumar <dilipbalaut@gmail.com>)
Responses: Re: Relation extension scalability (3 replies)
List: pgsql-hackers
On Mon, Jan 25, 2016 at 11:59 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> 1. Patch is not getting compiled.
>> 1>src/backend/access/heap/hio.c(480): error C2065: 'buf' : undeclared identifier
> Oh, my mistake; my preprocessor was ignoring this error and replacing it with BLKSIZE.

FIXED
>> 3. I think you don't need to multi-extend a relation if
>> HEAP_INSERT_SKIP_FSM is used, as for that case it anyway tries to
>> get a new page by extending the relation.
> So I will change this.

FIXED
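For clarity, a minimal sketch of the intended check in hio.c's RelationGetBufferForTuple (ReadBufferBI is the existing single-block path there; RelationAddExtraBlocks and extend_by_blocks are hypothetical names standing in for the patch's multi-extend code):

    /* Sketch only: skip multi-extend when the caller bypasses the FSM.
     * HEAP_INSERT_SKIP_FSM callers extend by a single page anyway and
     * would never find pre-extended pages through the FSM. */
    if (!(options & HEAP_INSERT_SKIP_FSM) && extend_by_blocks > 1)
        RelationAddExtraBlocks(relation, bistate);       /* hypothetical helper */
    else
        buffer = ReadBufferBI(relation, P_NEW, bistate); /* existing one-block path */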
>> 4. Again, why do you need this multi-extend optimization for local
>> relations (those only accessible to the current backend)?
> I think we can change this while adding the table-level
> "extend_by_blocks": for a local table we will not allow this property,
> so no need to change at this place. What do you think?

Now I have added a table-level parameter for specifying the number of blocks, so do you think we still need to block it, given that the user can control it? Moreover, I think that if a user wants to use it for a local table there is no harm in it: by extending in one shot he at least avoids multiple calls for the extension lock, even though there will be no contention.

What is your opinion?
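If we did decide to disable it for local relations instead, the guard would be trivial; a sketch, assuming the patch's extend_by_blocks value is available at this point (RELATION_IS_LOCAL is the existing macro from utils/rel.h):

    /* Hypothetical guard: fall back to single-block extension for
     * backend-local relations, where extension-lock contention is
     * impossible anyway. */
    if (RELATION_IS_LOCAL(relation))
        extend_by_blocks = 1;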
>> 5. Do we need this for nbtree as well? One way to check that is by
>> copying large data into a table having an index.
> Ok, I will try this test and update.

I tried to load data into a table with an index and analyzed the bottleneck using perf; btcompare was taking the maximum time. Still, I don't deny that it could benefit from multi-extend.

So I tried a quick POC for this, but I realized that even though we extend multiple pages for the index and add them to the FSM, only the current FSM page is updated; information about the new free pages is propagated to the root page only during vacuum. And unlike heap, btree always searches the FSM from the root, so it will not find the extra added pages.

So I think we can analyze index multi-extend as a separate topic and find out whether there are cases where it can give a benefit.
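To make the FSM behaviour above concrete, here is a rough sketch of the POC using the existing index-FSM API (RecordFreeIndexPage, GetFreeIndexPage, and FreeSpaceMapVacuum are real functions; the multi-extend loop itself is hypothetical):

    /* Hypothetical multi-extend for an index: each new page is recorded
     * in the FSM, but RecordFreeIndexPage() only updates the FSM leaf
     * page covering that block. */
    for (i = 0; i < extra_blocks; i++)
    {
        Buffer  buf = ReadBuffer(index, P_NEW);   /* extend by one block */

        RecordFreeIndexPage(index, BufferGetBlockNumber(buf));
        ReleaseBuffer(buf);
    }

    /* Btree looks for a free page via GetFreeIndexPage(), which descends
     * the FSM from the root; the upper FSM levels still show no free
     * space, so the new pages stay invisible until something rebuilds
     * them, e.g.: */
    FreeSpaceMapVacuum(index);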
>>> Note: Test with both data and WAL on magnetic disk: no significant
>>> improvement visible -- I think WAL write is becoming the bottleneck
>>> in this case.
>> In that case, can you try the same test with un-logged tables?

Data with un-logged tables:
Test Init:
------------
./psql -d postgres -c "COPY (select g.i::text FROM generate_series(1, 10000) g(i)) TO '/tmp/copybinary' WITH BINARY";
echo "COPY data from '/tmp/copybinary' WITH BINARY;" > copy_script
./psql -d postgres -c "create unlogged table data (data text)" --> base
./psql -d postgres -c "create unlogged table data (data text) with(extend_by_blocks=50)" --patch
test_script:
------------
./psql -d postgres -c "truncate table data"
./psql -d postgres -c "checkpoint"
./pgbench -f copy_script -T 120 -c$ -j$ postgres
Shared Buffers: 48GB
Table: unlogged table
Run: ./pgbench -c$ -j$ -f copy_script -M prepared postgres

Clients   Base (tps)   Patch (tps)
      1          178           180
      2          337           338
      4          265           601
      8          167           805
>> Also, it is good to check the performance of the patch with a
>> read-write workload, to ensure that extending the relation in
>> multiple chunks does not regress such cases.
> Ok

I did not find any regression in the normal case.
Note: I tested it with the previous patch (extend_num_pages=10, a GUC parameter) so that we can see any impact on the overall system.

>>> Currently I have kept extend_num_page as a session-level parameter,
>>> but I think later we can make this a table property. Any suggestion
>>> on this?
>> I think we should have a new storage parameter at table level,
>> extend_by_blocks or something like that, instead of a GUC. The
>> default value of this parameter should be 1, which means retain
>> current behaviour by default.
> +1
Changed it to a table-level storage parameter. I kept the max limit at 100; any suggestion on this? Should we increase it?
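For reference, such a storage parameter would normally be declared in the intRelOpts table of src/backend/access/common/reloptions.c; a sketch using the values from this discussion (the actual patch may differ):

    /* Hypothetical intRelOpts entry: default 1 keeps current behaviour;
     * min 1; max 100 as discussed above. */
    {
        {
            "extend_by_blocks",
            "Number of blocks by which to extend the relation at a time",
            RELOPT_KIND_HEAP
        },
        1, 1, 100
    },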
Latest patch is attached.