Re: Relation extension scalability - Mailing list pgsql-hackers
From | Dilip Kumar |
---|---|
Subject | Re: Relation extension scalability |
Date | |
Msg-id | CAFiTN-u2dZiXLP9jMEtA+OanwE8VPUPFmcJkk+y4c64Sht_n=A@mail.gmail.com Whole thread Raw |
In response to | Re: Relation extension scalability (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Relation extension scalability
|
List | pgsql-hackers |
On Sat, Jan 23, 2016 at 12:19 PM, Amit Kapila <amit.kapila16@gmail.com> wrote
I will fix in next version of patch.
These pages are not going to be used immediately and we have done PageInit so i think it should be marked dirty before adding to FSM, so that if buffer get replaced out then it flushes the init data.
Yes, if HEAP_INSERT_SKIP_FSM is used and we use multi-extend atleast in current transaction it will not take pages from FSM and everytime it will do multi-extend, however pages will be used if there are parallel backend, but still not a good idea to extend every time in multiple chunk in current backend.
Ok
Few comments about patch:
Thanks for reviewing..
1.Patch is not getting compiled.1>src/backend/access/heap/hio.c(480): error C2065: 'buf' : undeclared identifier1>src/backend/access/heap/hio.c(480): error C2065: 'buf' : undeclared identifier1>src/backend/access/heap/hio.c(480): error C2065: 'buf' : undeclared identifier
Oh, My mistake, my preprocessor is ignoring this error and relacing it with BLKSIZE
I will fix in next version of patch.
2.! page = BufferGetPage(buffer);! PageInit(page, BufferGetPageSize(buf), 0);!! freespace = PageGetHeapFreeSpace(page);!MarkBufferDirty(buffer);! UnlockReleaseBuffer(buffer);!RecordPageWithFreeSpace(relation, BufferGetBlockNumber(buffer), freespace);What is the need to mark page dirty here, won't it automaticallybe markerd dirty once the page is used? I think it is requiredif you wish to WAL-log this action.
These pages are not going to be used immediately and we have done PageInit so i think it should be marked dirty before adding to FSM, so that if buffer get replaced out then it flushes the init data.
3. I think you don't need to multi-extend a relation ifHEAP_INSERT_SKIP_FSM is used, as for that case it anyways try toget a new page by extending a relation.
Yes, if HEAP_INSERT_SKIP_FSM is used and we use multi-extend atleast in current transaction it will not take pages from FSM and everytime it will do multi-extend, however pages will be used if there are parallel backend, but still not a good idea to extend every time in multiple chunk in current backend.
So i will change this..
4. Again why do you need this multi-extend optimization for localrelations (those only accessible to current backend)?
I think we can change this while adding the table level "extend_by_blocks" for local table we will not allow this property, so no need to change at this place.
What do you think ?
5. Do we need this for nbtree as well? One way to check thatis by Copying large data in table having index.
Ok, i will try this test and update.
Note: Test with both data and WAL on Magnetic Disk : No significant improvement visible
-- I think wall write is becoming bottleneck in this case.In that case, can you try the same test with un-logged tables?
OK
Also, it is good to check the performance of patch with read-write workload to ensure that extending relation in multiple-chunks should notregress such cases.
Ok
Currently i have kept extend_num_page as session level parameter but i think later we can make this as table property.
Any suggestion on this ?I think we should have a new storage_parameter at table levelextend_by_blocks or something like that instead of GUC. Thedefault value of this parameter should be 1 which means retaincurrent behaviour by default.
+1
--
pgsql-hackers by date: