Re: Relation extension scalability - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Relation extension scalability
Msg-id CAFiTN-t5mDimOkBQpbewBjgKPN8XFzJAgC7OzBLJqQ35UebfEg@mail.gmail.com
In response to Re: Relation extension scalability  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers

On Fri, Mar 25, 2016 at 3:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> 1. Callers who use GetPageWithFreeSpace() rather than
> GetPageFreeSpaceExtended() will fail to find the new pages if the
> upper map levels haven't been updated by VACUUM.
>
> 2. Even callers who use GetPageFreeSpaceExtended() may fail to find
> the new pages.  This can happen in two separate ways, namely (a) the

Yeah, that's the issue: if the extended pages spill over to the next FSM page, then the other waiters will not find those pages, and one by one all the waiters will end up adding extra pages.
For example, if there are ~25 waiters, then
total blocks extended = (25*(25+1)/2) * 20 =~ 6500 pages.
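
A quick back-of-the-envelope check of that number (my own illustration, assuming the v13 behaviour that each extender adds 20 blocks per waiter still queued):

    #include <stdio.h>

    /*
     * Blow-up sketch: 25 queued waiters, each becoming the extender in
     * turn and extending (remaining waiters * 20) blocks, because it
     * cannot see the pages the previous extender already added.
     */
    int
    main(void)
    {
        int     total = 0;
        int     waiters;

        for (waiters = 25; waiters >= 1; waiters--)
            total += waiters * 20;

        printf("total blocks extended: %d\n", total);   /* prints 6500 */
        return 0;
    }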

This is not the case every time, but whenever the heap blocks cross over to a new FSM page this will happen.

- An FSM page can hold the free-space info of 4096 heap blocks, so with 512 blocks extended at a time, every 8th extension (4096 / 512 = 8) starts a new FSM page and triggers this ~6500-page blow-up.
 
- Any new request through RelationGetBufferForTuple() will be able to find those pages, because by that time the backend that extended them will have set the new block using RelationSetTargetBlock(), as sketched below.
(There is still a chance that some blocks are left completely unused until vacuum comes along.)
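
For illustration only (not the patch itself), a minimal sketch of that extend-then-cache pattern using the stock buffer-manager and FSM APIs; treating a fresh page as all free space minus the page header is an approximation:

    #include "postgres.h"
    #include "storage/bufmgr.h"
    #include "storage/bufpage.h"
    #include "storage/freespace.h"
    #include "utils/rel.h"

    /*
     * Extend the relation by one page, publish its free space in the
     * FSM leaf level, and cache it as this backend's target block so
     * the next RelationGetBufferForTuple() finds it without the FSM.
     */
    static void
    extend_and_record(Relation relation)
    {
        Buffer      buffer = ReadBuffer(relation, P_NEW);
        BlockNumber blockNum = BufferGetBlockNumber(buffer);

        /* A brand-new page is (roughly) all free space. */
        RecordPageWithFreeSpace(relation, blockNum,
                                BLCKSZ - SizeOfPageHeaderData);
        RelationSetTargetBlock(relation, blockNum);
        ReleaseBuffer(buffer);
    }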

I have changed the patch as per the suggestion (just a POC, because the performance numbers are not that great).

Below is a performance comparison of base, the previous patch (v13), and the latest patch (v14).

The performance of patch v14 is significantly lower than that of v13; I guess mainly for the reasons below.
1. As per the above calculation, v13 extends ~6500 blocks (after every 8th extension), and that's why it performs well.

2. In v13, as soon as we extend a block we add it to the FSM, so it is immediately available to any new requester. (In this patch I also tried adding the pages to the FSM one by one and updating the FSM tree up to the root only after all pages were added, but saw no significant improvement; a sketch of what I tried follows after this list.)

3. fsm_update_recursive() doesn't seem like a problem to me. Does it?
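
A minimal sketch of the approach from point 2 (my own reconstruction, assuming the 9.6-era freespace.c API; the patch uses the internal helper fsm_update_recursive() rather than a full FreeSpaceMapVacuum()):

    #include "postgres.h"
    #include "storage/bufpage.h"
    #include "storage/freespace.h"
    #include "utils/rel.h"

    /*
     * Record every newly extended page in the FSM leaf level, then
     * propagate the new values up to the FSM root once at the end,
     * rather than on every page.
     */
    static void
    record_extended_pages(Relation rel, BlockNumber firstBlock, int nblocks)
    {
        int         i;

        for (i = 0; i < nblocks; i++)
            RecordPageWithFreeSpace(rel, firstBlock + i,
                                    BLCKSZ - SizeOfPageHeaderData);

        FreeSpaceMapVacuum(rel);    /* update the FSM tree up to the root */
    }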


Copy 10000 tuples of 4 bytes each
---------------------------------------------
Client     base   patch v13   patch v14
1           118         147         126
2           217         276         269
4           210         421         347
8           166         630         375
16          145         813         415
32          124         985         451
64            -         974         455

Insert 1000 tuples of 1K size each
---------------------------------------------
Client     base   patch v13   patch v14
1           117         124         119
2           111         126         119
4            51         128         124
8            43         149         131
16           40         217         120
32            -         263         115
64            -         248         109

Note: I think any single thread-count number can be just run-to-run variance.

Does anyone see a problem in updating the FSM tree this way? I have debugged it and seen that we are able to get the pages properly from the tree, and the same is visible in the performance numbers of v14 compared to base.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
