Re: Relation extension scalability - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Relation extension scalability |
Date | |
Msg-id | CA+TgmoYy8cYhAnY2i4KDqJeZWXE7pAR4OROfKAz05-AK=PEe6g@mail.gmail.com Whole thread Raw |
In response to | Re: Relation extension scalability (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses |
Re: Relation extension scalability
|
List | pgsql-hackers |
On Fri, Mar 25, 2016 at 1:05 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote: > On Fri, Mar 25, 2016 at 3:04 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> 1. Callers who use GetPageWithFreeSpace() rather than >> GetPageFreeSpaceExtended() will fail to find the new pages if the >> upper map levels haven't been updated by VACUUM. >> >> 2. Even callers who use GetPageFreeSpaceExtended() may fail to find >> the new pages. This can happen in two separate ways, namely (a) the > > Yeah, that's the issue, if extended pages spills to next FSM page, then > other waiters will not find those page, and one by one all waiters will end > up adding extra pages. > for example, if there are ~30 waiters then > total blocks extended = (25(25+1)/2) *20 =~ 6500 pages. > > This is not the case every time but whenever heap block go to new FSM page > this will happen. I think we need to start testing these patches not only in terms of how *fast* they are but how *large* the relation ends up being when we're done. A patch that inserts the rows slower but the final relation is smaller may be better overall. Can you retest v13, v14, and master, and post not only the timings but the relation size afterwards? And maybe post the exact script you are using? > performance of patch v14 is significantly low compared to v13, mainly I > guess below reasons > 1. As per above calculation v13 extend ~6500 block (after every 8th extend), > and that's why it's performing well. That should be completely unnecessary, though. I mean, if the problem is that it's expensive to repeatedly acquire and release the relation extension lock, then bulk-extending even 100 blocks at a time should be enough to fix that, because you've reduced the number of times that has to be done by 99%. There's no way we should need to extend by thousands of blocks to get good performance. Maybe something like this would help: if (needLock) { if (!use_fsm) LockRelationForExtension(relation, ExclusiveLock); else if (!ConditionLockRelationForExtension(relation,ExclusiveLock)) { BlockNumber last_blkno = RelationGetNumberOfBlocks(relation); targetBlock = GetPageWithFreeSpaceExtended(relation, last_blkno, len + saveFreeSpace); if (targetBlock != InvalidBlockNumber) goto loop; LockRelationForExtension(relation, ExclusiveLock); targetBlock = GetPageWithFreeSpace(relation, len+ saveFreeSpace); if (targetBlock != InvalidBlockNumber) { UnlockRelationForExtension(relation,ExclusiveLock); goto loop; } RelationAddExtraBlocks(relation,bistate); } } I think this is better than what you had before with lastValidBlock, because we're actually interested in searching the free space map at the *very end* of the relation, not wherever the last target block happens to have been. We could go further still and have GetPageWithFreeSpace() always search the last, say, two pages of the FSM in all cases. But that might be expensive. The extra call to RelationGetNumberOfBlocks seems cheap enough here because the alternative is to wait for a contended heavyweight lock. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: