Re: New FSM patch - Mailing list pgsql-hackers

From: Zdenek Kotala
Subject: Re: New FSM patch
Msg-id: 48D0F58D.8050605@sun.com
In response to: Re: New FSM patch (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses: Re: New FSM patch (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List: pgsql-hackers
Heikki Linnakangas wrote:
> Heikki Linnakangas wrote:

<snip>

> 
> Let me describe this test case first:
> - The test program calls RecordAndGetPageWithFreeSpace in a tight loop, 
> with random values. There's no activity on the heap. In normal usage, 
> the time spent in RecordAndGetPageWithFreeSpace is minuscule compared to 
> the heap and index updates that cause RecordAndGetPageWithFreeSpace to be 
> called.
> - WAL was placed on a RAM drive. This is of course not how people set up 
> their database servers, but the point of this test was to measure CPU 
> speed and scalability. The impact of writing extra WAL is significant 
> and needs to be taken into account, but that's a separate test and 
> discussion, and needs to be considered in comparison to the WAL written 
> by heap and index updates.
> 
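
Just to make sure I read this right, I imagine the benchmark loop is
something like the sketch below (my own reconstruction, not your actual
test program; only the RecordAndGetPageWithFreeSpace signature is taken
from storage/freespace.h):

#include "postgres.h"

#include "storage/freespace.h"
#include "utils/rel.h"

/*
 * Rough reconstruction of the described test: hammer
 * RecordAndGetPageWithFreeSpace() in a tight loop with random request
 * sizes, without ever touching the heap itself.
 */
static void
fsm_stress(Relation rel, int iterations)
{
    BlockNumber page = 0;
    int         i;

    for (i = 0; i < iterations; i++)
    {
        /* pretend a random amount of space was left on the old page */
        Size        oldavail = (Size) (random() % BLCKSZ);
        /* ... and ask for a page that can fit a random-sized tuple */
        Size        needed = (Size) (random() % BLCKSZ);

        page = RecordAndGetPageWithFreeSpace(rel, page, oldavail, needed);
        if (page == InvalidBlockNumber)
            page = 0;           /* nothing suitable found, start over */
    }
}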

<snip>

> 
> Another surprise was how badly both implementations scale. On CVS HEAD, 
> I expected the performance to be roughly the same with 1 and 2 clients, 
> because all access to the FSM is serialized on the FreeSpaceLock. But 
> adding the 2nd client not only didn't help, but it actually made the 
> performance much worse than with a single client. Context switching or 
> cache line contention, perhaps? The new FSM implementation shows the 
> same effect, which was an even bigger surprise. At table sizes > 32 MB, 
> the FSM no longer fits on a single FSM page, so I expected almost a 
> linear speed up with bigger table sizes from using multiple clients. 
> That's not happening, and I don't know why. Going from 33 MB to 
> 333 MB, the performance with 2 clients almost doubles, but it still 
> doesn't exceed that with 1 client.
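
(For scale, if I follow the layout correctly: the new FSM keeps one byte
per heap page, so a single 8 kB FSM page covers on the order of 4000 heap
pages, i.e. roughly 32 MB of heap. That matches where the single-page case
ends above; beyond that a second FSM level is needed.)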


I tested it with DTrace on Solaris 10 on an 8-CPU SPARC machine and got 
results similar to yours. The main problem in your new implementation is 
locking. On small tables, where the FSM fits on a single page, clients 
spend about 3/4 of their time waiting for the page lock. On medium-sized 
tables (two-level FSM), WALInsertLock becomes significant as well - it 
accounts for about 1/4 of the waiting time, while waiting for the page 
lock takes "only" 1/3.

I think the main reason for the scalability problem is that the locking 
serializes all access to the FSM.

Suggestions:

1) Remove WAL logging. I think the FSM should be reconstructed during 
replay of the other WAL records (insert, update, etc.) instead. We 
probably only need a full-page write on the first modification after a 
checkpoint. (A rough sketch of the idea follows after point 2.)

2) Break up the locking - take only a shared lock on the page, and divide 
the page into smaller parts that are locked exclusively (at least for the 
root page).
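
For (1), what I have in mind is roughly the sketch below, called from the
heap insert/update redo routines. It is only an illustration: during
recovery we don't have a Relation at hand, so RecordPageWithFreeSpace (the
existing freespace.h call) would need a variant that works on the
RelFileNode, which I'm hand-waving here.

#include "postgres.h"

#include "storage/bufmgr.h"
#include "storage/bufpage.h"
#include "storage/freespace.h"
#include "utils/rel.h"

/*
 * Sketch of suggestion (1): instead of WAL-logging FSM changes, repair
 * the FSM as a side effect of replaying the heap record that consumed
 * the space.  A stale FSM entry is harmless - it simply gets corrected
 * the next time somebody tries to use the page.
 */
static void
fsm_repair_after_redo(Relation rel, Buffer buffer)
{
    Page        page = BufferGetPage(buffer);
    Size        freespace = PageGetHeapFreeSpace(page);

    /*
     * Only update the FSM when the page has become nearly full, to keep
     * the extra work during replay small.
     */
    if (freespace < BLCKSZ / 5)
        RecordPageWithFreeSpace(rel, BufferGetBlockNumber(buffer), freespace);
}

The only WAL the FSM itself would then need is the full-page image after a
checkpoint, to guard against torn pages.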


However, your test case is too artificial. I'm going to test the patch 
with a "real" OLTP workload as well.
    Zdenek






