Re: New FSM patch - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: New FSM patch
Msg-id 48C7EC43.7060204@enterprisedb.com
In response to Re: New FSM patch  (Zdenek Kotala <Zdenek.Kotala@Sun.COM>)
Responses Re: New FSM patch  (Zdenek Kotala <Zdenek.Kotala@Sun.COM>)
Re: New FSM patch  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Re: New FSM patch  (Zdenek Kotala <Zdenek.Kotala@Sun.COM>)
List pgsql-hackers
Zdenek Kotala wrote:
> Yesterday, I started to reviewing your patch. 

Thanks!

> 1) If I understand correctly the main goal is to improve FSM to cover 
> all pages in file which is useful for huge database.

That's not a goal per se, though it's true that the new FSM does cover 
all pages. The goals are to:
- eliminate the max_fsm_pages and max_fsm_relations GUC variables, so that
there's one less thing to configure
- make the FSM immediately available and useful after recovery (e.g. warm
standby)
- make it possible to update the FSM retail, i.e. piecemeal rather than in
bulk, which will be needed for partial vacuum
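
To illustrate what a retail update means here, below is a toy sketch in Python. It is not code from the patch and does not claim to match its page layout. The names ToyFSM, update and search are made up for illustration. It assumes a map that keeps one free-space value per heap page under a binary max-tree, so that changing a single page's entry only touches that entry and its ancestors.

```python
class ToyFSM:
    """Toy free space map: a binary max-tree over per-page free-space values.

    Hypothetical illustration only; not the data structure from the patch.
    """

    def __init__(self, npages):
        # Round the leaf count up to a power of two so the tree is complete.
        self.nleaves = 1
        while self.nleaves < npages:
            self.nleaves *= 2
        # tree[1] is the root; leaves live at tree[nleaves .. 2*nleaves - 1].
        self.tree = [0] * (2 * self.nleaves)

    def update(self, page, free):
        """Retail update: record free space for one page, then fix its ancestors."""
        i = self.nleaves + page
        self.tree[i] = free
        i //= 2
        while i >= 1:
            # Each inner node stores the maximum of its two children.
            self.tree[i] = max(self.tree[2 * i], self.tree[2 * i + 1])
            i //= 2

    def search(self, needed):
        """Return a page with at least `needed` free bytes, or None if there is none."""
        if self.tree[1] < needed:
            return None
        i = 1
        while i < self.nleaves:
            # Descend into a child whose subtree maximum is large enough.
            i = 2 * i if self.tree[2 * i] >= needed else 2 * i + 1
        return i - self.nleaves
```

In this sketch a single update costs time proportional to the tree height, independent of how many other pages the map covers, which is the property a partial vacuum would rely on.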

> 2) Did you perform any benchmark? Is there any performance improvement 
> or penalty?

Working on it. I've benchmarked some bulk-insertion scenarios, and the
new FSM is now comparable to the current implementation on those tests. 
See the o

I've also been working on a low-level benchmark using a C user-defined
function that exercises just the FSM, showing the raw CPU performance
compared with the current implementation. More on that later, but at the
moment it looks like the new implementation can be faster or slower than
the current one, depending on the table size.

The biggest potential performance issue, however, is the fact that the
new FSM implementation is WAL-logged. That shows up dramatically in the
raw test, where there's no activity other than FSM lookups and updates,
but should matter much less in real life, where FSM lookups are
always related to some other updates which are WAL-logged anyway.

I also ran some DBT-2 tests without think times, with a small number of
warehouses. But the results varied so much from test to test that any
difference in FSM speed would've been lost in the noise.

Do you still have the iGen setup available? Want to give it a shot?

> 3) How it works when database has many active parallel connections?

The new FSM should in principle scale better than the old one. However, 
Simon raised a worry about the WAL-logging: WALInsertLock can already
become the bottleneck in OLTP scenarios with very high load and many
CPUs. The FSM isn't any worse than other actions that generate WAL, but 
naturally if you're bottlenecked by the WAL lock or bandwidth, any 
increase in WAL traffic will show up as an overall performance loss.

I'm not too worried about that, myself, because in typical scenarios the 
extra WAL traffic generated by the FSM should be insignificant in volume 
compared to all the other WAL traffic. But Simon will probably demand 
some hard evidence of that ;-).

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

