Re: testing HS/SR - 1 vs 2 performance - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: testing HS/SR - 1 vs 2 performance
Date
Msg-id 1271246893.8305.1294.camel@ebony
Whole thread Raw
In response to Re: testing HS/SR - 1 vs 2 performance  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: testing HS/SR - 1 vs 2 performance  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
On Tue, 2010-04-13 at 21:09 +0300, Heikki Linnakangas wrote:
> Heikki Linnakangas wrote:
> > I could reproduce this on my laptop, standby is about 20% slower. I ran
> > oprofile, and what stands out as the difference between the master and
> > standby is that on standby about 20% of the CPU time is spent in
> > hash_seq_search(). The callpath is GetSnapshotDat() ->
> > KnownAssignedXidsGetAndSetXmin() -> hash_seq_search(). That explains the
> > difference in performance.
> 
> The slowdown is proportional to the max_connections setting in the
> standby. 20% slowdown might still be acceptable, but if you increase
> max_connections to say 1000, things get really slow. I wouldn't
> recommend max_connections=1000, of course, but I think we need to do
> something about this. Changing the KnownAssignedXids data structure from
> hash table into something that's quicker to scan. Preferably something
> with O(N), where N is the number of entries in the data structure, not
> the maximum number of entries it can hold as it is with the hash table
> currently.

There's a tradeoff here to consider. KnownAssignedXids faces two
workloads: one for each WAL record where we check if the xid is already
known assigned, one for snapshots. The current implementation is
optimised towards recovery performance, not snapshot performance.

> A quick fix would be to check if there's any entries in the hash table
> before scanning it. That would eliminate the overhead when there's no
> in-progress transactions in the master. But as soon as there's even one,
> the overhead comes back.

Any fix should be fairly quick because of the way its modularised - with
something like this in mind.

I'll try a circular buffer implementation, with fastpath.

Have something in a few hours.

-- Simon Riggs           www.2ndQuadrant.com



pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: master in standby mode croaks
Next
From: Simon Riggs
Date:
Subject: Re: How to generate specific WAL records?