Re: pg_stat_lwlocks view - lwlocks statistics, round 2 - Mailing list pgsql-hackers

From Robert Haas
Subject Re: pg_stat_lwlocks view - lwlocks statistics, round 2
Date
Msg-id CA+TgmoZXvaOfQbkMkE4qBA1W6XwRDdGLyZWqjSpkbSK1jfmT7g@mail.gmail.com
Whole thread Raw
In response to Re: pg_stat_lwlocks view - lwlocks statistics, round 2  (Satoshi Nagayasu <snaga@uptime.jp>)
Responses Re: pg_stat_lwlocks view - lwlocks statistics, round 2
Re: pg_stat_lwlocks view - lwlocks statistics, round 2
List pgsql-hackers
On Tue, Oct 16, 2012 at 11:31 AM, Satoshi Nagayasu <snaga@uptime.jp> wrote:
> A flight-recorder must not be disabled. Collecting
> performance data must be top priority for DBA.

This analogy is inapposite, though, because a flight recorder rarely
crashes the aircraft.  If it did, people might have second thoughts
about the "never disable the flight recorder" rule.  I have had a
couple of different excuses to look into the overhead of timing
lately, and it does indeed seem that on many modern Linux boxes even
extremely frequent gettimeofday calls produce only very modest amounts
of overhead.  Sadly, the situation on Windows doesn't look so good.  I
don't remember the exact numbers but I think it was something like 40
or 60 or 80 times slower on the Windows box one of my colleagues
tested than it is on Linux.  And it turns out that that overhead
really is measurable and does matter if you do it in a code path that
gets run frequently.  Of course I am enough of a Linux geek that I
don't use Windows myself and curse my fate when I do have to use it,
but the reality is that we have a huge base of users who only use
PostgreSQL at all because it runs on Windows, and we can't just throw
those people under the bus.  I think that older platforms like HP/UX
likely have problems in this area as well although I confess to not
having tested.

That having been said, if we're going to do this, this is probably the
right approach, because it only calls gettimeofday() in the case where
the lock acquisition is contended, and that is a lot cheaper than
calling it in all cases.  Maybe it's worth finding a platform where
pg_test_timing reports that timing is very slow and then measuring how
much impact this has on something like a pgbench or pgbench -S
workload.  We might find that it is in fact negligible.  I'm pretty
certain that it will be almost if not entirely negligible on Linux but
that's not really the case we need to worry about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Gezeala M. Bacuño II
Date:
Subject: Re: [BUGS] BUG #7521: Cannot disable WAL log while using pg_dump
Next
From: Robert Haas
Date:
Subject: Re: Deprecating RULES