Re: lwlock contention with SSI - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: lwlock contention with SSI
Msg-id 1412712385.80067.YahooMailNeo@web122306.mail.ne1.yahoo.com
In response to Re: lwlock contention with SSI  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: lwlock contention with SSI
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Oct 7, 2014 at 2:40 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
>> Robert Haas <robertmhaas@gmail.com> wrote:
>>> About a month ago, I told Kevin Grittner in an off-list conversation
>>> that I'd work on providing him with some statistics about lwlock
>>> contention under SSI.  I then ran a benchmark on a 16-core,
>>> 64-hardware thread IBM server, testing read-only pgbench performance
>>> at scale factor 300 with 1, 8, and 32 clients (and an equal number of
>>> client threads).
>>
>> I hate to say this when I know how much work benchmarking is, but I
>> don't think any benchmark of serializable transactions has very
>> much value unless you set any transactions which don't write to
>> READ ONLY.  I guess it shows how a naive conversion by someone who
>> doesn't read the docs or chooses to ignore the advice on how to get
>> good performance will perform, but how interesting is that?
>>
>> It might be worth getting TPS numbers from the worst-looking test
>> from this run, but with the read-only run done after changing
>> default_transaction_read_only = on.  Some shops using serializable
>> transactions set that in the postgresql.conf file, and require that
>> any transaction which will be modifying data override it.
>
> Well, we could do that.  But I'm not sure it's very realistic.  The
> pgbench workload is either 100% write or 100% read, but most real
> work-loads are mixed; say, 95% read, 5% write.  If the client software
> has to be responsible for flipping default_transaction_read_only for
> every write transaction, or just doing BEGIN TRANSACTION READ WRITE
> and COMMIT around each otherwise-single-statement write transaction,
> that's a whole bunch of extra server round trips and complexity that
> most people are not going to want to bother with.

Well, people using serializable transactions have generally opted
to deal with that rather than using SELECT ... FOR UPDATE, LOCK
TABLE, etc.  There's no free lunch, and changing BEGIN to BEGIN
TRANSACTION READ WRITE for those transactions which are expected to
write data is generally a lot less bother than the other.  In fact,
most software I have seen using this has a transaction manager in
the Java code which pays attention to the definition of each type
of transaction -- so you override a default in a declaration.
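
As a minimal sketch of that kind of declaration-driven transaction
manager (in Python rather than Java for brevity; TxDefinition,
begin_statement, and run are hypothetical names, and execute stands
in for whatever sends SQL to the server), assuming read-only is the
default and only writers override it:

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class TxDefinition:
    """Hypothetical declaration of one transaction type."""
    name: str
    writes: bool = False  # read-only unless the declaration overrides it


def begin_statement(tx: TxDefinition) -> str:
    """Build the BEGIN statement implied by the declaration."""
    mode = "READ WRITE" if tx.writes else "READ ONLY"
    return f"BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE, {mode}"


def run(tx: TxDefinition, statements: Iterable[str],
        execute: Callable[[str], object]) -> None:
    """Run the statements inside a transaction of the declared type."""
    execute(begin_statement(tx))
    try:
        for statement in statements:
            execute(statement)
    except Exception:
        execute("ROLLBACK")
        raise
    execute("COMMIT")


if __name__ == "__main__":
    # Print the SQL instead of sending it to a server.
    run(TxDefinition("report"), ["SELECT 1"], print)
    run(TxDefinition("transfer", writes=True), ["SELECT 2"], print)
```

Application code only names the transaction type; the READ ONLY or
READ WRITE choice lives in the declaration.  A real manager would
also retry on serialization failures, which this sketch omits.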

> We can tell them that they have to do it anyway, of course.

The docs already recommend it.

I really would like to see the LW locking issues in SSI brought
up-to-date with the rest of the code, but I would rather focus on
the bottlenecks where people are fundamentally using good technique
than on cases where they are not following the advice in the
docs[1], when following it would massively boost performance
without any change to PostgreSQL.

A paper from the University of Sydney[2] found that in their tests
the bottleneck was the linked lists which track read-write
dependencies, reporting that at a concurrency of 128, "Our
profiling showed that PostgreSQL spend 2.3% of the overall runtime
in traversing these list, plus 10% of its runtime waiting on the
corresponding kernel mutexes."  This list is covered by
SerializableXactHashLock, so either or both of converting this to
something which is not O(N^2) or using lock-free access would
probably make a big difference in contention at higher concurrency.
(I think we may be able to do one or the other, but not both.)
Further tests may identify other bottlenecks in reasonable
workloads, but this seems sure to be one which needs attention.
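
To illustrate why the list traversal grows quadratically, here is a
toy model, not PostgreSQL's actual data structures: ListConflicts,
HashConflicts, and record_edges are hypothetical names, and the
sketch says nothing about locking.  It only counts how many entries
are examined while recording N distinct read-write dependencies:

```python
class ListConflicts:
    """Dependencies kept in a plain list: each lookup scans the list."""

    def __init__(self):
        self.edges = []
        self.steps = 0  # entries examined so far

    def has(self, reader, writer):
        for edge in self.edges:
            self.steps += 1
            if edge == (reader, writer):
                return True
        return False

    def add(self, reader, writer):
        # Avoid duplicates, which forces a full scan on every new edge.
        if not self.has(reader, writer):
            self.edges.append((reader, writer))


class HashConflicts:
    """Same interface backed by a hash set: one probe per operation."""

    def __init__(self):
        self.edges = set()
        self.steps = 0  # hash probes so far

    def has(self, reader, writer):
        self.steps += 1
        return (reader, writer) in self.edges

    def add(self, reader, writer):
        self.steps += 1
        self.edges.add((reader, writer))


def record_edges(tracker, n):
    """Record n distinct reader->writer edges; return the work counted."""
    for i in range(n):
        tracker.add(i, i + 1)
    return tracker.steps


if __name__ == "__main__":
    print(record_edges(ListConflicts(), 128))  # 8128 = 128 * 127 / 2
    print(record_edges(HashConflicts(), 128))  # 128
```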

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] http://www.postgresql.org/docs/current/interactive/transaction-iso.html#XACT-SERIALIZABLE

[2] Hyungsoo Jung, Hyuck Han, Alan Fekete, Uwe Röhm, and Heon Y.
Yeom.  Performance of Serializable Snapshot Isolation on Multicore
Servers.  Technical Report 693, The University of Sydney School of
Information Technologies, December, 2012.
http://sydney.edu.au/engineering/it/research/tr/tr693.pdf
(Quote is from section 5.2, Shared System Data Structures,
subsection PostgreSQL.)


