From: "Craig Ringer" <craig@2ndquadrant.com>
> It sounds like they've produced a test case, so they should be able to
> with a bit of luck.
>
> Or even better, send you the test case.
I asked the user about this. It sounds like the relevant test case consists
of many scripts. He explained to me that the simplified test steps are:
1. initdb
2. pg_ctl start
3. Create 16 tables. Each of those tables consists of around 10 columns.
4. Insert 1000 rows into each of those 16 tables.
5. Launch 16 psql sessions concurrently. Each session updates all 1000 rows
of one table, e.g., session 1 updates table 1, session 2 updates table 2,
and so on.
6. Repeat step 5 fifty times.
This sounds a bit complicated, but I understood that the core part is 16
concurrent updates, which should lead to contention on xlog insert slots
and/or spinlocks.
> Your next step here really needs to be to make this reproducible against
> a debug build. Then see if reverting the xlog scalability work actually
> changes the behaviour, given that you hypothesised that it could be
> involved.
Thank you, but that may be labor-intensive and time-consuming. In addition,
the user uses a machine with multiple CPU cores, while I only have a desktop
PC with two CPU cores. So I doubt I can reproduce the problem on my PC.
I asked the user to change S_UNLOCK to something like the following and run
the test over this weekend (next Monday is a national holiday in Japan).
#define S_UNLOCK(lock) InterlockedExchange(lock, 0)
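To illustrate what that change is meant to do, here is a minimal portable
sketch. It assumes C11 atomics as a stand-in for the Win32
InterlockedExchange call, which Microsoft documents as generating a full
memory barrier. The names demo_slock_t, demo_tas, demo_unlock_store,
demo_unlock_xchg and demo_selfcheck are hypothetical and are not taken from
the PostgreSQL source; the real S_UNLOCK in s_lock.h may differ in detail.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a spinlock word; not PostgreSQL's slock_t. */
typedef atomic_long demo_slock_t;

/* Test-and-set: returns 0 if we took the lock, nonzero if it was held. */
static long demo_tas(demo_slock_t *lock)
{
    return atomic_exchange(lock, 1L);
}

/* Release with a store that carries release ordering only. */
static void demo_unlock_store(demo_slock_t *lock)
{
    atomic_store_explicit(lock, 0L, memory_order_release);
}

/*
 * Release with a sequentially consistent atomic exchange, used here as a
 * portable analogue (an assumption) of InterlockedExchange(lock, 0).
 * Returns the previous value of the lock word.
 */
static long demo_unlock_xchg(demo_slock_t *lock)
{
    return atomic_exchange(lock, 0L);
}

/* Single-threaded sanity check; returns 1 if the lock ends up free. */
static int demo_selfcheck(void)
{
    demo_slock_t lock;

    atomic_init(&lock, 0L);
    assert(demo_tas(&lock) == 0);          /* acquired */
    assert(demo_tas(&lock) != 0);          /* already held */
    demo_unlock_store(&lock);
    assert(demo_tas(&lock) == 0);          /* acquired again */
    assert(demo_unlock_xchg(&lock) == 1);  /* it was held until now */
    return atomic_load(&lock) == 0;
}
```

Calling demo_selfcheck() from a main() only confirms that both release
paths free the lock. It does not demonstrate the ordering difference
itself, which would only show up under concurrent load on a multi-core
machine like the user's.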
FYI, the user reported today that the problem didn't occur when he ran the
same test for 24 hours on 9.3.5. Do you see anything relevant in 9.4?
Regards
MauMau