Re: [HACKERS] [GSOC][weekly report 3] Eliminate O(N^2) scalingfrom rw-conflict tracking in serializable transactions - Mailing list pgsql-hackers

From Mengxing Liu
Subject Re: [HACKERS] [GSOC][weekly report 3] Eliminate O(N^2) scalingfrom rw-conflict tracking in serializable transactions
Date
Msg-id 48acaab0.ff0b.15cc9c6158b.Coremail.liu-mx15@mails.tsinghua.edu.cn
Whole thread Raw
In response to Re: [HACKERS] [GSOC][weekly report 3] Eliminate O(N^2) scaling fromrw-conflict tracking in serializable transactions  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers
> From: "Heikki Linnakangas" <hlinnaka@iki.fi>
>
> Hmm. The hash table ought to speed up the RWConflictExists() function
> right? Where in the flame graph is RWConflictExists()? If it only
> accounts for a small amount of the overall runtime, even drastic speedup
> there won't make much difference to the total.
>

Yes. It is much smaller than we have expected. I did a small test  for HTAB and linked list:
when the set size is smaller than 10, linked list is as quick as HTAB. Here is the result.
(x microseconds running 10 million searching)

setSize: 5, LIST: 423569, HTAB: 523681
setSize: 10, LIST: 540579, HTAB: 430111
setSize: 15, LIST: 752879, HTAB: 429519
setSize: 20, LIST: 940792, HTAB: 431388
setSize: 25, LIST: 1163760, HTAB: 446043
setSize: 30, LIST: 1352990, HTAB: 429057
setSize: 35, LIST: 1524097, HTAB: 431547
setSize: 40, LIST: 1714976, HTAB: 429023

As we can see, the hash table can only benefit in a very extreme situation.
However, even if there are 100 concurrent connections, the average length of conflict
list is about 10. linked list is not the bottleneck.

>
> Nope, there is no such function, I'm afraid.
>

Oh, that's really a bad news.

> > In a previous email, I reported that many backends wait for the lock “SerializableFinishedListLock”;
> > If we don't implement functions like “ReleaseOneSerializableXact” quickly, they will be the bottleneck.
>
> Yeah, that's probably what's causing the decrease in throughput you're
> seeing.
>

An new evidence: I use "SELECT wait_event_type, wait_event FROM pg_stat_activity;" and sum by event_type to analyze the
waitevent. 

The result of original version:   SerializableXactHashLock 27   predicate_lock_manager 512   WALWriteLock 3
SerializableFinishedListLock402 

The result of new version:   SerializableXactHashLock 38   predicate_lock_manager 473   WALWriteLock 3
SerializableFinishedListLock1068 

Obviously, more backends are blocked by SerializableFinishedListLock.

>
> You might need to also keep the linked lists, and use the hash table to
> only look up particular items in the linked list faster.
>
> I was surprised to see that you're creating a lot of smallish hash
> tables, three for each PredXact entry. I would've expected all the
> PredXact entries to share the same hash tables, i.e. have only three
> hash tables in total. That would be more flexible in allocating
> resources among entries. (It won't help with the quick-release, though)
>

Yes, it would looks more elegant and require less memory resources.
( because hash table objects also consume memory )
But just for performance, it would be less efficient than my patch.
Because it has to maintain linked lists, besides hash tables.  In other words,
it does more works than my patch.

Another point is that removing linked list may have more opportunities to reduce
lock contentions. Otherwise, we need a global lock to protect the linked list.

--
Sincerely


Mengxing Liu











pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: [HACKERS] ASOF join
Next
From: Etsuro Fujita
Date:
Subject: [HACKERS] Comment typo in execMain.c