[HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflicttracking in serializable transactions - Mailing list pgsql-hackers
From | Mengxing Liu |
---|---|
Subject | [HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflicttracking in serializable transactions |
Date | |
Msg-id | 1ec4e20c.1d454.15ac062fdde.Coremail.liu-mx15@mails.tsinghua.edu.cn Whole thread Raw |
In response to | [HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking inserializable transactions (Kevin Grittner <kgrittn@gmail.com>) |
Responses |
[HACKERS] Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict trackingin serializable transactions
|
List | pgsql-hackers |
> -----Original Messages----- > From: "Kevin Grittner" <kgrittn@gmail.com> > Sent Time: 2017-03-12 04:24:29 (Sunday) > To: "Mengxing Liu" <liu-mx15@mails.tsinghua.edu.cn> > Cc: "pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org> > Subject: Re: [GSOC 17] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions > > On Fri, Mar 10, 2017 at 9:12 PM, Mengxing Liu > <liu-mx15@mails.tsinghua.edu.cn> wrote: > > > My name is Mengxing Liu. I am interested in the project "Eliminate > > O(N^2) scaling from rw-conflict tracking in serializable > > transactions”. After discussing with Kevin off-list, I think it's > > time to post discussion here. I am afraid that there are two things > > that I need your help. Thank you very much. > > Welcome to the -hackers list! This is a key part of how development > happens in the community. Don't be shy about posting questions and > ideas, but also expect fairly blunt discussion to occur at times; > don't let that put you off. > Thanks, Kevin. I will keep that in mind. > >>> So the task is clear. We can use a tree-like or hash-like data > >>> structure to speed up this function. > >> > >> Right -- especially with a large number of connections holding a > >> large number of conflicts. In one paper with high concurrency, they > >> found over 50% of the CPU time for PostgreSQL was going to these > >> functions (including functions called by them). This seems to me to > >> be due to the O(N^2) (or possibly worse) performance from the number > >> of connections. > > > > Anyone knows the title of this paper? I want to reproduce its > > workloads. > > I seem to remember there being a couple other papers or talks, but > this is probably the most informative: > > http://sydney.edu.au/engineering/it/research/tr/tr693.pdf > Thanks, It's exactly what I need. > >> Remember, I think most of the work here is going to be in > >> benchmarking. We not only need to show improvements in simple test > >> cases using readily available tools like pgbench, but think about > >> what types of cases might be worst for the approach taken and show > >> that it still does well -- or at least not horribly. It can be OK > >> to have some slight regression in an unusual case if the common > >> cases improve a lot, but any large regression needs to be addressed > >> before the patch can be accepted. There are some community members > >> who are truly diabolical in their ability to devise "worst case" > >> tests, and we don't want to be blind-sided by a bad result from one > >> of them late in the process. > >> > > > > Are there any documents or links introducing how to test and > > benchmark PostgreSQL? I may need some time to learn about it. > > There is pgbench: > > https://www.postgresql.org/docs/devel/static/pgbench.html > > A read-only load and a read/write mix should both be tested, > probably with a few different scales and client counts to force the > bottleneck to be in different places. The worst problems have been > seen with 32 or more cores on 4 or more sockets with a large number > of active connections. I don't know whether you have access to a > machine capable of putting this kind of stress on it (perhaps at > your university?), but if not, the community has access to various > resources we should be able to schedule time on. > There is a NUMA machine ( 120 cores, 8 sockets) in my lab. I think it's enough to put this kind of stress. Anyway, I would ask for help if necessary. > It may pay for you to search through the archives of the last year > or two to look for other benchmarks and see what people have > previously done. > I would try. -- Mengxing Liu
pgsql-hackers by date: