Re: First steps to being a contributer - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: First steps to being a contributer
Date	August 28, 2018 11:45:50
Msg-id	a7c0479a-3666-ea9d-c96d-2140bf7addfe@iki.fi
In response to	First steps to being a contributer (Daniel Wood <hexexpert@comcast.net>)
List	pgsql-hackers

On 28/08/18 03:39, Daniel Wood wrote:
> Having quit Amazon, where I was doing Postgres development, I've
> started looking at various things I might work on for fun. One
> thought is to start with something easy like the scalability of
> GetSnapshotData(). :-)

Cool! :-)

> I recently found it interesting to examine performance while running
> near 1 million pgbench selects per sec on a 48 core/96 HT Skylake
> box. I noticed that additional sessions trying to connect were timing
> out when they got stuck in ProcArrayAdd trying to get the
> ProcArrayLock in EXCLUSIVE mode. FYI, scale 10000 with 2048 clients.
> 
> The question is whether it is possible that the problem with
> GetSnapshotData() has reached a critical point, with respect to
> snapshot scaling, on the newest high end systems.

Yeah, GetSnapshotData() certainly becomes a bottleneck in certain workloads.

> What I'd like is a short cut to any of the current discussions of
> various ideas to improve snapshot scaling. I have some of my own
> ideas but want to review things before posting them.

The main solution we've been discussing on -hackers over the last few 
years is changing the way snapshots work, to use a Commit Sequence 
Number. If we assign each transaction a CSN, then a snapshot is just a 
single integer, and GetSnapshotData() just needs to read the current 
value of the CSN counter. CSNs have problems of their own, of course 
:-). If you search the archives for "CSN", you'll find several threads 
on that.
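
To make the CSN idea concrete, here is a toy sketch in Python. It is not 
PostgreSQL code: the class CsnLog and its methods are hypothetical names 
invented for illustration, and it ignores aborts, subtransactions and 
locking. It only shows why a snapshot can shrink to one integer: a 
transaction is visible if it committed with a CSN lower than the value 
the snapshot read from the counter.

```python
class CsnLog:
    """Toy model mapping transaction ids to commit sequence numbers."""

    def __init__(self):
        self.next_csn = 1      # global CSN counter
        self.commit_csn = {}   # xid -> CSN; an xid is absent while in progress

    def commit(self, xid):
        # Stamp the committing transaction with the next CSN.
        self.commit_csn[xid] = self.next_csn
        self.next_csn += 1

    def take_snapshot(self):
        # The whole snapshot is the current counter value.
        return self.next_csn

    def is_visible(self, xid, snapshot):
        # Visible only if committed before the snapshot was taken.
        csn = self.commit_csn.get(xid)
        return csn is not None and csn < snapshot


log = CsnLog()
log.commit(100)              # xid 100 commits before the snapshot
snap = log.take_snapshot()
log.commit(101)              # xid 101 commits after the snapshot
                             # xid 102 never commits (still in progress)
seen = [xid for xid in (100, 101, 102) if log.is_visible(xid, snap)]
```

In this model only xid 100 ends up in seen. The cost moves to the 
visibility check, which now needs a lookup from xid to CSN.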

Other less invasive ideas have also been thrown around. For example, 
when one backend acquires a snapshot, it could store a copy of that in 
shared memory. The next call to GetSnapshotData() could then just 
memcpy() the cached snapshot. Transaction commit would need to 
invalidate the cached copy. This helps if you have a lot of reads and few 
writes.
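
A minimal sketch of the caching idea, again as a toy Python model rather 
than PostgreSQL code. ToyProcArray, full_scans and the (xmax, running 
xids) snapshot shape are assumptions made for illustration, and all 
locking is left out. The point is that the expensive scan runs only when 
the cache is empty, and that only a commit has to clear it.

```python
class ToyProcArray:
    """Toy model of a shared snapshot cache invalidated on commit."""

    def __init__(self):
        self.next_xid = 1
        self.running = set()        # xids of in-progress transactions
        self.latest_completed = 0   # highest xid that has finished
        self.cached = None          # shared copy of the last snapshot
        self.full_scans = 0         # how often the slow path ran

    def begin(self):
        # A new xid is always >= the cached xmax, so the cached
        # snapshot already treats it as invisible: no invalidation.
        xid = self.next_xid
        self.next_xid += 1
        self.running.add(xid)
        return xid

    def commit(self, xid):
        self.running.remove(xid)
        self.latest_completed = max(self.latest_completed, xid)
        self.cached = None          # commit invalidates the cached copy

    def get_snapshot(self):
        if self.cached is None:
            # Slow path: scan all running transactions.
            self.full_scans += 1
            xmax = self.latest_completed + 1
            xip = frozenset(x for x in self.running if x < xmax)
            self.cached = (xmax, xip)
        # Fast path: hand out the cached copy.
        return self.cached


procs = ToyProcArray()
a = procs.begin()
b = procs.begin()
procs.commit(b)
s1 = procs.get_snapshot()   # slow path, fills the cache
s2 = procs.get_snapshot()   # served from the cache
procs.begin()               # starting a transaction keeps the cache
s3 = procs.get_snapshot()   # still served from the cache
procs.commit(a)             # clears the cache
s4 = procs.get_snapshot()   # slow path again
```

Here three of the four calls reuse one scan. With frequent commits the 
cache would be cleared before anyone could reuse it.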

- Heikki
