Re: pg_stat_*_columns? - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: pg_stat_*_columns?
Msg-id 5585936B.4080809@iki.fi
In response to Re: pg_stat_*_columns?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 06/20/2015 11:32 AM, Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> On Sat, Jun 20, 2015 at 10:55 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I dunno that tweaking the format would accomplish much.  Where I'd love
>>> to get to is to not have to write the data to disk at all (except at
>>> shutdown).  But that seems to require an adjustable-size shared memory
>>> block, and I'm not sure how to do that.  One idea, if the DSM stuff
>>> could be used, is to allow the stats collector to allocate multiple
>>> DSM blocks as needed --- but how well would that work on 32-bit
>>> machines?  I'd be worried about running out of address space.

Hmm. A backend already reads all the stats it needs to backend-private 
memory in one go. That consumes about as much address space as the DSM 
would, no? I guess it'd double the need, because you'd have mapped both 
the DSM and the backend-private memory at the same time.

>> I've considered both that and to perhaps use a shared memory message queue
>> to communicate. Basically, have a backend send a request when it needs a
>> snapshot of the stats data and get a copy back through that method instead
>> of disk. It would be much easier if we didn't actually take a snapshot of
>> the data per transaction, but we really don't want to give that up (if we
>> didn't care about that, we could just have a protocol asking for individual
>> values).
>
> Yeah, that might work quite nicely, and it would not require nearly as
> much surgery on the existing code as mapping the stuff into
> constrained-size shmem blocks would do.  The point about needing a data
> snapshot is a good one as well; I'm not sure how we'd preserve that
> behavior if backends are accessing the collector's data structures
> directly through shmem.

Usually you use a lock for such things ;-). Acquire lock, copy to 
private memory, unlock. If that's too slow, have two copies of the 
structure in shared memory, one that's being updated, and one that's 
being read, and swap them periodically. Or something like that - this 
doesn't seem particularly hard.

> I wonder if we should think about replacing the IP-socket-based data
> transmission protocol with a shared memory queue, as well.

One problem is that the archiver process is not connected to shared 
memory, but calls pgstat_send_archiver() to update the stats whenever it 
has archived a file.

If we nevertheless replace the files with dynamic shared memory, and 
switch to using shared memory queues for communication, ISTM we might as 
well have all the backends update the shared memory directly, and get 
rid of the stats collector process altogether. If we didn't already have 
a stats collector, that would certainly seem the more straightforward design.

> Let's do the simple thing first, else maybe nothing will happen at all.

Yeah, there's that...

- Heikki
