Re: pg_stat_*_columns? - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: pg_stat_*_columns?
Date
Msg-id CABUevEwsv2ZaUhEo3R748CLfaGW0KS93-Wcioxz5avXW8xvTuw@mail.gmail.com
In response to Re: pg_stat_*_columns?  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Tue, Jun 23, 2015 at 3:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Sun, Jun 21, 2015 at 11:43 AM, Magnus Hagander <magnus@hagander.net> wrote:
> On Sat, Jun 20, 2015 at 11:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Sat, Jun 20, 2015 at 7:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> >> But if the structure
>> >> got too big to map (on a 32-bit system), then you'd be sort of hosed,
>> >> because there's no way to attach just part of it.  That might not be
>> >> worth worrying about, but it depends on how big it's likely to get - a
>> >> 32-bit system is very likely to choke on a 1GB mapping, and maybe even
>> >> on a much smaller one.
>> >
>> > Yeah, I'm quite worried about assuming that we can map a data structure
>> > that might be of very significant size into shared memory on 32-bit
>> > machines.  The address space just isn't there.
>>
>> Considering the advantages of avoiding message queues, I think we
>> should think a little bit harder about whether we can't find some way
>> to skin this cat.  As I think about this a little more, I'm not sure
>> there's really a problem with one stats DSM per database.  Sure, the
>> system might have 100,000 databases in some crazy pathological case,
>> but the maximum number of those that can be in use is bounded by
>> max_connections, which means the maximum number of stats file DSMs we
>> could ever need at one time is also bounded by max_connections.  There
>> are a few corner cases to think about, like if the user writes a
>> client that connects to all 100,000 databases in very quick
>> succession, we've got to jettison the old DSMs fast enough to make
>> room for the new DSMs before we run out of slots, but that doesn't
>> seem like a particularly tough nut to crack.  If the stats collector
>> ensures that it never attaches to more than MaxBackends stats DSMs at
>> a time, and each backend ensures that it never attaches to more than
>> one stats DSM at a time, then 2 * MaxBackends stats DSMs is always
>> enough.  And that's just a matter of bumping
>> PG_DYNSHMEM_SLOTS_PER_BACKEND from 2 to 4.
>>
>> In more realistic cases, it will probably be normal for many or all
>> backends to be connected to the same database, and the number of stats
>> DSMs required will be far smaller.
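(The "jettison old DSMs to make room" bookkeeping Robert describes can be modelled as a small LRU registry. This is a hypothetical standalone sketch, not PostgreSQL code; the slot count, names, and eviction policy are invented for illustration.)

```c
#include <assert.h>

/* Hypothetical sketch: a tiny fixed-size registry of per-database stats
 * DSM "slots", keyed by database OID.  When every slot is busy, the
 * least-recently-used entry is jettisoned to make room, modelling the
 * idea that at most ~2 * MaxBackends snapshots need to be live at once. */

#define NSLOTS 4                 /* stands in for 2 * MaxBackends */

typedef struct
{
    unsigned    dboid;           /* 0 = free slot */
    unsigned    last_used;       /* logical clock, for LRU eviction */
} StatsSlot;

static StatsSlot slots[NSLOTS];
static unsigned clock_tick;

/* Attach to (or create) the stats slot for a database, evicting the
 * least-recently-used entry if no free slot remains.  Returns the index. */
int
stats_slot_attach(unsigned dboid)
{
    int         free_idx = -1;
    int         lru_idx = 0;

    clock_tick++;
    for (int i = 0; i < NSLOTS; i++)
    {
        if (slots[i].dboid == dboid)
        {
            /* already mapped: just refresh and reuse it */
            slots[i].last_used = clock_tick;
            return i;
        }
        if (slots[i].dboid == 0 && free_idx < 0)
            free_idx = i;
        if (slots[i].last_used < slots[lru_idx].last_used)
            lru_idx = i;
    }
    if (free_idx < 0)
        free_idx = lru_idx;      /* jettison the oldest DSM */
    slots[free_idx].dboid = dboid;
    slots[free_idx].last_used = clock_tick;
    return free_idx;
}
```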
>
> What about a combination along the lines of the following: the stats
> collector keeps the statistics in local memory as before. But when a backend
> needs to get a snapshot of its data, it uses a shared memory queue to request
> it. What the stats collector does in this case is allocate a new DSM, copy
> the data into that DSM, and hand the DSM over to the backend. At that point
> the stats collector can forget about it, and it's up to the backend to get
> rid of it when it's done with it.
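(The ownership handoff in that proposal can be sketched in miniature. This is not PostgreSQL code: plain `malloc`/`free` stand in for `dsm_create`/`dsm_detach`, the struct and function names are invented, and the shared memory queue is elided.)

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: the collector keeps stats in private memory; on
 * request it copies the current state into a freshly allocated buffer
 * (standing in for a DSM segment) and hands ownership to the backend,
 * forgetting about the segment immediately.  The backend releases it
 * when done with its snapshot. */

typedef struct
{
    long        xact_commit;
    long        tup_inserted;
} DbStats;

static DbStats collector_stats;  /* the collector's private copy */

/* "Collector side": build a snapshot segment and hand it over. */
DbStats *
stats_request_snapshot(void)
{
    DbStats    *seg = malloc(sizeof(DbStats));  /* stands in for dsm_create() */

    memcpy(seg, &collector_stats, sizeof(DbStats));
    return seg;                  /* collector forgets about it here */
}

/* "Backend side": release the snapshot once done. */
void
stats_release_snapshot(DbStats *seg)
{
    free(seg);                   /* stands in for dsm_detach() */
}
```

Because the segment is a point-in-time copy, later updates by the collector don't perturb a backend's snapshot, which is the property the proposal relies on.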

Well, there seems to be little point in having the stats collector
forget about a DSM that it could equally well have shared with the
next guy who wants a stats snapshot for the same database.  That case
is surely *plenty* common enough to be worth optimizing for.


Right, we only need to drop it once we have received a stats message for it, meaning something has actually changed. And possibly combine that with a minimum time as well, as we have now, if we want to limit the potential churn.
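(That drop policy boils down to a two-condition check. A hypothetical sketch, with invented names and an invented constant playing the role PGSTAT_STAT_INTERVAL plays today:)

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sketch of the policy above: rebuild a cached snapshot
 * segment only if (a) a stats message has arrived since it was built,
 * i.e. something actually changed, and (b) the snapshot is at least
 * MIN_SNAPSHOT_AGE old, to limit churn.  Times are in milliseconds. */

#define MIN_SNAPSHOT_AGE 500     /* invented; analogous to PGSTAT_STAT_INTERVAL */

bool
snapshot_needs_rebuild(long built_at, long last_change, long now)
{
    if (last_change <= built_at)
        return false;            /* nothing changed: keep sharing it */
    return (now - built_at) >= MIN_SNAPSHOT_AGE;    /* limit churn */
}
```

Until both conditions hold, the collector can keep handing the same segment to every backend that asks for that database.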

--
