Thread: Proposal: roll pg_stat_statements into core

Proposal: roll pg_stat_statements into core

From

David Fetter

Date:

01 September 2019, 21:00:26

Folks,

I'd like to $Subject, on by default, with a switch to turn it off for
those really at the outer edges of performance. Some reasons include:

- It's broadly useful.
- Right now, the barrier for turning it on is quite high. In addition
  to being non-core, which is already a pretty high barrier at a lot
  of organizations, it requires a shared_preload_libraries setting,
  which is pretty close to untenable in a lot of use cases.
- The overhead for most use cases is low compared to the benefit.

Before I go do the patch, I'd like to see whether there's anything
like a consensus as to the what and how of doing this.

What say?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Proposal: roll pg_stat_statements into core

From

Pavel Stehule

Date:

01 September 2019, 21:12:15

ne 1. 9. 2019 v 20:00 odesílatel David Fetter <david@fetter.org> napsal:

Folks,

I'd like to $Subject, on by default, with a switch to turn it off for
those really at the outer edges of performance. Some reasons include:

- It's broadly useful.
- Right now, the barrier for turning it on is quite high. In addition
to being non-core, which is already a pretty high barrier at a lot
of organizations, it requires a shared_preload_libraries setting,
which is pretty close to untenable in a lot of use cases.
- The overhead for most use cases is low compared to the benefit.

I have not a strong opinion about it. pg_stat_statements is really useful extenstion, on second hand

1. the API is not stabilized yet - there are some patches related to this extension if I remember correctly

2. there are not any numbers what is a overhead

Maybe better solution can be some new API for shared memory, that doesn't need to use shared_preload_library.

It can be useful for lot of other monitoring extensions, profilers, debuggers,

Regards

Pavel

Before I go do the patch, I'd like to see whether there's anything
like a consensus as to the what and how of doing this.

What say?

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Proposal: roll pg_stat_statements into core

From

David Fetter

Date:

01 September 2019, 21:48:29

On Sun, Sep 01, 2019 at 08:12:15PM +0200, Pavel Stehule wrote:
> Hi
> 
> ne 1. 9. 2019 v 20:00 odesílatel David Fetter <david@fetter.org> napsal:
> 
> > Folks,
> >
> > I'd like to $Subject, on by default, with a switch to turn it off for
> > those really at the outer edges of performance. Some reasons include:
> >
> > - It's broadly useful.
> > - Right now, the barrier for turning it on is quite high. In addition
> >   to being non-core, which is already a pretty high barrier at a lot
> >   of organizations, it requires a shared_preload_libraries setting,
> >   which is pretty close to untenable in a lot of use cases.
> > - The overhead for most use cases is low compared to the benefit.
> >
> 
> I have not a strong opinion about it. pg_stat_statements is really useful
> extenstion, on second hand
> 
> 1. the API is not stabilized yet - there are some patches related to this
> extension if I remember correctly

You do.

> 2. there are not any numbers what is a overhead

What numbers would you suggest collecting?  We could get some by
running them with the extension loaded and not, although this goes to
your next proposal.

> Maybe better solution can be some new API for shared memory, that doesn't
> need to use shared_preload_library.

What would such an API look like?

> It can be useful for lot of other monitoring extensions, profilers,
> debuggers,

It would indeed.

Do you see this new API as a separate project, and if so, of
approximately what size?  Are we talking about something about the
size of DSM? Of JIT?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Proposal: roll pg_stat_statements into core

From

Tom Lane

Date:

01 September 2019, 21:54:34

David Fetter <david@fetter.org> writes:
> - It's broadly useful.

Maybe.  Whether it can be argued that it's so broadly useful as
to justify being on-by-default is not clear.

> - Right now, the barrier for turning it on is quite high. In addition
>   to being non-core, which is already a pretty high barrier at a lot
>   of organizations, it requires a shared_preload_libraries setting,
>   which is pretty close to untenable in a lot of use cases.

That argument seems pretty weak.  It's part of contrib and therefore
maintained by the same people as the "core" code.  Also, I don't buy
for a minute that people who would need it don't also need a bunch of
other changes in default GUC settings (shared_buffers etc).

> - The overhead for most use cases is low compared to the benefit.

Please do not expect that we're going to accept such assertions
unsupported by evidence.  (As a very quick-n-dirty test, I tried
"pgbench -S" and got somewhere around 4% TPS degradation with
pg_stat_statements loaded.  That doesn't seem negligible.)

I think also that we would need to consider the migration penalty
for people who already have the contrib version installed.  To
judge by previous cases (I'm thinking tsearch2) that could be
pretty painful.  Admittedly, tsearch2 might not be a good comparison,
but points as simple as "the views/functions won't be in the same
schema as before" are enough to cause trouble.

            regards, tom lane

Re: Proposal: roll pg_stat_statements into core

From

Pavel Stehule

Date:

01 September 2019, 21:58:38

ne 1. 9. 2019 v 20:48 odesílatel David Fetter <david@fetter.org> napsal:

On Sun, Sep 01, 2019 at 08:12:15PM +0200, Pavel Stehule wrote:
> Hi
>
> ne 1. 9. 2019 v 20:00 odesílatel David Fetter <david@fetter.org> napsal:
>
> > Folks,
> >
> > I'd like to $Subject, on by default, with a switch to turn it off for
> > those really at the outer edges of performance. Some reasons include:
> >
> > - It's broadly useful.
> > - Right now, the barrier for turning it on is quite high. In addition
> > to being non-core, which is already a pretty high barrier at a lot
> > of organizations, it requires a shared_preload_libraries setting,
> > which is pretty close to untenable in a lot of use cases.
> > - The overhead for most use cases is low compared to the benefit.
> >
>
> I have not a strong opinion about it. pg_stat_statements is really useful
> extenstion, on second hand
>
> 1. the API is not stabilized yet - there are some patches related to this
> extension if I remember correctly

You do.

> 2. there are not any numbers what is a overhead

What numbers would you suggest collecting? We could get some by
running them with the extension loaded and not, although this goes to
your next proposal.

I would to see some benchmarks (pg_bench - readonly, higher number of connects)

> Maybe better solution can be some new API for shared memory, that doesn't
> need to use shared_preload_library.

What would such an API look like?

possibility to allocate shared large blocks of shared memory without necessity to do it at startup time

> It can be useful for lot of other monitoring extensions, profilers,
> debuggers,

It would indeed.

Do you see this new API as a separate project, and if so, of
approximately what size? Are we talking about something about the
size of DSM? Of JIT?

Probably about DSM, and about API how to other processes can connect to some blocks of DSM. I rember similar request related to shared fulltext dictionaries

It can be part of some project "enhancing pg_stat_statement - be possible load this extension without restart"

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Proposal: roll pg_stat_statements into core

From

Adrien Nayrat

Date:

02 September 2019, 11:11:28

On 9/1/19 8:54 PM, Tom Lane wrote:
>> - The overhead for most use cases is low compared to the benefit.
> Please do not expect that we're going to accept such assertions
> unsupported by evidence.  (As a very quick-n-dirty test, I tried
> "pgbench -S" and got somewhere around 4% TPS degradation with
> pg_stat_statements loaded.  That doesn't seem negligible.)

AFAIR Andres pointed overhead could be much more when you have more queries than
pg_stat_statements.max [1]. Eviction can be costly.

1: https://twitter.com/AndresFreundTec/status/1105585237772263424

Attachment

signature.asc

Re: Proposal: roll pg_stat_statements into core

From

Euler Taveira

Date:

02 September 2019, 16:31:20

Em seg, 2 de set de 2019 às 05:11, Adrien Nayrat
<adrien.nayrat@anayrat.info> escreveu:
>
> On 9/1/19 8:54 PM, Tom Lane wrote:
> >> - The overhead for most use cases is low compared to the benefit.
> > Please do not expect that we're going to accept such assertions
> > unsupported by evidence.  (As a very quick-n-dirty test, I tried
> > "pgbench -S" and got somewhere around 4% TPS degradation with
> > pg_stat_statements loaded.  That doesn't seem negligible.)
>
> AFAIR Andres pointed overhead could be much more when you have more queries than
> pg_stat_statements.max [1]. Eviction can be costly.
>
pg_stat_statements is important in a lot of query analysis. If you
make a comparison between pg_stat_statements and pgbadger, you can
capture queries for the latter changing log_min_duration_statement or
log_statement without restart the server, however, the former you
can't without restarting the server. At least if pg_stat_statements
was in core you could load it by default and have a GUC to turn it
on/off without restarting the server (that was Magnus proposal and
Andres agreed). I support this idea. In v10 we changed some GUCs to
perform replication out-of-box; we should do the same for query
analysis.

Regards,


--
   Euler Taveira                                   Timbira -
http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

Re: Proposal: roll pg_stat_statements into core

From

Tom Lane

Date:

02 September 2019, 19:07:17

Euler Taveira <euler@timbira.com.br> writes:
> At least if pg_stat_statements
> was in core you could load it by default and have a GUC to turn it
> on/off without restarting the server (that was Magnus proposal and
> Andres agreed).

That assertion is 100% bogus.  To turn it on or off on-the-fly,
you'd need some way to acquire or release its shared memory
on-the-fly.

It's probably now possible to do something like that, using the
DSM mechanisms, but the code for it hasn't been written (AFAIK).
And it wouldn't have much to do with whether the module was
in core or stayed where it is.

Another concern that I have about moving pg_stat_statements
into core is that doing so would effectively nail down
Query.queryId as belonging to pg_stat_statements, whereas
currently it's possible for other plugins to commandeer that
if they wish.  This isn't academic, because of the fact that
not everybody is satisfied with the way pg_stat_statements
defines queryId [1].

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/1553029215728-0.post%40n3.nabble.com

Re: Proposal: roll pg_stat_statements into core

From

Michael Paquier

Date:

03 September 2019, 09:37:27

grOn Mon, Sep 02, 2019 at 12:07:17PM -0400, Tom Lane wrote:
> Euler Taveira <euler@timbira.com.br> writes:
>> At least if pg_stat_statements
>> was in core you could load it by default and have a GUC to turn it
>> on/off without restarting the server (that was Magnus proposal and
>> Andres agreed).
>
> That assertion is 100% bogus.  To turn it on or off on-the-fly,
> you'd need some way to acquire or release its shared memory
> on-the-fly.
>
> It's probably now possible to do something like that, using the
> DSM mechanisms, but the code for it hasn't been written (AFAIK).
> And it wouldn't have much to do with whether the module was
> in core or stayed where it is.

If we were to actually merge the module into core and switch to DSM
instead of the current fixed amout of shared memory defined at start
time, then that would be a two-step process: first push the functions
into code with a GUC_POSTMASTER as currently done, and secondly
attempt to switch the GUC to be reloadable.

FWIW, I am not sure that we should have the module into core.
--
Michael

Attachment

signature.asc

Re: Proposal: roll pg_stat_statements into core

From

Ibrar Ahmed

Date:

03 September 2019, 20:10:56

On Tue, Sep 3, 2019 at 11:37 AM Michael Paquier <michael@paquier.xyz> wrote:

grOn Mon, Sep 02, 2019 at 12:07:17PM -0400, Tom Lane wrote:
> Euler Taveira <euler@timbira.com.br> writes:
>> At least if pg_stat_statements
>> was in core you could load it by default and have a GUC to turn it
>> on/off without restarting the server (that was Magnus proposal and
>> Andres agreed).
>
> That assertion is 100% bogus. To turn it on or off on-the-fly,
> you'd need some way to acquire or release its shared memory
> on-the-fly.
>
> It's probably now possible to do something like that, using the
> DSM mechanisms, but the code for it hasn't been written (AFAIK).
> And it wouldn't have much to do with whether the module was
> in core or stayed where it is.

If we were to actually merge the module into core and switch to DSM
instead of the current fixed amout of shared memory defined at start
time, then that would be a two-step process: first push the functions
into code with a GUC_POSTMASTER as currently done, and secondly
attempt to switch the GUC to be reloadable.

FWIW, I am not sure that we should have the module into core.
--
Michael

- It's broadly useful.
No doubt it is very useful and most of the customer is using that.

- Right now, the barrier for turning it on is quite high. In addition
to being non-core, which is already a pretty high barrier at a lot of organizations,it requires a shared_preload_libraries setting,
which is pretty close to untenable in a lot of use cases.

We are thinking to move a module in core just because of

"barrier for turning it on is quite high" which is not a very compelling reason. I am just thinking

why not have a system which makes that easy instead of adding to core.

Ibrar Ahmed

Re: Proposal: roll pg_stat_statements into core

From

Stephen Frost

Date:

03 September 2019, 22:56:28

Greetings,

* David Fetter (david@fetter.org) wrote:
> I'd like to $Subject, on by default, with a switch to turn it off for
> those really at the outer edges of performance. Some reasons include:

Sure, half of contrib should really be in core (amcheck, file_fdw,
postgres_fdw, maybe dblink, pageinspect, pg_buffercache,
pg_freespacemap, pgstattuple, pg_visibility, sslinfo, maybe pgtrgm..)
but we simply haven't got great facilities for either migrating those
things into core (particularly during an upgrade..) or making them
available directly in a way that isn't intrusive (seems like maybe we
should have an independent schema from pg_catalog that's for things like
this...  utility functions and views over them which are particularly
useful; harkins back to the ancient discussion about a new pg_catalog
structure...  pg new sysviews, or something along those lines?).

Figure out how to fix those issues and maybe there's something
interesting to discuss here, until then, a thread like this is likely to
be unproductive.  A direct patch that just shoves pg_stat_statements
into core isn't going to cut it.

Thanks,

Stephen

Attachment

signature.asc

Re: Proposal: roll pg_stat_statements into core

From

David Fetter

Date:

04 September 2019, 02:38:40

On Tue, Sep 03, 2019 at 03:56:28PM -0400, Stephen Frost wrote:
> Greetings,
> 
> * David Fetter (david@fetter.org) wrote:
> > I'd like to $Subject, on by default, with a switch to turn it off for
> > those really at the outer edges of performance. Some reasons include:
> 
> Sure, half of contrib should really be in core (amcheck, file_fdw,
> postgres_fdw, maybe dblink, pageinspect, pg_buffercache,
> pg_freespacemap, pgstattuple, pg_visibility, sslinfo, maybe pgtrgm..)

Agreed.

> but we simply haven't got great facilities for either migrating those
> things into core (particularly during an upgrade..)

So a process that makes transitioning from extension to core in a(n at
least largely) mechanical way sounds like a Useful Thing™.

Would that be worth a separate thread once this CF is over?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Proposal: roll pg_stat_statements into core

From

Stephen Frost

Date:

04 September 2019, 02:41:00

Greetings,

On Tue, Sep 3, 2019 at 19:38 David Fetter <david@fetter.org> wrote:

On Tue, Sep 03, 2019 at 03:56:28PM -0400, Stephen Frost wrote:
> Greetings,
>
> * David Fetter (david@fetter.org) wrote:
> > I'd like to $Subject, on by default, with a switch to turn it off for
> > those really at the outer edges of performance. Some reasons include:
>
> Sure, half of contrib should really be in core (amcheck, file_fdw,
> postgres_fdw, maybe dblink, pageinspect, pg_buffercache,
> pg_freespacemap, pgstattuple, pg_visibility, sslinfo, maybe pgtrgm..)

Agreed.

> but we simply haven't got great facilities for either migrating those
> things into core (particularly during an upgrade..)

So a process that makes transitioning from extension to core in a(n at
least largely) mechanical way sounds like a Useful Thing™.

Would that be worth a separate thread once this CF is over?

A well considered proposal might be interesting. A “call to arms” asking someone to create same likely wouldn’t be.

Thanks,

Stephen

Re: Proposal: roll pg_stat_statements into core

From

David Fetter

Date:

04 September 2019, 02:44:46

On Tue, Sep 03, 2019 at 07:41:00PM -0400, Stephen Frost wrote:
> Greetings,
> 
> On Tue, Sep 3, 2019 at 19:38 David Fetter <david@fetter.org> wrote:
> 
> > On Tue, Sep 03, 2019 at 03:56:28PM -0400, Stephen Frost wrote:
> > > Greetings,
> > >
> > > * David Fetter (david@fetter.org) wrote:
> > > > I'd like to $Subject, on by default, with a switch to turn it off for
> > > > those really at the outer edges of performance. Some reasons include:
> > >
> > > Sure, half of contrib should really be in core (amcheck, file_fdw,
> > > postgres_fdw, maybe dblink, pageinspect, pg_buffercache,
> > > pg_freespacemap, pgstattuple, pg_visibility, sslinfo, maybe pgtrgm..)
> >
> > Agreed.
> >
> > > but we simply haven't got great facilities for either migrating those
> > > things into core (particularly during an upgrade..)
> >
> > So a process that makes transitioning from extension to core in a(n at
> > least largely) mechanical way sounds like a Useful Thing™.
> >
> > Would that be worth a separate thread once this CF is over?
> 
> A well considered proposal might be interesting. A “call to arms” asking
> someone to create same likely wouldn’t be.

I was thinking of a PoC (or, ideally, better) for such a system. The
current CF having already started, after it's over seems like a better
time to do it.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Proposal: roll pg_stat_statements into core

From

Alvaro Herrera

Date:

04 September 2019, 04:38:58

On 2019-Sep-03, Stephen Frost wrote:

> Greetings,
> 
> * David Fetter (david@fetter.org) wrote:
> > I'd like to $Subject, on by default, with a switch to turn it off for
> > those really at the outer edges of performance. Some reasons include:
> 
> Sure, half of contrib should really be in core (amcheck, file_fdw,
> postgres_fdw, maybe dblink, pageinspect, pg_buffercache,
> pg_freespacemap, pgstattuple, pg_visibility, sslinfo, maybe pgtrgm..)
> but we simply haven't got great facilities for either migrating those
> things into core (particularly during an upgrade..)

I think there is a false dichotomy here.  Migrating an extension out of
contrib doesn't have to equate making it no longer an extension.  We
could, instead, keep it being an extension, but move it out of contrib
and into (say) src/extensions/ so that it becomes installed and
available by default, but still an extension.  Users won't have to
install a separate contrib package, but they would still have to run
CREATE EXTENSION.

We don't incur the upgrade pain if we do that, for one thing.  Also, we
don't force everybody to have it; and we don't have to invent this
hypothetical switch to turn it off.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Proposal: roll pg_stat_statements into core

From

Stephen Frost

Date:

04 September 2019, 05:11:34

Greetings,

* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> On 2019-Sep-03, Stephen Frost wrote:
> > * David Fetter (david@fetter.org) wrote:
> > > I'd like to $Subject, on by default, with a switch to turn it off for
> > > those really at the outer edges of performance. Some reasons include:
> >
> > Sure, half of contrib should really be in core (amcheck, file_fdw,
> > postgres_fdw, maybe dblink, pageinspect, pg_buffercache,
> > pg_freespacemap, pgstattuple, pg_visibility, sslinfo, maybe pgtrgm..)
> > but we simply haven't got great facilities for either migrating those
> > things into core (particularly during an upgrade..)
>
> I think there is a false dichotomy here.  Migrating an extension out of
> contrib doesn't have to equate making it no longer an extension.  We
> could, instead, keep it being an extension, but move it out of contrib
> and into (say) src/extensions/ so that it becomes installed and
> available by default, but still an extension.  Users won't have to
> install a separate contrib package, but they would still have to run
> CREATE EXTENSION.
>
> We don't incur the upgrade pain if we do that, for one thing.  Also, we
> don't force everybody to have it; and we don't have to invent this
> hypothetical switch to turn it off.

I don't agree that it's a false dichotomy- from a user experience, you
aren't really changing anything with this approach and the entire point
is that most of these things should just be available in core.  Yes,
maybe we need a way to turn on/off things like pg_stat_statements but
that should have been a runtime "track_stat_statements" or some such,
similar to other things like "track_io_timing", not an "independent"
extension that is actually developed, managed, and released just as core
is.

I also don't buy off, not in the *least*, that we can willy-nilly change
things that pg_stat_statements depends on, or the interface of
pg_stat_statements itself, any more than we can change catalog tables
(which is to say- we can, and do, and people have to deal with it, but
we do get to listen to complaints about it and at times consider if a
change is worth it or not to make).  Having it as an extension doesn't
really change any of that for us, imv.

These aren't external extensions and "contrib" hasn't been what it was
originally designed for in years (see pg_audit if you'd like a
relatively recent example of how contrib isn't really for things that
"address a limited audience or are too experimental to be part of the
main source tree" anymore, despite our README in that directory claiming
otherwise).

I wouldn't be against a proposal that moved contrib into a 'src/modules'
or some such just for maintenance but if we aren't installing the ones
that I've suggested above by (and probably others) by default (and
probably sending the others to the bitbucket because they really aren't
core-ready) then there's really not much point in doing anything here.

The one *good* thing about having extensions is really something that we
*should* have a way to do in core- and that's the upgrade path and
the ability to run an upgrade script sanely.  Maybe we can't get there
for all of our catalogs, but it'd sure be nice if we could figure out a
way to do it for some things so that we break at least a few less things
during upgrades.

Thanks,

Stephen

Attachment

signature.asc

Re: Proposal: roll pg_stat_statements into core

From

Tom Lane

Date:

04 September 2019, 07:14:50

Stephen Frost <sfrost@snowman.net> writes:
> * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
>> I think there is a false dichotomy here.  Migrating an extension out of
>> contrib doesn't have to equate making it no longer an extension.  We
>> could, instead, keep it being an extension, but move it out of contrib
>> and into (say) src/extensions/ so that it becomes installed and
>> available by default, but still an extension.  Users won't have to
>> install a separate contrib package, but they would still have to run
>> CREATE EXTENSION.

> I don't agree that it's a false dichotomy- from a user experience, you
> aren't really changing anything with this approach and the entire point
> is that most of these things should just be available in core.

I don't think that auto-installing such things requires changing much
of anything.  See plpgsql, which is auto-installed now but still sits
in src/pl/ alongside other PLs that are not auto-installed.  Similarly,
there's nothing really stopping us from choosing to install some
module of contrib by default; rearranging the source tree is not a
prerequisite to that.

The situation with pg_stat_statements is more than that, since what
David is requesting is not merely that we auto-install that extension
but that we automatically push it into shared_preload_libraries.

> ...maybe we need a way to turn on/off things like pg_stat_statements but
> that should have been a runtime "track_stat_statements" or some such,
> similar to other things like "track_io_timing", not an "independent"
> extension that is actually developed, managed, and released just as core
> is.

A key point that hasn't been highlighted in this discussion is that having
pg_stat_statements as an extension is proof-of-concept that such features
*can* be implemented outside of core.  Don't you think that there are
probably people maintaining private variants of that extension, who would
be really sad if we removed or broke APIs they need for it once
pg_stat_statements is part of core?

            regards, tom lane

Re: Proposal: roll pg_stat_statements into core

From

Stephen Frost

Date:

04 September 2019, 08:25:25

Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> >> I think there is a false dichotomy here.  Migrating an extension out of
> >> contrib doesn't have to equate making it no longer an extension.  We
> >> could, instead, keep it being an extension, but move it out of contrib
> >> and into (say) src/extensions/ so that it becomes installed and
> >> available by default, but still an extension.  Users won't have to
> >> install a separate contrib package, but they would still have to run
> >> CREATE EXTENSION.
>
> > I don't agree that it's a false dichotomy- from a user experience, you
> > aren't really changing anything with this approach and the entire point
> > is that most of these things should just be available in core.
>
> I don't think that auto-installing such things requires changing much
> of anything.  See plpgsql, which is auto-installed now but still sits
> in src/pl/ alongside other PLs that are not auto-installed.  Similarly,
> there's nothing really stopping us from choosing to install some
> module of contrib by default; rearranging the source tree is not a
> prerequisite to that.

Sure, I agree with you about all of that- I was just offering it as an
option that if folks felt that rearranging the tree for these
auto-installed things makes sense then, sure, we can do that.

> The situation with pg_stat_statements is more than that, since what
> David is requesting is not merely that we auto-install that extension
> but that we automatically push it into shared_preload_libraries.

Not sure that it'd actually operate in exactly that way, but yes, when
enabled we'd allocate memory the same as if it was in
shared_preload_libraries and such.

> > ...maybe we need a way to turn on/off things like pg_stat_statements but
> > that should have been a runtime "track_stat_statements" or some such,
> > similar to other things like "track_io_timing", not an "independent"
> > extension that is actually developed, managed, and released just as core
> > is.
>
> A key point that hasn't been highlighted in this discussion is that having
> pg_stat_statements as an extension is proof-of-concept that such features
> *can* be implemented outside of core.  Don't you think that there are
> probably people maintaining private variants of that extension, who would
> be really sad if we removed or broke APIs they need for it once
> pg_stat_statements is part of core?

Do I feel that out-of-core private extensions are more valuable than the
run-of-the-mill user who just wants to be able to see what queries are
taking time in their running system?  Perhaps it's a but cut-throat,
but, uh, no, not in the least.  I, at least, don't hack on PostgreSQL
because it's a fantastic platform for closed source/proprietary/third
party solutions to be built upon but rather because it's damn good
open source software that I enjoy working with.  That's part of what
frustrates me about contrib- it was once thing X and it's now quite
clearly thing Y but we are refusing to call it that, and that's just not
how I've become accustomed to this open source project working.  Let's
call a spade a spade.

That said, I do understand the point that integrating these things into
core *might* lead us down a road where we break extension APIs in ways
that maybe we wouldn't have if they were kept as extensions, but really,
anyone who runs into that issue who really cares to be heard should be
as vocal and active as the PostGIS folks are- we hear from them
regularly on our lists and that's how it should be.  I'm not going to be
convinced of the silent-majority argument in this case, given the pain
it causes a great many of our regular users.

... we also have a way to test these hooks/modules in our regression
tests, which we should probably do so that we at least know when we
break them.

Let's put pg_stat_statements aside for a minute- what's the reasoning
behind requireing users to run CREATE EXTENTION to utilize the functions
offered by amcheck?  pgstattuple?  pg_buffercache?  At least with
pg_stat_statements there's good justification, based on performance, for
wanting an on/off switch, but for just about all the other ones, all it
is is a few entries in a catalog table that we're talking about...

Thanks,

Stephen

Attachment

signature.asc

Re: Proposal: roll pg_stat_statements into core

From

David Fetter

Date:

04 September 2019, 09:16:36

On Wed, Sep 04, 2019 at 12:14:50AM -0400, Tom Lane wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> >> I think there is a false dichotomy here.  Migrating an extension out of
> >> contrib doesn't have to equate making it no longer an extension.  We
> >> could, instead, keep it being an extension, but move it out of contrib
> >> and into (say) src/extensions/ so that it becomes installed and
> >> available by default, but still an extension.  Users won't have to
> >> install a separate contrib package, but they would still have to run
> >> CREATE EXTENSION.
> 
> > I don't agree that it's a false dichotomy- from a user experience, you
> > aren't really changing anything with this approach and the entire point
> > is that most of these things should just be available in core.
> 
> I don't think that auto-installing such things requires changing much
> of anything.  See plpgsql, which is auto-installed now but still sits
> in src/pl/ alongside other PLs that are not auto-installed.  Similarly,
> there's nothing really stopping us from choosing to install some
> module of contrib by default; rearranging the source tree is not a
> prerequisite to that.
> 
> The situation with pg_stat_statements is more than that, since what
> David is requesting is not merely that we auto-install that extension
> but that we automatically push it into shared_preload_libraries.

What I am actually suggesting is that it not be a separate library at
all, i.e. that all its parts live in PostgreSQL proper.

> > ...maybe we need a way to turn on/off things like pg_stat_statements but
> > that should have been a runtime "track_stat_statements" or some such,
> > similar to other things like "track_io_timing", not an "independent"
> > extension that is actually developed, managed, and released just as core
> > is.
> 
> A key point that hasn't been highlighted in this discussion is that having
> pg_stat_statements as an extension is proof-of-concept that such features
> *can* be implemented outside of core.  Don't you think that there are
> probably people maintaining private variants of that extension, who would
> be really sad if we removed or broke APIs they need for it once
> pg_stat_statements is part of core?

Nowhere near the number of people who are inconvenienced, at a
minimum, by the high barriers needed in order to install and use it in
its current form.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Proposal: roll pg_stat_statements into core

From

Andres Freund

Date:

04 September 2019, 12:34:03

Hi,

On 2019-09-02 12:07:17 -0400, Tom Lane wrote:
> Euler Taveira <euler@timbira.com.br> writes:
> > At least if pg_stat_statements
> > was in core you could load it by default and have a GUC to turn it
> > on/off without restarting the server (that was Magnus proposal and
> > Andres agreed).
> 
> That assertion is 100% bogus.  To turn it on or off on-the-fly,
> you'd need some way to acquire or release its shared memory
> on-the-fly.

Not necessarily - in practice the issue of pg_stat_statements having
overhead isn't about the memory overhead, but the CPU overhead. And that
we can nearly entirely disable/enable without acquiring / releasing
shared memory already, by setting pg_stat_statements.track to NONE.

There's still the overhead of being indirected through the pgss_* hooks,
which then have to figure out that pgss is disabled. But that's
obviously much smaller than the overhead of pgss itself.  Although I do
wonder if it's time that we make hook registration/unregistration a bit
more flexible - it's basically impossible to unregister hooks right now
because the chain of hooks is distributed over multiple files. If we
instead provided a few helper functions to register hooks we could allow
deregistering as well.

> It's probably now possible to do something like that, using the
> DSM mechanisms, but the code for it hasn't been written (AFAIK).
> And it wouldn't have much to do with whether the module was
> in core or stayed where it is.

I think it'd be good to add DSM support for pg_stat_statements - but as
you say that's largely independent of it being mainlined.

> Another concern that I have about moving pg_stat_statements
> into core is that doing so would effectively nail down
> Query.queryId as belonging to pg_stat_statements, whereas
> currently it's possible for other plugins to commandeer that
> if they wish.  This isn't academic, because of the fact that
> not everybody is satisfied with the way pg_stat_statements
> defines queryId [1].

I'm not sure I see this as that big an issue - we could have a few
different query ids for each query. If it's a problem that there's
potentially different desirable semantics for queryId, then it's also an
issue that we can have only one.

Greetings,

Andres Freund

Re: Proposal: roll pg_stat_statements into core

From

Alvaro Herrera from 2ndQuadrant

Date:

05 September 2019, 19:07:55

On 2019-Sep-02, Euler Taveira wrote:

> At least if pg_stat_statements
> was in core you could load it by default and have a GUC to turn it
> on/off without restarting the server (that was Magnus proposal and
> Andres agreed). I support this idea.

Actually this is possible without moving to part of core, too -- change
things so that putting it in shared_preload_libraries reserves the shmem
area but does not enable collection, then have a separate GUC that
enables/disables collection.  A bit like archive_mode=on,
archive_command=/bin/true, which allows to change archive_command
without restarting.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services