Thread: Extensions, patch v18 (merge against master, bitrot-only-fixes)

Extensions, patch v18 (merge against master, bitrot-only-fixes)

From
Dimitri Fontaine
Date:
Hi,

Well $subject says about it all really. The bitrot of course comes from
the fact that the last in-commitfest-dependency has been commited in,
and I kept a version of pg_execute_sql_file() in the extension's patch.

Following the discussions, and as exposing this function is not
necessary for CREATE EXTENSION to work as needed, it's not exposed at
the SQL level anymore.

Please note that the SQL scripts seem to be encoded in latin9. I don't
think there's a policy about that already, and I'd like to get
confirmation about that before to go ahead and add encoding = latin9 to
the control files.  Whatever the master sources encoding policy is, I
think explicitely marking it in the control files will be a good idea.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Attachment

Re: Extensions, patch v18 (merge against master, bitrot-only-fixes)

From
Robert Haas
Date:
On Thu, Dec 16, 2010 at 7:49 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> Please note that the SQL scripts seem to be encoded in latin9.

Seems like an odd choice.  Why not UTF-8?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Extensions, patch v18 (merge against master, bitrot-only-fixes)

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:

> On Thu, Dec 16, 2010 at 7:49 AM, Dimitri Fontaine
> <dimitri@2ndquadrant.fr> wrote:
>> Please note that the SQL scripts seem to be encoded in latin9.
>
> Seems like an odd choice.  Why not UTF-8?

Not a choice, just what's already in…

--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Extensions, patch v18 (merge against master, bitrot-only-fixes)

From
Robert Haas
Date:
On Thu, Dec 16, 2010 at 9:04 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Thu, Dec 16, 2010 at 7:49 AM, Dimitri Fontaine
>> <dimitri@2ndquadrant.fr> wrote:
>>> Please note that the SQL scripts seem to be encoded in latin9.
>>
>> Seems like an odd choice.  Why not UTF-8?
>
> Not a choice, just what's already in…

Sure, I get it.  I'm guessing that many of the scripts will work in a
wide variety of encodings because they're a subset of ASCII.  Should
we think about converting the others to UTF-8, or is that a bad idea?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Extensions, patch v18 (merge against master, bitrot-only-fixes)

From
Nicolas Barbier
Date:
2010/12/16 Robert Haas <robertmhaas@gmail.com>:

> On Thu, Dec 16, 2010 at 7:49 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
>
>> Please note that the SQL scripts seem to be encoded in latin9.
>
> Seems like an odd choice.  Why not UTF-8?

Latin 9 = ISO 8859-15 = a more modern version of Latin 1 (ISO 8859-1),
which includes the € symbol for example. I.e., it's not that weird. As
long as there are no non-ASCII characters, it seems even likely that
some tools might detect a simple ASCII text file as Latin 9 by
default.

Nicolas


Re: Extensions, patch v18 (merge against master, bitrot-only-fixes)

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Dec 16, 2010 at 9:04 AM, Dimitri Fontaine
> <dimitri@2ndquadrant.fr> wrote:
>>>> Please note that the SQL scripts seem to be encoded in latin9.

>>> Seems like an odd choice. �Why not UTF-8?

>> Not a choice, just what's already in�

> Sure, I get it.  I'm guessing that many of the scripts will work in a
> wide variety of encodings because they're a subset of ASCII.  Should
> we think about converting the others to UTF-8, or is that a bad idea?

I would think that we want to establish the same policy as we have for
dictionary files: they're assumed to be UTF-8.  I don't believe there
should be an encoding option at all.  If we didn't need one for
dictionary files, there is *surely* no reason why we have to have one
for extension SQL files.
        regards, tom lane


Tom Lane <tgl@sss.pgh.pa.us> writes:
> I would think that we want to establish the same policy as we have for
> dictionary files: they're assumed to be UTF-8.  I don't believe there
> should be an encoding option at all.  If we didn't need one for
> dictionary files, there is *surely* no reason why we have to have one
> for extension SQL files.

Brain-fart from me in the v18, that I've produced while being pressed by
other distractions. Fixed now in the attached, v19. Sorry about that, I
was too eager to produce the no-moving-parts "final" patch. Time to find
this quote about parallelism and time lost I guess.

So, attached patch fixes the v18 regression wrt to script file encoding
and establish UTF-8 as the default encoding to consider to read a script
file. Thanks for your comments.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Attachment

Re: Extensions, patch v18 (merge against master, bitrot-only-fixes)

From
"David E. Wheeler"
Date:
On Dec 16, 2010, at 8:19 AM, Tom Lane wrote:

> I would think that we want to establish the same policy as we have for
> dictionary files: they're assumed to be UTF-8.  I don't believe there
> should be an encoding option at all.  If we didn't need one for
> dictionary files, there is *surely* no reason why we have to have one
> for extension SQL files.

+1 KISS.

David



Re: Extensions, patch v18 (merge against master, bitrot-only-fixes)

From
Alvaro Herrera
Date:
Excerpts from Dimitri Fontaine's message of jue dic 16 09:49:31 -0300 2010:
> Hi,
> 
> Well $subject says about it all really. The bitrot of course comes from
> the fact that the last in-commitfest-dependency has been commited in,
> and I kept a version of pg_execute_sql_file() in the extension's patch.

I thought the suggestion of having "version = 9.1devel" line in each
contrib's module extension file was a joke.  It is certainly not going
to be sustainable in the long run -- I don't think we want to be
modifying all control files each release.  We need to find a better way
to fix this.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Extensions, patch v18 (merge against master, bitrot-only-fixes)

From
Dimitri Fontaine
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> I thought the suggestion of having "version = 9.1devel" line in each
> contrib's module extension file was a joke.  It is certainly not going
> to be sustainable in the long run -- I don't think we want to be
> modifying all control files each release.  We need to find a better way
> to fix this.

+1.

Naively enough, getting this from the Makefile looked obvious to me.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Extensions, patch v18 (merge against master, bitrot-only-fixes)

From
Dimitri Fontaine
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> I thought the suggestion of having "version = 9.1devel" line in each
> contrib's module extension file was a joke.  It is certainly not going
> to be sustainable in the long run -- I don't think we want to be
> modifying all control files each release.  We need to find a better way
> to fix this.

Consider also the use case where we only update contrib module minor
version when they do receive an update in the tree. If we get to do
that, maybe it's good enough with plain .control files.

Of course having some Makefile or other support doesn't prevent anything
here, it just allows some more flexibility. That was considered some
more complexity not solving any real world use-case on-list, though.

FWIW, it seems very likely that should we ship without the Make support
I'll have to add it to every extension I maintain, separately.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Extensions, patch v18 (merge against master, bitrot-only-fixes)

From
Tom Lane
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> I thought the suggestion of having "version = 9.1devel" line in each
>> contrib's module extension file was a joke.  It is certainly not going
>> to be sustainable in the long run -- I don't think we want to be
>> modifying all control files each release.  We need to find a better way
>> to fix this.

> Naively enough, getting this from the Makefile looked obvious to me.

Putting those numbers in the Makefile instead of the control file surely
does nothing to alleviate Alvaro's maintenance concern.

However, the only way I can see to fix this "automatically" is to have
the makefiles propagate PG_VERSION_NUM (or one of the other values set
by configure) into generated control files.  I don't think that's what
we want either.  If we do that, then people are going to be forced to
go through an ALTER EXTENSION UPGRADE cycle whether or not anything
actually changed in the extension's SQL definitions.  We really only
want the extension's SQL version to change when there was a meaningful
change in the SQL commands.

I'm not sure that I see a better answer than hand-maintained version
numbers in each extension SQL file.  But if that's where we're going,
they should be in the SQL files, not in either the Makefiles or control
files.
        regards, tom lane


Re: Extensions, patch v18 (merge against master, bitrot-only-fixes)

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> However, the only way I can see to fix this "automatically" is to have
> the makefiles propagate PG_VERSION_NUM (or one of the other values set
> by configure) into generated control files.

Ah, somewhat like what I was asked to remove from the patch, right?

-EXTVERSION = $(VERSION)

>  I don't think that's what
> we want either.  If we do that, then people are going to be forced to
> go through an ALTER EXTENSION UPGRADE cycle whether or not anything
> actually changed in the extension's SQL definitions.  We really only
> want the extension's SQL version to change when there was a meaningful
> change in the SQL commands.

Well that's just a facility, and very useful for people that have a Make
variable where the extension's version is already held. This problem of
having a VERSION that moves when the extension possibly didn't change
seems pretty specific to PostgreSQL, not typical for extensions authors.

> I'm not sure that I see a better answer than hand-maintained version
> numbers in each extension SQL file.  But if that's where we're going,
> they should be in the SQL files, not in either the Makefiles or control
> files.

You lost me. Are you wanting extension authors to maintain meta data in
the SQL script, with some new syntax support, or maybe from calling a
function? How do you think this will play with upgrading?

That doesn't sound any simpler to me. The whole goal of the control file
is to be able to register the extension in the catalog and get an OID
there to manage the dependencies (so that we get a single command per
extension in pg_dump output). The first design proposed an SQL command
to fill in the catalog, but that command must be separate from what the
DBA will have to type in to have the server execute the script, so this
idea has been beaten by the control file idea (thanks goes to Heikki).

Now, we could go with the current patch for starters and add some
facilities around 9.2 when we have experienced the maintenance
burden. And extension authors will probably be asking for some more
facilities too, by then.


Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Extensions, patch v18 (merge against master, bitrot-only-fixes)

From
Alvaro Herrera
Date:
Excerpts from Tom Lane's message of jue dic 16 17:10:10 -0300 2010:

> However, the only way I can see to fix this "automatically" is to have
> the makefiles propagate PG_VERSION_NUM (or one of the other values set
> by configure) into generated control files.  I don't think that's what
> we want either.  If we do that, then people are going to be forced to
> go through an ALTER EXTENSION UPGRADE cycle whether or not anything
> actually changed in the extension's SQL definitions.  We really only
> want the extension's SQL version to change when there was a meaningful
> change in the SQL commands.

I've been wondering if we should stop thinking that each contrib's
module version is the same as the backend version.  What if we declare
them all 1.0.0 for the 9.1 release, and increment the version manually
each time there's a change?  So a module that stays unchanged for 9.2
would still be 1.0.0.  (The .so file would still need to be recompiled
for the new server version, because of PG_MODULE_MAGIC.)

In that case, having hand-maintained version numbers in control or
wherever is not so much of a problem; the committer that touches the
module needs to ensure that the version number is incremented too.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Extensions, patch v18 (merge against master, bitrot-only-fixes)

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Excerpts from Tom Lane's message of jue dic 16 17:10:10 -0300 2010:
>> However, the only way I can see to fix this "automatically" is to have
>> the makefiles propagate PG_VERSION_NUM (or one of the other values set
>> by configure) into generated control files.  I don't think that's what
>> we want either.  If we do that, then people are going to be forced to
>> go through an ALTER EXTENSION UPGRADE cycle whether or not anything
>> actually changed in the extension's SQL definitions.  We really only
>> want the extension's SQL version to change when there was a meaningful
>> change in the SQL commands.

> I've been wondering if we should stop thinking that each contrib's
> module version is the same as the backend version.

Right, they would have to be decoupled if you believe what I said above.

> In that case, having hand-maintained version numbers in control or
> wherever is not so much of a problem; the committer that touches the
> module needs to ensure that the version number is incremented too.

It would be tolerable, if perhaps error-prone.  But we've seldom blown
it on catversion, and this would be a comparable requirement.
        regards, tom lane


Re: Extensions, patch v18 (merge against master, bitrot-only-fixes)

From
Dimitri Fontaine
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> However, the only way I can see to fix this "automatically" is to have
>> the makefiles propagate PG_VERSION_NUM (or one of the other values set
>> by configure) into generated control files.
>
> Ah, somewhat like what I was asked to remove from the patch, right?

I've been pointed off-list that this message ain't conveying the meaning
I'm attaching it, sorry about that. What I mean is that should we change
our opinion again the code to do that has already been written.

Allow me to insist on "we": it's not like I feel forced into changing
the design again and again, I realise I'm acting eagerly upon group
decision making steps before to ensure it's the final step. Well, we all
have been reading over-stressed exchanges on this list before, right? :)

Will now step back on the topic, or try to...

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


On Fri, Dec 17, 2010 at 02:00, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> So, attached patch fixes the v18 regression wrt to script file encoding
> and establish UTF-8 as the default encoding to consider to read a script
> file. Thanks for your comments.

You probably compared wrong versions of source trees. The patch contains
many diffs not related to EXTENSION. It cannot be applied cleanly.

BTW, only earthdistance.sql.in has @extschema@.
Didn't you forget to remove it?

--- b/contrib/earthdistance/earthdistance.sql.in
! SET search_path = @extschema@;


I've not read the patch well yet, but I'd like to make sure of
two specs in the patch:

#1. I found the patch specifies "version" in each control file. We will
need to maintain them manually, but I assume it was a conclusion in the
discussion for v18 patch. So, no more changes are required here.

--- b/contrib/XXX/XXX.control
+ version = '9.1devel'

#2. The patch replaces "\i XXX.sql" to "CREATE EXTENSION XXX". They are a
bit different because CREATE EXTENSION uses the installed sql files instead
of the source directory. But I think this is the correct behavior. We should
have used only installed files because they are used in "make *installcheck*".

*** a/contrib/XXX/sql/XXX.sql
*************** SET client_min_messages = warning;
! \set ECHO none
! \i XXX.sql
! \set ECHO all RESET client_min_messages;
--- 7,13 ---- SET client_min_messages = warning;
! CREATE EXTENSION XXX; RESET client_min_messages;

-- 
Itagaki Takahiro


Re: Extensions, patch v19 (encoding brainfart fix)

From
Dimitri Fontaine
Date:
Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:
> You probably compared wrong versions of source trees. The patch contains
> many diffs not related to EXTENSION. It cannot be applied cleanly.

Ouch, sorry about that, will revise soon'ish.  Meanwhile the git repo is
still available here:
 http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=shortlog;h=refs/heads/extension

> BTW, only earthdistance.sql.in has @extschema@.
> Didn't you forget to remove it?

No. It so happens that earthdistance depends on cube and that we agreed
on not implement dependencies between extensions for the first patch.
The problem here is that ALTER EXTENSION earthdistance SET SCHEMA foo;
would relocate cube objects too, because there's nothing to stop the
recursive walking at the right spot.  So earthdistance.control states
that earthdistance is not relocatable for now.

And a non relocatable extension will use @extschema@ so that users still
get a say in where to install it…

> I've not read the patch well yet, but I'd like to make sure of
> two specs in the patch:
>
> #1. I found the patch specifies "version" in each control file. We will
> need to maintain them manually, but I assume it was a conclusion in the
> discussion for v18 patch. So, no more changes are required here.

Exactly, the consensus seems to be that we *want* to maintain separate
versions number for each contrib.

> #2. The patch replaces "\i XXX.sql" to "CREATE EXTENSION XXX". They are a
> bit different because CREATE EXTENSION uses the installed sql files instead
> of the source directory. But I think this is the correct behavior. We should
> have used only installed files because they are used in "make *installcheck*".

Yeah.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:
>> You probably compared wrong versions of source trees. The patch contains
>> many diffs not related to EXTENSION. It cannot be applied cleanly.
>
> Ouch, sorry about that, will revise soon'ish.  Meanwhile the git repo is
> still available here:
>
>   http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=shortlog;h=refs/heads/extension

Here it is, attached. The problem was of course due to git branches and
missing local pulling and merging. Going gradually from 7 to 2 branches
causes that, sometimes.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Attachment
On Sat, Dec 18, 2010 at 2:39 PM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> Here it is, attached. The problem was of course due to git branches and
> missing local pulling and merging. Going gradually from 7 to 2 branches
> causes that, sometimes.

I spent a little time looking at this tonight.  I'm going to give you
the same general advice that I've given other people who have
submitted very large patches of this type: it'll be a lot easier to
get this committed if we can break it into smaller pieces.  A pretty
easy thing to do would be to separate the changes that turn the
existing contrib modules into extensions into its own patch.  A
possibly more controversial idea is to actually remove the notion of a
relocatable extension from this patch, and all the supporting
machinery, and make that a follow-on patch.  I think it was arranged
like that before and somehow got collapsed, but maybe the notion is
worth revisiting, because this is a lot of code:
224 files changed, 4713 insertions(+), 293 deletions(-)

That issue aside, I think there is still a fair amount of cleanup and
code review that needs to happen here before this can be considered
for commit.  Here are a few things I noticed on a first read-through:

- pg_execute_sql_string() and pg_execute_sql_file() are documented,
but are not actually part of the patch.

- The description of the NO USER DATA option is almost content-free.
It basically says "this option is here because it wouldn't work if we
didn't have this option".

- listDependentObjects() appears to be largely cut-and-pasted from
some other function.  It calls AcquireDeletionLock() and the comments
mention constructing a list of objects to delete.  It sure doesn't
seem right to be acquiring AccessExclusiveLocks on lots of things in a
function that's there to support the pg_extension_objects SRF.

- This bit certainly violates our message style guidelines:

+       elog(NOTICE,
+                "Installing extension '%s' from '%s', in schema %s,
%s user data",
+                stmt->extname,
+                get_extension_absolute_path(control->script),
+                schema ? schema : "public",
+                create_extension_with_user_data ? "with" : "without");

User-facing messages need to be ereport, not elog, so they can be
translated, and mustn't be assembled from pieces, as you've done with
"%s user data".

- Has the issue of changing custom_variable_classes from PGC_SIGHUP to
PGC_SUSET been discussed?  I am not sure whether that's an OK thing to
do.  If it is OK, then the documentation also needs updating.

- This comment looks like leftovers:

+       /* FIXME: add PGC_EXTENSION so that we don't abuse PGC_SIGHUP here? */
+       SetConfigOption("custom_variable_classes",
+                                       newval, PGC_SIGHUP, PGC_S_EXTENSION);

Apologies if I missed the previous discussion of this, but why are we
adding a new GUC context?

- Did we decide to ditch the encoding parameter for extension scripts
and mandate UTF-8?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


On Dec 18, 2010, at 7:04 PM, Robert Haas wrote:

> - Did we decide to ditch the encoding parameter for extension scripts
> and mandate UTF-8?

+1

It was certainly suggested. I think it's a good idea, at least with a first cut.

Best,

David

On Sat, Dec 18, 2010 at 10:04 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> - Has the issue of changing custom_variable_classes from PGC_SIGHUP to
> PGC_SUSET been discussed?  I am not sure whether that's an OK thing to
> do.  If it is OK, then the documentation also needs updating.
>
> - This comment looks like leftovers:
>
> +       /* FIXME: add PGC_EXTENSION so that we don't abuse PGC_SIGHUP here? */
> +       SetConfigOption("custom_variable_classes",
> +                                       newval, PGC_SIGHUP, PGC_S_EXTENSION);
>
> Apologies if I missed the previous discussion of this, but why are we
> adding a new GUC context?

Looking at this a little more, I am inclined to think that
ExtensionSetCVC() is entirely unacceptable.  Our backend startup is
high enough already.  Sequential scanning the pg_extension catalog on
every startup to spare the DBA the trouble of setting up
postgresql.conf strikes me as a bad trade-off.  If you were to come
back and say that custom_variable_classes is a vile hack from the
darkest reaches of wherever vile hacks originate from, I would agree
with you.  Having a GUC that controls what names can be used as GUCs
is probably a bad design, and I'd like to come up with something
better.  But I don't think this is it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Extensions, patch v20 (bitrot fixes)

From
Dimitri Fontaine
Date:
Hi,

Thanks for your review and your time. Trying to answer some of your
points there:

Robert Haas <robertmhaas@gmail.com> writes:
> I spent a little time looking at this tonight.  I'm going to give you
> the same general advice that I've given other people who have
> submitted very large patches of this type: it'll be a lot easier to
> get this committed if we can break it into smaller pieces.  A pretty
> easy thing to do would be to separate the changes that turn the
> existing contrib modules into extensions into its own patch.

Those are pretty mechanical, so applying them after would be quite
easy. That said I don't know exactly how much it would ease the review,
but if you say it does, I'll separate the patches.

>  A
> possibly more controversial idea is to actually remove the notion of a
> relocatable extension from this patch, and all the supporting
> machinery, and make that a follow-on patch.  I think it was arranged
> like that before and somehow got collapsed, but maybe the notion is
> worth revisiting, because this is a lot of code:
>
>  224 files changed, 4713 insertions(+), 293 deletions(-)

It was like that before indeed. The problem is how to reliably implement
the CREATE EXTENSION WITH SCHEMA, which is like the second best thing
about all this infrastructure work just behind the pg_dump support.  I'm
not clear that this option should come on its own in a later patch or
that doing so simplifies anything, but not opinionated about that: let's
organise ourselves as best as we are able to.

Again, will split the code in a third patch if that's what's necessary.

> That issue aside, I think there is still a fair amount of cleanup and
> code review that needs to happen here before this can be considered
> for commit.  Here are a few things I noticed on a first read-through:
>
> - pg_execute_sql_string() and pg_execute_sql_file() are documented,
> but are not actually part of the patch.

Ouch, git ain't that magic after all. Will clean-up next version(s).

> - The description of the NO USER DATA option is almost content-free.
> It basically says "this option is here because it wouldn't work if we
> didn't have this option".

+         (see <xref linkend="functions-extension">), this option allow
+         extension authors to prepare installation scripts that will avoid
+         creating user data when restoring.

I'd need some specifics here to be able to do much better. My guess is
that an example is what needed, and it would fit in the main section of
the manual about it, 35.16. Putting it all together: create extension.

> - listDependentObjects() appears to be largely cut-and-pasted from
> some other function.  It calls AcquireDeletionLock() and the comments
> mention constructing a list of objects to delete.  It sure doesn't
> seem right to be acquiring AccessExclusiveLocks on lots of things in a
> function that's there to support the pg_extension_objects SRF.

Ah well, true. AccessShareLock seems the minimum we can take, as we
surely want to prevent the extension and its objects disappearing under
us, right?

> - This bit certainly violates our message style guidelines:
>
> +       elog(NOTICE,
> +                "Installing extension '%s' from '%s', in schema %s,
> %s user data",
> +                stmt->extname,
> +                get_extension_absolute_path(control->script),
> +                schema ? schema : "public",
> +                create_extension_with_user_data ? "with" : "without");
>
> User-facing messages need to be ereport, not elog, so they can be
> translated, and mustn't be assembled from pieces, as you've done with
> "%s user data".

This message is pretty useful for debug and review of the patch, but I'm
not sure we want to keep after that...

> - Has the issue of changing custom_variable_classes from PGC_SIGHUP to
> PGC_SUSET been discussed?  I am not sure whether that's an OK thing to
> do.  If it is OK, then the documentation also needs updating.

I've been talking about it since the very firsts versions of the patch
on this list but didn't receive much answer. You have a detailed mail
about that later in the thread, will start a new thread about that from
there.

> - This comment looks like leftovers:
>
> +       /* FIXME: add PGC_EXTENSION so that we don't abuse PGC_SIGHUP here? */
> +       SetConfigOption("custom_variable_classes",
> +                                       newval, PGC_SIGHUP, PGC_S_EXTENSION);
>
> Apologies if I missed the previous discussion of this, but why are we
> adding a new GUC context?

Well we're using PGC_SIGHUP and PGC_S_EXTENSION, should we use
PGC_EXTENSION too is what the comment asks, so it's not very clear but
not a leftover. We're adding a new GUC source here, not a new context.

Let's include that in the new thread about cvc, though.

> - Did we decide to ditch the encoding parameter for extension scripts
> and mandate UTF-8?

No we didn't, we decided that the default encoding is UTF-8 and that any
contrib script defaults to UTF-8, so that it's not necessary to care
about the 'encoding' parameter in the control file there.

Itagaki required that the extension code is encoding aware, and it
wasn't clear that the control file 'encoding' parameter would be enough
in his side of the world, so the encoding support is currently offered
to authors in the control file and users still can override it in the
CREATE EXTENSION command.

I'm about sure that we don't want to remove that support, and I think
that the command part of it ain't required, but that's for people with
complex encoding problems to tell us IMO.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Extensions and custom_variable_classes (was: Extensions, patch v20 (bitrot fixes))

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Looking at this a little more, I am inclined to think that
> ExtensionSetCVC() is entirely unacceptable.  Our backend startup is
> high enough already.  Sequential scanning the pg_extension catalog on
> every startup to spare the DBA the trouble of setting up
> postgresql.conf strikes me as a bad trade-off.

That seems like a reasonable viewpoint, but...

>  If you were to come
> back and say that custom_variable_classes is a vile hack from the
> darkest reaches of wherever vile hacks originate from, I would agree
> with you.  Having a GUC that controls what names can be used as GUCs
> is probably a bad design, and I'd like to come up with something
> better.  But I don't think this is it.

... users IMO should not be concerned with custom_variable_classes at
all. The only users that know about it have already written an extension
in C. If some are using it in another context then even with my edits
they still can do so.

Now, for people following but not reading the patch, what's in is that
in order for extensions using custom_variable_classes to work without
the user having to care about it, I've added an step at backend startup
time to seqscan pg_extension and update custom_variable_classes from
this catalog.

When adding new entries to custom_variable_classes, known invalid
placeholders used to raise an error, that's no longer the case, we keep
them aside, as this comment in guc-file.l tells:
 * 1. The class name is not valid according to the (new) setting * of custom_variable_classes.  If so, we would reject,
butto * support custom_variable_classes being PGC_SUSET, we maintain * a list of not-yet-valid items. * * This list
willbe processed at assign_custom_variable_classes * time: each time cvc changes, we have candidates to * consider.
Sucha time is for example ExtensionSetCVC() from * backend post init, where we read additional cvc in * pg_extension
catalog.

Now, I can see this mechanism evolving in several directions. Either we
remove it entirely, or we add a list ok known custom_variable_classes
and SINVAL support for it so that as soon as the setting changes, all
concurrent backends know about it, and so that you have a catalog cache
to walk through rather than a real catalog to seqscan and backend init.

Other ideas?
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Extensions, patch v20 (bitrot fixes)

From
Robert Haas
Date:
On Sun, Dec 19, 2010 at 5:30 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I spent a little time looking at this tonight.  I'm going to give you
>> the same general advice that I've given other people who have
>> submitted very large patches of this type: it'll be a lot easier to
>> get this committed if we can break it into smaller pieces.  A pretty
>> easy thing to do would be to separate the changes that turn the
>> existing contrib modules into extensions into its own patch.
>
> Those are pretty mechanical, so applying them after would be quite
> easy. That said I don't know exactly how much it would ease the review,
> but if you say it does, I'll separate the patches.

Well, I don't know who is ultimately going to end up applying this.  I
can only say what helps me.  If somebody else wants to jump in here
and say they're good to review and commit the whole thing in one go,
so be it...

>>  A
>> possibly more controversial idea is to actually remove the notion of a
>> relocatable extension from this patch, and all the supporting
>> machinery, and make that a follow-on patch.  I think it was arranged
>> like that before and somehow got collapsed, but maybe the notion is
>> worth revisiting, because this is a lot of code:
>>
>>  224 files changed, 4713 insertions(+), 293 deletions(-)
>
> It was like that before indeed. The problem is how to reliably implement
> the CREATE EXTENSION WITH SCHEMA, which is like the second best thing
> about all this infrastructure work just behind the pg_dump support.  I'm
> not clear that this option should come on its own in a later patch or
> that doing so simplifies anything, but not opinionated about that: let's
> organise ourselves as best as we are able to.

I'm not totally certain of it either, but I think it might.

>> - The description of the NO USER DATA option is almost content-free.
>> It basically says "this option is here because it wouldn't work if we
>> didn't have this option".
>
> +         (see <xref linkend="functions-extension">), this option allow
> +         extension authors to prepare installation scripts that will avoid
> +         creating user data when restoring.
>
> I'd need some specifics here to be able to do much better. My guess is
> that an example is what needed, and it would fit in the main section of
> the manual about it, 35.16. Putting it all together: create extension.

No, I think you need to describe what the option actually does, as
opposed to what problem it solves.

>> - listDependentObjects() appears to be largely cut-and-pasted from
>> some other function.  It calls AcquireDeletionLock() and the comments
>> mention constructing a list of objects to delete.  It sure doesn't
>> seem right to be acquiring AccessExclusiveLocks on lots of things in a
>> function that's there to support the pg_extension_objects SRF.
>
> Ah well, true. AccessShareLock seems the minimum we can take, as we
> surely want to prevent the extension and its objects disappearing under
> us, right?

On two minutes thought, I'm not entirely sure that we need to take
locks at all here, but if we do, AccessShareLock is certainly the
right one.  One idea I have is that we might want to move
AcquireDeletionLock to objectaddress.c or maybe even somewhere in
storage/lmgr, rename it to something like LockObjectByAddress, and
give it a second argument for the lock strength.

>> - This bit certainly violates our message style guidelines:
>>
>> +       elog(NOTICE,
>> +                "Installing extension '%s' from '%s', in schema %s,
>> %s user data",
>> +                stmt->extname,
>> +                get_extension_absolute_path(control->script),
>> +                schema ? schema : "public",
>> +                create_extension_with_user_data ? "with" : "without");
>>
>> User-facing messages need to be ereport, not elog, so they can be
>> translated, and mustn't be assembled from pieces, as you've done with
>> "%s user data".
>
> This message is pretty useful for debug and review of the patch, but I'm
> not sure we want to keep after that...

Either fix it or take it out.  If you're proposing this for commit, it
shouldn't contain anything you don't want committed.

>> - Has the issue of changing custom_variable_classes from PGC_SIGHUP to
>> PGC_SUSET been discussed?  I am not sure whether that's an OK thing to
>> do.  If it is OK, then the documentation also needs updating.
>
> I've been talking about it since the very firsts versions of the patch
> on this list but didn't receive much answer. You have a detailed mail
> about that later in the thread, will start a new thread about that from
> there.

Sorry, I missed that discussion.  As you know I try to follow -hackers
pretty closely, but apparently not closely enough in this case.  Can
you provide links to the archives?

>> - This comment looks like leftovers:
>>
>> +       /* FIXME: add PGC_EXTENSION so that we don't abuse PGC_SIGHUP here? */
>> +       SetConfigOption("custom_variable_classes",
>> +                                       newval, PGC_SIGHUP, PGC_S_EXTENSION);
>>
>> Apologies if I missed the previous discussion of this, but why are we
>> adding a new GUC context?
>
> Well we're using PGC_SIGHUP and PGC_S_EXTENSION, should we use
> PGC_EXTENSION too is what the comment asks, so it's not very clear but
> not a leftover. We're adding a new GUC source here, not a new context.
>
> Let's include that in the new thread about cvc, though.

OK.

>> - Did we decide to ditch the encoding parameter for extension scripts
>> and mandate UTF-8?
>
> No we didn't, we decided that the default encoding is UTF-8 and that any
> contrib script defaults to UTF-8, so that it's not necessary to care
> about the 'encoding' parameter in the control file there.
>
> Itagaki required that the extension code is encoding aware, and it
> wasn't clear that the control file 'encoding' parameter would be enough
> in his side of the world, so the encoding support is currently offered
> to authors in the control file and users still can override it in the
> CREATE EXTENSION command.
>
> I'm about sure that we don't want to remove that support, and I think
> that the command part of it ain't required, but that's for people with
> complex encoding problems to tell us IMO.

Oh, I wasn't aware that Itagaki-san had objected to Tom's proposal.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Extensions, patch v20 (bitrot fixes)

From
Itagaki Takahiro
Date:
>>> - Did we decide to ditch the encoding parameter for extension scripts
>>> and mandate UTF-8?
>>
>> No we didn't, we decided that the default encoding is UTF-8 and that any
>> contrib script defaults to UTF-8, so that it's not necessary to care
>> about the 'encoding' parameter in the control file there.
>>
>> Itagaki required that the extension code is encoding aware, and it
>> wasn't clear that the control file 'encoding' parameter would be enough
>> in his side of the world, so the encoding support is currently offered
>> to authors in the control file and users still can override it in the
>> CREATE EXTENSION command.

I think 'encoding' parameter is enough. Of course embedding encoding
specifiers in SQL files are better, but there would be technical problems.
(Just for reference: http://www.python.org/dev/peps/pep-0263/ )

>> I'm about sure that we don't want to remove that support, and I think
>> that the command part of it ain't required, but that's for people with
>> complex encoding problems to tell us IMO.
>
> Oh, I wasn't aware that Itagaki-san had objected to Tom's proposal.

I agree that "the default encoding is UTF-8", but it should be
configurable by the 'encoding' parameter in control files.

If we use UTF-8 as the the default encoding, we need to treat
3 encodings at once (server, client, and UTF-8) anyway.
So, I think no additional complexity will be added even if we
support a  configurable encoding as the third encoding.

-- 
Itagaki Takahiro


On Sun, Dec 19, 2010 at 5:41 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> ... users IMO should not be concerned with custom_variable_classes at
> all. The only users that know about it have already written an extension
> in C.

I agree with the goal.

> Now, I can see this mechanism evolving in several directions. Either we
> remove it entirely, or we add a list ok known custom_variable_classes
> and SINVAL support for it so that as soon as the setting changes, all
> concurrent backends know about it, and so that you have a catalog cache
> to walk through rather than a real catalog to seqscan and backend init.

It'd certainly be better to jigger this so that we do syscache lookups
for each CVC that is actually needed rather than seq-scanning the
whole catalog.  Even that is going to be slower than the present
method, but at least the overhead is proportional to the number of
CVCs you're actually using - and in particular, it's zero if no CVCs
are in use, which must be regarded as one of the common cases.

But I'm not sure that by itself is enough to save this proposal.
Right now, custom_variable_classes is a server-wide setting.  But what
happens if you install an extension into database A and then set a
related configuration parameter in postgresql.conf?  When connecting
to database A, things are OK.  But as soon as someone tries to connect
to database B, the wheels come off.  Either the connection fails
(which seems pretty undesirable) or it works but we don't process
those GUCs (which is outright broken if the extension requires that
every backend see the same value - think for example of a
PGC_POSTMASTER option controlling shared memory allocation).

My gut feeling at the moment is that, given enough time and thought,
there may well be a way to work through all of this and come up with a
design that works.  But we're pretty tight on time right now, and even
if we had a perfect design today, I think I'd still argue for putting
it in a separate patch.  I think that the value of extensions is first
and foremost that they can simplify installing, removing, dumping, and
restoring extensions.  I'd rather see us nail that, and then worry
about custom_variable_classes as a separate issue, likely for 9.2 or
beyond.  Otherwise, I am afraid (as I am for all the big patches we
have in the works at this point) that we may end up with nothing.
That would be a real shame.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Extensions and custom_variable_classes (was: Extensions, patch v20 (bitrot fixes))

From
Itagaki Takahiro
Date:
On Sun, Dec 19, 2010 at 23:45, Robert Haas <robertmhaas@gmail.com> wrote:
> I think I'd still argue for putting
> it in a separate patch.  I think that the value of extensions is first
> and foremost that they can simplify installing, removing, dumping, and
> restoring extensions.  I'd rather see us nail that, and then worry
> about custom_variable_classes as a separate issue, likely for 9.2 or
> beyond.

+1 to split the custom_variable_classes issue. It's a longstanding
TODO item, but EXTENSION is still very useful without the fix.

It's my guess that ExtensionSetCVC() can initialize modules one-by-one
and be called on demand, for example, at SHOW, searching pg_settings,
or assignment to variables with unknown classes.

--
Itagaki Takahiro


Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Now, for people following but not reading the patch, what's in is that
> in order for extensions using custom_variable_classes to work without
> the user having to care about it, I've added an step at backend startup
> time to seqscan pg_extension and update custom_variable_classes from
> this catalog.

I agree with Robert that that is an utterly horrid, broken concept.

Just to point out one concrete problem: the postmaster reads
postgresql.conf too, so it would have to do this as well in order to
parse postgresql.conf correctly.

Please remove it.  If you think of a non-broken way to do this later,
we can revisit the problem then.
        regards, tom lane


Re: Extensions, patch v20 (bitrot fixes)

From
Tom Lane
Date:
Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:
>> Oh, I wasn't aware that Itagaki-san had objected to Tom's proposal.

> I agree that "the default encoding is UTF-8", but it should be
> configurable by the 'encoding' parameter in control files.

Why is it necessary to have such a parameter at all?  AFAICS it just
adds complexity for little if any gain.  Most extension files will
probably be pure ASCII anyway.  Dictionary files are *far* more likely
to contain non-ASCII characters.  If we've gotten along fine with
requiring dictionary files to be UTF8, I can't see any reason why we
can't or shouldn't take the same approach to extension files.

> So, I think no additional complexity will be added even if we
> support a  configurable encoding as the third encoding.

This is nonsense.  The mere existence of the parameter requires code
to support it and user documentation to explain it.
        regards, tom lane


On Sun, Dec 19, 2010 at 11:21 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I agree with Robert that that is an utterly horrid, broken concept.

That's a little stronger than what I said, though the same general idea.

> Just to point out one concrete problem: the postmaster reads
> postgresql.conf too, so it would have to do this as well in order to
> parse postgresql.conf correctly.

This is an even more serious problem than what I was pointing out.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Extensions and custom_variable_classes

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Just to point out one concrete problem: the postmaster reads
> postgresql.conf too, so it would have to do this as well in order to
> parse postgresql.conf correctly.

Well, if I parse you correctly, in my patch, it does. There's another
GUC array where to store invalid placeholders so that we can revisit
later.  That even allows custom_variable_classes not to be parsed first
in the list, even if that's not put into use in the patch.

> Please remove it.  If you think of a non-broken way to do this later,
> we can revisit the problem then.

Sure.  I'm curious about understanding how it's broken and if there's
any will at all to have something better (and what better would be), but
it's now obvious that it's not going to help the extension patch.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Extensions and custom_variable_classes

From
Dimitri Fontaine
Date:
Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:
> +1 to split the custom_variable_classes issue. It's a longstanding
> TODO item, but EXTENSION is still very useful without the fix.
>
> It's my guess that ExtensionSetCVC() can initialize modules one-by-one
> and be called on demand, for example, at SHOW, searching pg_settings,
> or assignment to variables with unknown classes.

Just so that we're on the same line here, if we are to remove
custom_variable_classes then we don't keep any GUC related code into the
extension patch, right?  So that ExtensionSetCVC() and friends disappear
entirely.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Extensions, patch v20 (bitrot fixes)

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Why is it necessary to have such a parameter at all?  AFAICS it just
> adds complexity for little if any gain.  Most extension files will
> probably be pure ASCII anyway.  Dictionary files are *far* more likely
> to contain non-ASCII characters.  If we've gotten along fine with
> requiring dictionary files to be UTF8, I can't see any reason why we
> can't or shouldn't take the same approach to extension files.

Don't forget that extensions are not just contrib or third party Open
Source software, but a lot of in-house code, mostly functions written in
SQL and PLpgSQL.  In non-English speaking countries, the parameter names
and comments are typically not written in English.

When we're talking Japan they have some quite specifics needs and I came
to understand that the database encoding and the script encoding are not
to be the same, usually.  So I still vote for handling this parameter.

>> So, I think no additional complexity will be added even if we
>> support a  configurable encoding as the third encoding.
>
> This is nonsense.  The mere existence of the parameter requires code
> to support it and user documentation to explain it.

The whole documentation needs to be:
    <varlistentry>     <term><replaceable class="parameter">encoding</replaceable></term>     <listitem>      <para>
  The <literal>encoding</literal> in which the script file is read.      </para>     </listitem>    </varlistentry>
 

The code to support that is on the order of 25 lines of code, once we
get rid of the SQL command level support for it.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Extensions, patch v20 (bitrot fixes)

From
Itagaki Takahiro
Date:
On Mon, Dec 20, 2010 at 01:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I agree that "the default encoding is UTF-8", but it should be
>> configurable by the 'encoding' parameter in control files.
>
> Why is it necessary to have such a parameter at all?

UTF-8 is not a superset of all encodings.

-- 
Itagaki Takahiro


Re: Extensions and custom_variable_classes

From
Robert Haas
Date:
On Sun, Dec 19, 2010 at 1:35 PM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Just to point out one concrete problem: the postmaster reads
>> postgresql.conf too, so it would have to do this as well in order to
>> parse postgresql.conf correctly.
>
> Well, if I parse you correctly, in my patch, it does. There's another
> GUC array where to store invalid placeholders so that we can revisit
> later.  That even allows custom_variable_classes not to be parsed first
> in the list, even if that's not put into use in the patch.

I bet it doesn't.  The *postmaster* never connects to a database, so
which copy of pg_extension does it ever read?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Extensions and custom_variable_classes

From
Itagaki Takahiro
Date:
On Mon, Dec 20, 2010 at 03:39, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Just so that we're on the same line here, if we are to remove
> custom_variable_classes then we don't keep any GUC related code into the
> extension patch, right?  So that ExtensionSetCVC() and friends disappear
> entirely.

I think so. It would be better to remove the CVC support and related code.

Preloading modules that defines CVC is a good direction to fix the issue,
but we need more consideration about where to do it.

--
Itagaki Takahiro


Re: Extensions and custom_variable_classes

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I bet it doesn't.  The *postmaster* never connects to a database, so
> which copy of pg_extension does it ever read?

None, which does it need to read? My answer is none, you're saying it's
wrong, I don't get why. postmaster surely has no business with what's in
a specific database and no use at all of placeholder GUCs, right?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Extensions and custom_variable_classes

From
Dimitri Fontaine
Date:
Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:
> I think so. It would be better to remove the CVC support and related code.

Will isolate that into another branch just in case and prepare a patch
with that removed.

> Preloading modules that defines CVC is a good direction to fix the issue,
> but we need more consideration about where to do it.

I checked for doing that too, but both shared_preload_libraries and
local_preload_libraries are managed way earlier than the point where we
can connect to a database a check what extensions it contains.  So that
appeared to me as a dead-end.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Extensions and custom_variable_classes

From
Robert Haas
Date:
On Mon, Dec 20, 2010 at 3:56 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I bet it doesn't.  The *postmaster* never connects to a database, so
>> which copy of pg_extension does it ever read?
>
> None, which does it need to read? My answer is none, you're saying it's
> wrong, I don't get why. postmaster surely has no business with what's in
> a specific database and no use at all of placeholder GUCs, right?

Let's take a concrete example.  Suppose the user installs the
extension 'pg_stat_statements' and puts the following into their
postgresql.conf:

pg_stat_statements.max = 31415;

The effect of that has to be that the postmaster adds a certain amount
of space to PostgreSQL's initial shared memory allocation.  That means
the postmaster has to know that pg_stat_statements is a valid custom
variable class.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Extensions and custom_variable_classes

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> The effect of that has to be that the postmaster adds a certain amount
> of space to PostgreSQL's initial shared memory allocation.  That means
> the postmaster has to know that pg_stat_statements is a valid custom
> variable class.

Ah. Yes. Indeed. So you still needed to edit postgresql.conf in such
cases. Well there's only 1 contrib module that touches SHM, but is also
happen to be the only one that needs custom_variable_classes support…

Meanwhile, the custom_variable_classes related code has been removed
from the extension's git branch, and the WITH ENCODING option is no more
(the control file encoding is still there and defaults to UTF-8).

Will continue collecting improvements ideas and cleanup needs as we go,
unless you prefer to see new patches (they are easy enough to produce).
 http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=shortlog;h=refs/heads/extension

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Extensions and custom_variable_classes

From
Robert Haas
Date:
On Mon, Dec 20, 2010 at 11:36 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> The effect of that has to be that the postmaster adds a certain amount
>> of space to PostgreSQL's initial shared memory allocation.  That means
>> the postmaster has to know that pg_stat_statements is a valid custom
>> variable class.
>
> Ah. Yes. Indeed. So you still needed to edit postgresql.conf in such
> cases. Well there's only 1 contrib module that touches SHM, but is also
> happen to be the only one that needs custom_variable_classes support…
>
> Meanwhile, the custom_variable_classes related code has been removed
> from the extension's git branch, and the WITH ENCODING option is no more
> (the control file encoding is still there and defaults to UTF-8).
>
> Will continue collecting improvements ideas and cleanup needs as we go,
> unless you prefer to see new patches (they are easy enough to produce).
>
>  http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=shortlog;h=refs/heads/extension

Patches are better for me, anyway...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Extensions and custom_variable_classes

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Patches are better for me, anyway...

Here it is then, version 21. Changes:

 - docs cleanup
 - downgrade a debug message from NOTICE to DEBUG1
 - get rid of custom_variable_classes
 - get rid of CREATE EXTENSION WITH ENCODING option
 - some context for WITH NO DATA option docs

The docs are updated in their HTML form too, if that's easier to read:

  http://pgsql.tapoueh.org/extensions/doc/html/extend-extension.html

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Attachment

Re: Extensions and custom_variable_classes

From
Alvaro Herrera
Date:
Excerpts from Dimitri Fontaine's message of lun dic 20 14:25:14 -0300 2010:
> Robert Haas <robertmhaas@gmail.com> writes:
> > Patches are better for me, anyway...
> 
> Here it is then, version 21. Changes:

Just noticed a small problem: you're removing the "SET search_path"
lines in contrib Makefiles but you're leaving the "Adjust this setting
to control where the objects get created" line behind, which should be
removed as well.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Extensions, patch v20 (bitrot fixes)

From
Martijn van Oosterhout
Date:
On Mon, Dec 20, 2010 at 09:03:56AM +0900, Itagaki Takahiro wrote:
> On Mon, Dec 20, 2010 at 01:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> I agree that "the default encoding is UTF-8", but it should be
> >> configurable by the 'encoding' parameter in control files.
> >
> > Why is it necessary to have such a parameter at all?
>
> UTF-8 is not a superset of all encodings.

I think you mean Unicode is not a superset of all character sets. I've
heard this before but never found what's missing. [citation needed]?

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
>                                       - Charles de Gaulle

Re: Extensions, patch v20 (bitrot fixes)

From
David Fetter
Date:
On Mon, Dec 20, 2010 at 08:01:42PM +0100, Martijn van Oosterhout wrote:
> On Mon, Dec 20, 2010 at 09:03:56AM +0900, Itagaki Takahiro wrote:
> > On Mon, Dec 20, 2010 at 01:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >> I agree that "the default encoding is UTF-8", but it should be
> > >> configurable by the 'encoding' parameter in control files.
> > >
> > > Why is it necessary to have such a parameter at all?
> > 
> > UTF-8 is not a superset of all encodings.
> 
> I think you mean Unicode is not a superset of all character sets. I've
> heard this before but never found what's missing. [citation needed]?

Windows-1252, ISO-2022-JP-2 and EUC-TW are such encodings.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: Extensions, patch v20 (bitrot fixes)

From
Tom Lane
Date:
David Fetter <david@fetter.org> writes:
> On Mon, Dec 20, 2010 at 08:01:42PM +0100, Martijn van Oosterhout wrote:
>> I think you mean Unicode is not a superset of all character sets. I've
>> heard this before but never found what's missing. [citation needed]?

> Windows-1252, ISO-2022-JP-2 and EUC-TW are such encodings.

[citation needed]?  Exactly what characters are missing, and why would
the Unicode people have chosen to leave them out?  It's not like they've
not heard of those encodings, I'm sure.
        regards, tom lane


Re: Extensions, patch v20 (bitrot fixes)

From
Kenneth Marshall
Date:
On Mon, Dec 20, 2010 at 02:10:39PM -0500, Tom Lane wrote:
> David Fetter <david@fetter.org> writes:
> > On Mon, Dec 20, 2010 at 08:01:42PM +0100, Martijn van Oosterhout wrote:
> >> I think you mean Unicode is not a superset of all character sets. I've
> >> heard this before but never found what's missing. [citation needed]?
> 
> > Windows-1252, ISO-2022-JP-2 and EUC-TW are such encodings.
> 
> [citation needed]?  Exactly what characters are missing, and why would
> the Unicode people have chosen to leave them out?  It's not like they've
> not heard of those encodings, I'm sure.
> 
>             regards, tom lane
> 

Here is an interesting description of some of the gotchas:

http://en.wikipedia.org/wiki/Windows-1252

Regards,
Ken


Re: Extensions, patch v20 (bitrot fixes)

From
"David E. Wheeler"
Date:
On Dec 20, 2010, at 11:53 AM, Kenneth Marshall wrote:

> Here is an interesting description of some of the gotchas:
>
> http://en.wikipedia.org/wiki/Windows-1252

FWIW, those are gotchas translating between Windows 1252 and Latin-1. Windows 1252's nerbles translate to UTF-8 just
fine.

David



Re: Extensions, patch v20 (bitrot fixes)

From
Tom Lane
Date:
Kenneth Marshall <ktm@rice.edu> writes:
> On Mon, Dec 20, 2010 at 02:10:39PM -0500, Tom Lane wrote:
>> [citation needed]?  Exactly what characters are missing, and why would
>> the Unicode people have chosen to leave them out?  It's not like they've
>> not heard of those encodings, I'm sure.

> Here is an interesting description of some of the gotchas:
> http://en.wikipedia.org/wiki/Windows-1252

Well, it's interesting, but I see no glyphs on that page that lack
Unicode assignments.
        regards, tom lane


Re: Extensions, patch v20 (bitrot fixes)

From
Kenneth Marshall
Date:
On Mon, Dec 20, 2010 at 03:08:48PM -0500, Tom Lane wrote:
> Kenneth Marshall <ktm@rice.edu> writes:
> > On Mon, Dec 20, 2010 at 02:10:39PM -0500, Tom Lane wrote:
> >> [citation needed]?  Exactly what characters are missing, and why would
> >> the Unicode people have chosen to leave them out?  It's not like they've
> >> not heard of those encodings, I'm sure.
> 
> > Here is an interesting description of some of the gotchas:
> > http://en.wikipedia.org/wiki/Windows-1252
> 
> Well, it's interesting, but I see no glyphs on that page that lack
> Unicode assignments.
> 
>             regards, tom lane
> 
You are correct. I mis-read the text.

Regards,
Ken


Re: Extensions, patch v20 (bitrot fixes)

From
Nicolas Barbier
Date:
2010/12/20 Martijn van Oosterhout <kleptog@svana.org>:

> On Mon, Dec 20, 2010 at 09:03:56AM +0900, Itagaki Takahiro wrote:
>
>> UTF-8 is not a superset of all encodings.
>
> I think you mean Unicode is not a superset of all character sets. I've
> heard this before but never found what's missing. [citation needed]?

From <URL:http://en.wikipedia.org/wiki/Japanese_language_and_computers#Character_encodings>:

"Unicode is supposed to solve all encoding problems in all languages
of the world. [..] There are still controversies. For Japanese, the
kanji characters have been unified with Chinese, that is a character
considered to be the same in both Japanese and Chinese have been given
one and the same code number in Unicode, even if they look a little
different. This process, called Han unification, has caused
controversy."

For examples (my browser doesn't show any differences though, probably
because I don't have the corresponding fonts):

<URL:http://en.wikipedia.org/wiki/Han_unification#Examples_of_language_dependent_characters>

Nicolas


Re: Extensions, patch v20 (bitrot fixes)

From
Martijn van Oosterhout
Date:
On Mon, Dec 20, 2010 at 10:15:56PM +0100, Nicolas Barbier wrote:
> >From <URL:http://en.wikipedia.org/wiki/Japanese_language_and_computers#Character_encodings>:
>
> "Unicode is supposed to solve all encoding problems in all languages
> of the world. [..] There are still controversies. For Japanese, the
> kanji characters have been unified with Chinese, that is a character
> considered to be the same in both Japanese and Chinese have been given
> one and the same code number in Unicode, even if they look a little
> different. This process, called Han unification, has caused
> controversy."

From http://en.wikipedia.org/wiki/CJK_Unified_Ideographs:

"However, the source separation rule states that characters encoded
separately in an earlier character set would remain separate in the new
Unicode encoding."

From all the references I've seen this has been applied everywhere and
any failures to roundtrip conversions are considered bugs and I can't
believe that at this point they havn't all been fixed. This is kind of
underscored by the fact that references always point to theoretical
problems rather than actual lists of characters that can't be
converted.

ISTM that since all the mapping tables are public it should be a SMOP
to *prove* roundtrip conversions are safe, or identify the problems.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
>                                       - Charles de Gaulle

Re: Extensions, patch v20 (bitrot fixes)

From
Itagaki Takahiro
Date:
On Tue, Dec 21, 2010 at 08:04, Martijn van Oosterhout <kleptog@svana.org> wrote:
> On Mon, Dec 20, 2010 at 10:15:56PM +0100, Nicolas Barbier wrote:
>> >From <URL:http://en.wikipedia.org/wiki/Japanese_language_and_computers#Character_encodings>:
> ISTM that since all the mapping tables are public it should be a SMOP
> to *prove* roundtrip conversions are safe, or identify the problems.

Another issue in Japanese users is EUDC (End User Defined Character).
Unfortunately for both postgres developers and application developers
in Japan, many machine dependence characters are still used in popular
mobile phones in Japan. Their native encoding is SHIFT_JIS, and we
have an EUDC mapping for SHIFT_JIS to/from EUC_JP. But we don't have
for UTF-8 to/from other encodings. That is one of the reasons why we
cannot move to the UTF-8 world completely.

Imagine that a module that manipulate EUDC text. It will be written
in EUC_JP because SHIFT_JIS is not supported in postgres. Also, it
cannot be rewritten in UTF-8 because there are no mapping for the
characters used in it.

-- 
Itagaki Takahiro