Thread: Updating column on row update

Updating column on row update

From
Thom Brown
Date:
Hi,

This should be simple, but for some reason I'm not quite sure what the solution is.  I want to be able to update the value of a column for rows that have been updated.  More specifically, if a row is updated, I want it's modified_date column to be populated with the current time stamp.  I've looked at triggers and rules, and it looks like I'd need to create a function just to achieve this which seems incredibly clumsy and unnecessary.  Could someone enlighten me?

Thanks

Thom

Re: Updating column on row update

From
Thom Brown
Date:
2009/11/22 Aaron Burnett <aburnett@bzzagent.com>

this is how I do it if this helps:

column_name timestamp without time zone NOT NULL DEFAULT ('now'::text)::timestamp(6) without time zone


Hi Aaron.  Thanks for the reply, but that would only insert the current date upon insertion into the table, not when the row is updated.

For example

CREATE TABLE timetest(
id SERIAL NOT NULL,
stuff text,
stamp timestamp NOT NULL DEFAULT now()
);

INSERT INTO timetest (stuff) VALUES ('meow');

The table would contain:

 id | stuff |           stamp            
----+-------+----------------------------
  1 | meow  | 2009-11-22 20:04:51.261739

But then I'd execute:

UPDATE timetest SET stuff = 'bark' WHERE id = 1;

 id | stuff |           stamp            
----+-------+----------------------------
  1 | bark  | 2009-11-22 20:04:51.261739

You can see the time hasn't changed.  But I'd want that stamp column to update to the current time without referring to that column in the update statement.

Thanks

Thom

Re: Updating column on row update

From
"Aaron Burnett"
Date:

this is how I do it if this helps:

column_name timestamp without time zone NOT NULL DEFAULT ('now'::text)::timestamp(6) without time zone



-----Original Message-----
From: pgsql-general-owner@postgresql.org on behalf of Thom Brown
Sent: Sun 11/22/2009 2:50 PM
To: PGSQL Mailing List
Subject: [GENERAL] Updating column on row update

Hi,

This should be simple, but for some reason I'm not quite sure what the
solution is.  I want to be able to update the value of a column for rows
that have been updated.  More specifically, if a row is updated, I want it's
modified_date column to be populated with the current time stamp.  I've
looked at triggers and rules, and it looks like I'd need to create a
function just to achieve this which seems incredibly clumsy and unnecessary.
 Could someone enlighten me?

Thanks

Thom

Re: Updating column on row update

From
Scott Marlowe
Date:
On Sun, Nov 22, 2009 at 12:50 PM, Thom Brown <thombrown@gmail.com> wrote:
> Hi,
> This should be simple, but for some reason I'm not quite sure what the
> solution is.  I want to be able to update the value of a column for rows
> that have been updated.  More specifically, if a row is updated, I want it's
> modified_date column to be populated with the current time stamp.  I've
> looked at triggers and rules, and it looks like I'd need to create a
> function just to achieve this which seems incredibly clumsy and unnecessary.
>  Could someone enlighten me?

Well, you DO have to create a function, but it's not all that clumsy
really.   Also it's quite flexible so you can do lots of complex stuff
and hide it away in a trigger function.

Example:

-- FUNCTION --

CREATE FUNCTION modtime () RETURNS opaque AS '
    BEGIN
        new.lm :=''now'';
        RETURN new;
    END;
' LANGUAGE 'plpgsql';

-- TABLE --

CREATE TABLE dtest (
    id int primary key,
    fluff text,
    lm timestamp without time zone
);


--TRIGGER --

CREATE TRIGGER dtest
  BEFORE UPDATE or INSERT ON dtest FOR EACH ROW EXECUTE PROCEDURE
    modtime(lm);

-- SQL TESTS --

INSERT INTO dtest (id, fluff) VALUES (1,'this is a test');
INSERT INTO dtest (id, fluff) VALUES (2,'this is another test');
SELECT * FROM dtest;
  1 | this is a test       | 2003-04-02 10:33:12.577089
  2 | this is another test | 2003-04-02 10:33:18.591148
UPDATE dtest SET id=3 WHERE id=1;
  3 | this is a test | 2003-04-02 10:34:52.219963  [1]
UPDATE dtest SET fluff='now is the time' WHERE id=2;
SELECT * FROM dtest WHERE id=2;
  2 | now is the time | 2003-04-02 10:38:06.259443 [2]
UPDATE dtest SET lm='2003-04-02 08:30:00' WHERE id=3;
SELECT * FROM dtest WHERE id=3;
  3 | this is a test | 2003-04-02 10:36:15.45687 [3]

[1] The timestamp has changed for this record when we changed the id field.
[2] The timestamp also changes for the fluff field.
[3] We tried to set lm, but the trigger on that field in dtest
intercepted the change and forced it

Re: Updating column on row update

From
Adrian Klaver
Date:
On Sunday 22 November 2009 12:09:04 pm Thom Brown wrote:
> 2009/11/22 Aaron Burnett <aburnett@bzzagent.com>
>
> > this is how I do it if this helps:
> >
> > column_name timestamp without time zone NOT NULL DEFAULT
> > ('now'::text)::timestamp(6) without time zone
> >
> > Hi Aaron.  Thanks for the reply, but that would only insert the current
>
> date upon insertion into the table, not when the row is updated.
>
> For example
>
> CREATE TABLE timetest(
> id SERIAL NOT NULL,
> stuff text,
> stamp timestamp NOT NULL DEFAULT now()
> );
>
> INSERT INTO timetest (stuff) VALUES ('meow');
>
> The table would contain:
>
>  id | stuff |           stamp
> ----+-------+----------------------------
>   1 | meow  | 2009-11-22 20:04:51.261739
>
> But then I'd execute:
>
> UPDATE timetest SET stuff = 'bark' WHERE id = 1;
>
>  id | stuff |           stamp
> ----+-------+----------------------------
>   1 | bark  | 2009-11-22 20:04:51.261739
>
> You can see the time hasn't changed.  But I'd want that stamp column to
> update to the current time without referring to that column in the update
> statement.
>
> Thanks
>
> Thom

You will need to use an UPDATE trigger with associated function.

--
Adrian Klaver
aklaver@comcast.net

Re: Updating column on row update

From
Thom Brown
Date:
2009/11/22 Scott Marlowe <scott.marlowe@gmail.com>
On Sun, Nov 22, 2009 at 12:50 PM, Thom Brown <thombrown@gmail.com> wrote:
> Hi,
> This should be simple, but for some reason I'm not quite sure what the
> solution is.  I want to be able to update the value of a column for rows
> that have been updated.  More specifically, if a row is updated, I want it's
> modified_date column to be populated with the current time stamp.  I've
> looked at triggers and rules, and it looks like I'd need to create a
> function just to achieve this which seems incredibly clumsy and unnecessary.
>  Could someone enlighten me?

Well, you DO have to create a function, but it's not all that clumsy
really.   Also it's quite flexible so you can do lots of complex stuff
and hide it away in a trigger function.

Example:

-- FUNCTION --

CREATE FUNCTION modtime () RETURNS opaque AS '
   BEGIN
       new.lm :=''now'';
       RETURN new;
   END;
' LANGUAGE 'plpgsql';

-- TABLE --

CREATE TABLE dtest (
   id int primary key,
   fluff text,
   lm timestamp without time zone
);


--TRIGGER --

CREATE TRIGGER dtest
 BEFORE UPDATE or INSERT ON dtest FOR EACH ROW EXECUTE PROCEDURE
   modtime(lm);

-- SQL TESTS --

INSERT INTO dtest (id, fluff) VALUES (1,'this is a test');
INSERT INTO dtest (id, fluff) VALUES (2,'this is another test');
SELECT * FROM dtest;
 1 | this is a test       | 2003-04-02 10:33:12.577089
 2 | this is another test | 2003-04-02 10:33:18.591148
UPDATE dtest SET id=3 WHERE id=1;
 3 | this is a test | 2003-04-02 10:34:52.219963  [1]
UPDATE dtest SET fluff='now is the time' WHERE id=2;
SELECT * FROM dtest WHERE id=2;
 2 | now is the time | 2003-04-02 10:38:06.259443 [2]
UPDATE dtest SET lm='2003-04-02 08:30:00' WHERE id=3;
SELECT * FROM dtest WHERE id=3;
 3 | this is a test | 2003-04-02 10:36:15.45687 [3]

[1] The timestamp has changed for this record when we changed the id field.
[2] The timestamp also changes for the fluff field.
[3] We tried to set lm, but the trigger on that field in dtest
intercepted the change and forced it

Thanks Scott.  It's a shame a function has to be used because it then has the dependency of plpgsql being loaded.  I'm attempting to write a database schema to accompany a PostgreSQL driver for a popular CMS, but I guess I could get it to load plpgsql in as a language.

The problem now is if the the schema creation script is run against a database where the language is already installed, I would get an error saying it already exists.  Is there a way to get it to check for it first, and only create it if it isn't exist?  Bear in mind I'd want this to be compatible at least as far back as 8.1.

Thanks

Thom

Re: Updating column on row update

From
Scott Marlowe
Date:
On Sun, Nov 22, 2009 at 1:32 PM, Thom Brown <thombrown@gmail.com> wrote:
> 2009/11/22 Scott Marlowe <scott.marlowe@gmail.com>
>>
>> On Sun, Nov 22, 2009 at 12:50 PM, Thom Brown <thombrown@gmail.com> wrote:
>> > Hi,
>> > This should be simple, but for some reason I'm not quite sure what the
>> > solution is.  I want to be able to update the value of a column for rows
>> > that have been updated.  More specifically, if a row is updated, I want
>> > it's
>> > modified_date column to be populated with the current time stamp.  I've
>> > looked at triggers and rules, and it looks like I'd need to create a
>> > function just to achieve this which seems incredibly clumsy and
>> > unnecessary.
>> >  Could someone enlighten me?
>>
>> Well, you DO have to create a function, but it's not all that clumsy
>> really.   Also it's quite flexible so you can do lots of complex stuff
>> and hide it away in a trigger function.
>>
>> Example:
>>
>> -- FUNCTION --
>>
>> CREATE FUNCTION modtime () RETURNS opaque AS '
>>    BEGIN
>>        new.lm :=''now'';
>>        RETURN new;
>>    END;
>> ' LANGUAGE 'plpgsql';
>>
>> -- TABLE --
>>
>> CREATE TABLE dtest (
>>    id int primary key,
>>    fluff text,
>>    lm timestamp without time zone
>> );
>>
>>
>> --TRIGGER --
>>
>> CREATE TRIGGER dtest
>>  BEFORE UPDATE or INSERT ON dtest FOR EACH ROW EXECUTE PROCEDURE
>>    modtime(lm);
>>
>> -- SQL TESTS --
>>
>> INSERT INTO dtest (id, fluff) VALUES (1,'this is a test');
>> INSERT INTO dtest (id, fluff) VALUES (2,'this is another test');
>> SELECT * FROM dtest;
>>  1 | this is a test       | 2003-04-02 10:33:12.577089
>>  2 | this is another test | 2003-04-02 10:33:18.591148
>> UPDATE dtest SET id=3 WHERE id=1;
>>  3 | this is a test | 2003-04-02 10:34:52.219963  [1]
>> UPDATE dtest SET fluff='now is the time' WHERE id=2;
>> SELECT * FROM dtest WHERE id=2;
>>  2 | now is the time | 2003-04-02 10:38:06.259443 [2]
>> UPDATE dtest SET lm='2003-04-02 08:30:00' WHERE id=3;
>> SELECT * FROM dtest WHERE id=3;
>>  3 | this is a test | 2003-04-02 10:36:15.45687 [3]
>>
>> [1] The timestamp has changed for this record when we changed the id
>> field.
>> [2] The timestamp also changes for the fluff field.
>> [3] We tried to set lm, but the trigger on that field in dtest
>> intercepted the change and forced it
>
> Thanks Scott.  It's a shame a function has to be used because it then has
> the dependency of plpgsql being loaded.  I'm attempting to write a database
> schema to accompany a PostgreSQL driver for a popular CMS, but I guess I
> could get it to load plpgsql in as a language.
> The problem now is if the the schema creation script is run against a
> database where the language is already installed, I would get an error
> saying it already exists.  Is there a way to get it to check for it first,
> and only create it if it isn't exist?  Bear in mind I'd want this to be
> compatible at least as far back as 8.1.

Try this:

select * from pg_language ;

Pretty sure that exists pretty far back.

Re: Updating column on row update

From
Scott Marlowe
Date:
On Sun, Nov 22, 2009 at 1:32 PM, Thom Brown <thombrown@gmail.com> wrote:
> Thanks Scott.  It's a shame a function has to be used because it then has
> the dependency of plpgsql being loaded.  I'm attempting to write a database
> schema to accompany a PostgreSQL driver for a popular CMS, but I guess I
> could get it to load plpgsql in as a language.

Nice thing about plpgsql is that as long as you own the database you
don't have to be a superuser to create language plgpsql.

Re: Updating column on row update

From
Thom Brown
Date:
2009/11/22 Scott Marlowe <scott.marlowe@gmail.com>
> Thanks Scott.  It's a shame a function has to be used because it then has
> the dependency of plpgsql being loaded.  I'm attempting to write a database
> schema to accompany a PostgreSQL driver for a popular CMS, but I guess I
> could get it to load plpgsql in as a language.
> The problem now is if the the schema creation script is run against a
> database where the language is already installed, I would get an error
> saying it already exists.  Is there a way to get it to check for it first,
> and only create it if it isn't exist?  Bear in mind I'd want this to be
> compatible at least as far back as 8.1.

Try this:

select * from pg_language ;

Pretty sure that exists pretty far back.

Yes, I noticed that existed in the catalogs, but how could that be incorporated into an installation SQL script?  The language constructs I imagine I'd need to test that are in plpgsql itself.

Thanks

Thom

Re: Updating column on row update

From
Adrian Klaver
Date:
On Sunday 22 November 2009 1:10:36 pm Thom Brown wrote:
> 2009/11/22 Scott Marlowe <scott.marlowe@gmail.com>
>
> > > Thanks Scott.  It's a shame a function has to be used because it then
> > > has the dependency of plpgsql being loaded.  I'm attempting to write a
> >
> > database
> >
> > > schema to accompany a PostgreSQL driver for a popular CMS, but I guess
> > > I could get it to load plpgsql in as a language.
> > > The problem now is if the the schema creation script is run against a
> > > database where the language is already installed, I would get an error
> > > saying it already exists.  Is there a way to get it to check for it
> >
> > first,
> >
> > > and only create it if it isn't exist?  Bear in mind I'd want this to be
> > > compatible at least as far back as 8.1.
> >
> > Try this:
> >
> > select * from pg_language ;
> >
> > Pretty sure that exists pretty far back.
>
> Yes, I noticed that existed in the catalogs, but how could that be
> incorporated into an installation SQL script?  The language constructs I
> imagine I'd need to test that are in plpgsql itself.
>
> Thanks
>
> Thom

As far as I know the language exists ERROR will not stop the rest of the process
from completing. You may in search of solution to a problem that does not
exist :)

--
Adrian Klaver
aklaver@comcast.net

Re: Updating column on row update

From
Christophe Pettus
Date:
David Fetter and Andreas Scherbaum also have solutions for this in
reployment scripts:

    http://people.planetpostgresql.org/dfetter/index.php?/archives/23-CREATE-OR-REPLACE-LANGUAGE.html
    http://andreas.scherbaum.la/blog/archives/346-create-language-if-not-exist.html
--
-- Christophe Pettus
    xof@thebuild.com


Re: Updating column on row update

From
Thom Brown
Date:
2009/11/22 Christophe Pettus <xof@thebuild.com>
David Fetter and Andreas Scherbaum also have solutions for this in reployment scripts:

       http://people.planetpostgresql.org/dfetter/index.php?/archives/23-CREATE-OR-REPLACE-LANGUAGE.html
       http://andreas.scherbaum.la/blog/archives/346-create-language-if-not-exist.html
--
-- Christophe Pettus
  xof@thebuild.com


Ah, I think that will work.  Thanks :)

Thom 

Re: Updating column on row update

From
Craig Ringer
Date:
On 23/11/2009 4:15 AM, Scott Marlowe wrote:
> On Sun, Nov 22, 2009 at 12:50 PM, Thom Brown <thombrown@gmail.com> wrote:
>> Hi,
>> This should be simple, but for some reason I'm not quite sure what the
>> solution is.  I want to be able to update the value of a column for rows
>> that have been updated.  More specifically, if a row is updated, I want it's
>> modified_date column to be populated with the current time stamp.  I've
>> looked at triggers and rules, and it looks like I'd need to create a
>> function just to achieve this which seems incredibly clumsy and unnecessary.
>>  Could someone enlighten me?
>
> Well, you DO have to create a function, but it's not all that clumsy
> really.   Also it's quite flexible so you can do lots of complex stuff
> and hide it away in a trigger function.

I do think this comes up often enough that a built-in trigger "update
named column with result of expression on insert" trigger might be
desirable. Especially if implemented in C to avoid the need for PL/PgSQL
and to reduce the CPU cost a smidge.

Hmm. CC'iing -hackers; there was a discussion earlier on it being
desirable to have more "[EASY]" TODO items, and this might be a good one
for the job.

So might "CREATE LANGUAGE ... IF NOT EXISTS". Maybe even "CREATE ROLE
... IF NOT EXISTS" and "CREATE USER ... IF NOT EXISTS" - I know I'd find
them really handy.

--
Craig Ringer

Re: Updating column on row update

From
Thom Brown
Date:
2009/11/22 Craig Ringer <craig@postnewspapers.com.au>
On 23/11/2009 4:15 AM, Scott Marlowe wrote:
> On Sun, Nov 22, 2009 at 12:50 PM, Thom Brown <thombrown@gmail.com> wrote:
>> Hi,
>> This should be simple, but for some reason I'm not quite sure what the
>> solution is.  I want to be able to update the value of a column for rows
>> that have been updated.  More specifically, if a row is updated, I want it's
>> modified_date column to be populated with the current time stamp.  I've
>> looked at triggers and rules, and it looks like I'd need to create a
>> function just to achieve this which seems incredibly clumsy and unnecessary.
>>  Could someone enlighten me?
>
> Well, you DO have to create a function, but it's not all that clumsy
> really.   Also it's quite flexible so you can do lots of complex stuff
> and hide it away in a trigger function.

I do think this comes up often enough that a built-in trigger "update
named column with result of expression on insert" trigger might be
desirable. Especially if implemented in C to avoid the need for PL/PgSQL
and to reduce the CPU cost a smidge.

Hmm. CC'iing -hackers; there was a discussion earlier on it being
desirable to have more "[EASY]" TODO items, and this might be a good one
for the job.

So might "CREATE LANGUAGE ... IF NOT EXISTS". Maybe even "CREATE ROLE
... IF NOT EXISTS" and "CREATE USER ... IF NOT EXISTS" - I know I'd find
them really handy.

--
Craig Ringer

I would have thought the IF NOT EXISTS syntax could be handy on every CREATE command and I wouldn't object to such a thing being implemented in future.

But my reason for the column updating on row update was due to me converting a MySQL script to PostgreSQL.  MySQL had the following syntax available:

`updated_date` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP

That's effectively what I'm emulating.  I think that syntax is actually quite useful.  Another thing I found useful was having COMMENT available on the same line as the column declaration too.  An example of this is: 

`parent_id` integer unsigned NOT NULL default '0' COMMENT 'The parent menu item in the menu tree.',

I really can't think of any other syntactic sugar I'd want from MySQL though.

Thom

Re: Updating column on row update

From
Tom Lane
Date:
Craig Ringer <craig@postnewspapers.com.au> writes:
> I do think this comes up often enough that a built-in trigger "update
> named column with result of expression on insert" trigger might be
> desirable.

There's something of the sort in contrib already, I believe, though
it's so old it still uses abstime :-(

> So might "CREATE LANGUAGE ... IF NOT EXISTS". Maybe even "CREATE ROLE
> ... IF NOT EXISTS" and "CREATE USER ... IF NOT EXISTS" - I know I'd find
> them really handy.

CREATE IF NOT EXISTS has been proposed and rejected before, more than
once.  Please see the archives.

            regards, tom lane

Re: Updating column on row update

From
silly8888
Date:
> MySQL had the following syntax available:
> `updated_date` timestamp NOT NULL default CURRENT_TIMESTAMP on update
> CURRENT_TIMESTAMP

I wonder supporting this syntax would speed things up a little bit.
Here's a simple benchmark about the situation we are discussing here:

There are 2 tables:
      CREATE TABLE t1 (n integer not null, mtime timestamp with time
zone not null);
      CREATE TABLE t2 (n integer not null, mtime timestamp with time
zone not null);

and a trigger for the second one:
      CREATE LANGUAGE plpgsql;
      CREATE FUNCTION touch() RETURNS trigger AS $$
         BEGIN
             new.mtime := now();
             RETURN new;
         END;
      $$ LANGUAGE 'plpgsql';
      CREATE TRIGGER ttt_mtime BEFORE UPDATE or INSERT
            ON t2 FOR EACH ROW EXECUTE PROCEDURE touch();

and here's the actual test:

test=> INSERT INTO t1(n,mtime) SELECT *, now() FROM generate_series(1,1000000);
INSERT 0 1000000
Time: 7382.313 ms
test=> INSERT INTO t2(n) SELECT * FROM generate_series(1,1000000);
INSERT 0 1000000
Time: 24541.088 ms

So, updating the column explicitly is 3.5 times faster than the
trigger. My guess is that in real life applications where tables have
"bigger" rows (more columns, data types other than integer), the
overhead of the trigger will be even smaller.

Re: [HACKERS] Updating column on row update

From
Robert Haas
Date:
On Sun, Nov 22, 2009 at 6:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Craig Ringer <craig@postnewspapers.com.au> writes:
>> I do think this comes up often enough that a built-in trigger "update
>> named column with result of expression on insert" trigger might be
>> desirable.
>
> There's something of the sort in contrib already, I believe, though
> it's so old it still uses abstime :-(
>
>> So might "CREATE LANGUAGE ... IF NOT EXISTS". Maybe even "CREATE ROLE
>> ... IF NOT EXISTS" and "CREATE USER ... IF NOT EXISTS" - I know I'd find
>> them really handy.
>
> CREATE IF NOT EXISTS has been proposed and rejected before, more than
> once.  Please see the archives.

Search for CINE to find the discussions.  This is a good place to start:

http://archives.postgresql.org/pgsql-hackers/2009-05/msg00252.php

Despite Tom's assertions to the contrary, I am unable to find a clear
consensus against this feature in the archives, and still think it
would be useful.  MySQL and SQLite both support it, at least in the
specific case of CREATE TABLE IF NOT EXISTS.  But I've exhausted my
quota of beating my head against a brick wall on this issue.

...Robert

Re: [HACKERS] Updating column on row update

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, Nov 22, 2009 at 6:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> CREATE IF NOT EXISTS has been proposed and rejected before, more than
>> once. �Please see the archives.

> Search for CINE to find the discussions.  This is a good place to start:
> http://archives.postgresql.org/pgsql-hackers/2009-05/msg00252.php

> Despite Tom's assertions to the contrary, I am unable to find a clear
> consensus against this feature in the archives,

I think you didn't look back far enough --- that issue was settled years
ago.  IIRC the killer argument is that after CINE you do not know the
state of the object: it exists, yes, but what properties has it got?
If it already existed then it's still got its old definition, which
might or might not be what you're expecting.

CREATE OR REPLACE has got far safer semantics from the viewpoint of a
script that wants to bull through without having any actual error
handling (which is more or less the scenario we are arguing here, no?)
After successful execution of the command you know exactly what
properties the object has got.

Whether it would be sensible to have CREATE OR REPLACE semantics for a
language is something I'm not very clear on.  It seems like changing any
of the properties of a pg_language entry could be rather fatal from the
viewpoint of an existing function using the language.

[ thinks for awhile... ]  Actually, CREATE LANGUAGE is unique among
creation commands in that the common cases have no parameters, at least
not since we added pg_pltemplate.  So you could imagine defining CINE
for a language as disallowing any parameters and having these semantics:
    * language not present -> create from template
    * language present, matches template -> OK, do nothing
    * language present, does not match template -> report error
This would meet the objection of not being sure what the state is
after successful execution of the command.  It doesn't scale to any
other object type, but is it worth doing for this one type?

            regards, tom lane

Re: [HACKERS] Updating column on row update

From
Andrew Dunstan
Date:

Tom Lane wrote:
> [ thinks for awhile... ]  Actually, CREATE LANGUAGE is unique among
> creation commands in that the common cases have no parameters, at least
> not since we added pg_pltemplate.  So you could imagine defining CINE
> for a language as disallowing any parameters and having these semantics:
>     * language not present -> create from template
>     * language present, matches template -> OK, do nothing
>     * language present, does not match template -> report error
> This would meet the objection of not being sure what the state is
> after successful execution of the command.  It doesn't scale to any
> other object type, but is it worth doing for this one type?
>
>



I seriously doubt it. The only reason I could see for such a thing would
be to make it orthogonal with other CINE commands.

Part of the motivation for allowing inline blocks was to allow for
conditional logic. So you can do things like:

    DO $$

    begin
        if not exists (select 1 from pg_tables where schemaname = 'foo'
    and tablename = 'bar') then
           create table foo.bar (x int, y text);
        end if;
    end;

    $$;


It's a bit more verbose (maybe someone can streamline it) but it does
give you CINE (for whatever flavor of CINE you want), as well as lots
more complex possibilities than we can conceivably build into SQL.

cheers

andrew


Re: [HACKERS] Updating column on row update

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Part of the motivation for allowing inline blocks was to allow for
> conditional logic.

I don't think that argument really applies to this case, because the
complaint was about not being sure if plpgsql is installed.  If it
isn't, you can hardly use a plpgsql DO block to fix it.

(Is anyone up for revisiting the perennial topic of whether to install
plpgsql by default?  Andrew's argument does suggest that DO might offer
a new consideration in that tradeoff.)

            regards, tom lane

Re: [HACKERS] Updating column on row update

From
Scott Marlowe
Date:
On Sun, Nov 22, 2009 at 10:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Part of the motivation for allowing inline blocks was to allow for
>> conditional logic.
>
> I don't think that argument really applies to this case, because the
> complaint was about not being sure if plpgsql is installed.  If it
> isn't, you can hardly use a plpgsql DO block to fix it.
>
> (Is anyone up for revisiting the perennial topic of whether to install
> plpgsql by default?  Andrew's argument does suggest that DO might offer
> a new consideration in that tradeoff.)

One non-coding vote for yes.

Re: [HACKERS] Updating column on row update

From
Craig Ringer
Date:
Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Part of the motivation for allowing inline blocks was to allow for
>> conditional logic.
>
> I don't think that argument really applies to this case, because the
> complaint was about not being sure if plpgsql is installed.  If it
> isn't, you can hardly use a plpgsql DO block to fix it.
>
> (Is anyone up for revisiting the perennial topic of whether to install
> plpgsql by default?  Andrew's argument does suggest that DO might offer
> a new consideration in that tradeoff.)

It'd be a HUGE benefit in deployment and update scripts to have PL/PgSQL
 installed and available by default, at least to the superuser and to
the DB owner.

One issue I run into with DB deployment is that a schema often requires
several roles. If the schema has been imported into another (possibly
since-dropped) database in the cluster before, global changes such as
role creations will fail since they've already been done by a prior run.
This makes it necessary to split the schema into global and
database-specific parts or to ignore errors that arise as the SQL is
processed. Neither option lets me reasonably apply a schema update
transactionally. Having PL/PgSQL available right from an initial
connection to `template1' as superuser for a 'CREATE DATABASE' would be
great, as I could use appropriate logic to avoid or handle errors, and
could run schema create/update scripts with ON_ERROR_ROLLBACK .

If CREATE LANGUAGE silently succeeded if the language was already
installed with the same params, perhaps PL/PgSQL could be made available
by default with no impact on existing scripts and apps? Is there any
harm in making it succeed if it need take no action to achieve the
requested state? After all, the end result is as the user requested. Do
they really care whether CREATE LANGUAGE had to  modify the catalogs?


As for CREATE [USER|ROLE] ... IF NOT EXISTS; I was concerned about just
that issue, which is why I was unsure whether it was sane for users and
roles. Being able to easily test for the presence of a user (say, within
a DO block with default-installed PL/PgSQL) would be nicer and safer
than having ... IF EXISTS for users/roles.

--
Craig Ringer

Re: [HACKERS] Updating column on row update

From
Scott Marlowe
Date:
On Sun, Nov 22, 2009 at 10:41 PM, Craig Ringer
<craig@postnewspapers.com.au> wrote:
> Tom Lane wrote:
> It'd be a HUGE benefit in deployment and update scripts to have PL/PgSQL
>  installed and available by default, at least to the superuser and to
> the DB owner.

Are there any known security problems with plpgsql?

Re: [HACKERS] Updating column on row update

From
Thom Brown
Date:
2009/11/23 Tom Lane <tgl@sss.pgh.pa.us>
CREATE OR REPLACE has got far safer semantics from the viewpoint of a
script that wants to bull through without having any actual error
handling (which is more or less the scenario we are arguing here, no?)
After successful execution of the command you know exactly what
properties the object has got.

Whether it would be sensible to have CREATE OR REPLACE semantics for a
language is something I'm not very clear on.  It seems like changing any
of the properties of a pg_language entry could be rather fatal from the
viewpoint of an existing function using the language.

[ thinks for awhile... ]  Actually, CREATE LANGUAGE is unique among
creation commands in that the common cases have no parameters, at least
not since we added pg_pltemplate.  So you could imagine defining CINE
for a language as disallowing any parameters and having these semantics:
       * language not present -> create from template
       * language present, matches template -> OK, do nothing
       * language present, does not match template -> report error
This would meet the objection of not being sure what the state is
after successful execution of the command.  It doesn't scale to any
other object type, but is it worth doing for this one type?

                       regards, tom lane

Actually, I prefer CREATE OR REPLACE over CINE, at least for the majority of the creations, especially since it would be more consistent with what we have for functions.  If there must be an exception for languages, it would make sense from what you describe above.

As for having plpgsql installed by default, are there any security implications?  If not, I can only see it as an advantage.  At the moment we're having to resort to a bit of a hack using a CASE statement in a plain SQL function as mentioned earlier in this thread.

Thom

Re: [HACKERS] Updating column on row update

From
Robert Haas
Date:
On Sun, Nov 22, 2009 at 11:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Sun, Nov 22, 2009 at 6:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> CREATE IF NOT EXISTS has been proposed and rejected before, more than
>>> once.  Please see the archives.
>
>> Search for CINE to find the discussions.  This is a good place to start:
>> http://archives.postgresql.org/pgsql-hackers/2009-05/msg00252.php
>
>> Despite Tom's assertions to the contrary, I am unable to find a clear
>> consensus against this feature in the archives,
>
> I think you didn't look back far enough --- that issue was settled years
> ago.  IIRC the killer argument is that after CINE you do not know the
> state of the object: it exists, yes, but what properties has it got?
> If it already existed then it's still got its old definition, which
> might or might not be what you're expecting.
>
> CREATE OR REPLACE has got far safer semantics from the viewpoint of a
> script that wants to bull through without having any actual error
> handling (which is more or less the scenario we are arguing here, no?)
> After successful execution of the command you know exactly what
> properties the object has got.

Sure.  I think that CINE only makes sense for objects for which COR
can't be implemented - things that have internal substructure, like
tables or tablespaces.  I agree that there are pitfalls for the unwary
but I still think it's better than nothing.  I understand that you
disagree.

> Whether it would be sensible to have CREATE OR REPLACE semantics for a
> language is something I'm not very clear on.  It seems like changing any
> of the properties of a pg_language entry could be rather fatal from the
> viewpoint of an existing function using the language.
>
> [ thinks for awhile... ]  Actually, CREATE LANGUAGE is unique among
> creation commands in that the common cases have no parameters, at least
> not since we added pg_pltemplate.  So you could imagine defining CINE
> for a language as disallowing any parameters and having these semantics:
>        * language not present -> create from template
>        * language present, matches template -> OK, do nothing
>        * language present, does not match template -> report error
> This would meet the objection of not being sure what the state is
> after successful execution of the command.  It doesn't scale to any
> other object type, but is it worth doing for this one type?

CREATE OR REPLACE seems like a better fit in this case.  For example,
it seems plausible that someone might want to add an inline handler to
a procedural language that didn't have one without dropping and
recreating the language.  Even changing the call handler seems like it
could be potentially useful in an upgrade scenario.

...Robert

Re: [HACKERS] Updating column on row update

From
Tom Lane
Date:
Thom Brown <thombrown@gmail.com> writes:
> As for having plpgsql installed by default, are there any security
> implications?

Well, that's pretty much exactly the question --- are there?  It would
certainly make it easier for someone to exploit any other security
weakness they might find.  I believe plain SQL plus SQL functions is
Turing-complete, but that doesn't mean it's easy or fast to write loops
etc in it.

            regards, tom lane

Re: [HACKERS] Updating column on row update

From
Thom Brown
Date:
2009/11/23 Tom Lane <tgl@sss.pgh.pa.us>
Thom Brown <thombrown@gmail.com> writes:
> As for having plpgsql installed by default, are there any security
> implications?

Well, that's pretty much exactly the question --- are there?  It would
certainly make it easier for someone to exploit any other security
weakness they might find.  I believe plain SQL plus SQL functions is
Turing-complete, but that doesn't mean it's easy or fast to write loops
etc in it.

                       regards, tom lane

I personally find it more important to gracefully add plpgsql if it doesn't already exist than to rely on it already being there.  In a way it wouldn't solve this problem as someone could have still removed it.  Other procedural languages could benefit from some sort of check too.

Thom

Re: [HACKERS] Updating column on row update

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 > Thom Brown <thombrown@gmail.com> writes:
 >> As for having plpgsql installed by default, are there any security
 >> implications?

 Tom> Well, that's pretty much exactly the question --- are there?  It
 Tom> would certainly make it easier for someone to exploit any other
 Tom> security weakness they might find.  I believe plain SQL plus SQL
 Tom> functions is Turing-complete, but that doesn't mean it's easy or
 Tom> fast to write loops etc in it.

Now that we have recursive CTEs, plain SQL is turing-complete without
requiring functions.

(Yes, I did actually prove this a while back, by implementing one of
the known-Turing-complete tag system automata as a single recursive
query. This proof is pretty boring, though, because you wouldn't
actually _use_ that approach in practice.)

Loops in plain SQL are no problem: see generate_series. The last time
we discussed this I demonstrated reasonably straightforward SQL
examples of how to do things like password-cracking (and that was long
before we had CTEs, so it would be even easier now); my challenge to
anyone to produce examples of malicious plpgsql code that couldn't be
reproduced in plain SQL went unanswered.

--
Andrew (irc:RhodiumToad)

Re: [HACKERS] Updating column on row update

From
Andrew Dunstan
Date:

Tom Lane wrote:
> Thom Brown <thombrown@gmail.com> writes:
>
>> As for having plpgsql installed by default, are there any security
>> implications?
>>
>
> Well, that's pretty much exactly the question --- are there?  It would
> certainly make it easier for someone to exploit any other security
> weakness they might find.  I believe plain SQL plus SQL functions is
> Turing-complete, but that doesn't mean it's easy or fast to write loops
> etc in it.
>
>
>

That's a bit harder argument to sustain now we have recursive queries, ISTM.

cheers

andrew

Re: [HACKERS] Updating column on row update

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>  Tom> Well, that's pretty much exactly the question --- are there?  It
>  Tom> would certainly make it easier for someone to exploit any other
>  Tom> security weakness they might find.

> Loops in plain SQL are no problem: see generate_series. The last time
> we discussed this I demonstrated reasonably straightforward SQL
> examples of how to do things like password-cracking (and that was long
> before we had CTEs, so it would be even easier now); my challenge to
> anyone to produce examples of malicious plpgsql code that couldn't be
> reproduced in plain SQL went unanswered.

The fact remains though that the looping performance of anything you can
cons up in straight SQL will be an order of magnitude worse than in
plpgsql; and it's a notation the average script kiddie will find pretty
unfamiliar.  So I think this still does represent some barrier.  Whether
it's enough of a barrier to justify keeping plpgsql out of the default
install is certainly debatable.

            regards, tom lane

Re: [HACKERS] Updating column on row update

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 >> Loops in plain SQL are no problem: see generate_series. The last
 >> time we discussed this I demonstrated reasonably straightforward
 >> SQL examples of how to do things like password-cracking (and that
 >> was long before we had CTEs, so it would be even easier now); my
 >> challenge to anyone to produce examples of malicious plpgsql code
 >> that couldn't be reproduced in plain SQL went unanswered.

 Tom> The fact remains though that the looping performance of anything
 Tom> you can cons up in straight SQL will be an order of magnitude
 Tom> worse than in plpgsql;

Well, let's see. How about generating all possible strings of 6 characters
from A-Z? We'll just count the results for now:

select count(chr(65+(i/676))||chr(65+(i/26)%26)||chr(65+i%26)
             ||chr(65+(j/676))||chr(65+(j/26)%26)||chr(65+j%26))
  from generate_series(0,17575) i, generate_series(0,17575) j;
   count
-----------
 308915776
(1 row)

Time: 462570.563 ms

create function foo() returns bigint language plpgsql
 as $f$
  declare
    c bigint := 0;
    s text;
  begin
    for i in 0..17575 loop
      for j in 0..17575 loop
        s := chr(65+(i/676))||chr(65+(i/26)%26)||chr(65+i%26)
             ||chr(65+(j/676))||chr(65+(j/26)%26)||chr(65+j%26);
        c := c + 1;
      end loop;
    end loop;
    return c;
  end;
$f$;

select foo();
    foo
-----------
 308915776
(1 row)

Time: 624809.671 ms

plpgsql comes out 35% _slower_, not "an order of magnitude worse".

--
Andrew.

Re: [HACKERS] Updating column on row update

From
Craig Ringer
Date:
On 23/11/2009 11:35 PM, Tom Lane wrote:
> Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>>  Tom> Well, that's pretty much exactly the question --- are there?  It
>>  Tom> would certainly make it easier for someone to exploit any other
>>  Tom> security weakness they might find.
>
>> Loops in plain SQL are no problem: see generate_series. The last time
>> we discussed this I demonstrated reasonably straightforward SQL
>> examples of how to do things like password-cracking (and that was long
>> before we had CTEs, so it would be even easier now); my challenge to
>> anyone to produce examples of malicious plpgsql code that couldn't be
>> reproduced in plain SQL went unanswered.
>
> The fact remains though that the looping performance of anything you can
> cons up in straight SQL will be an order of magnitude worse than in
> plpgsql; and it's a notation the average script kiddie will find pretty
> unfamiliar.  So I think this still does represent some barrier.  Whether
> it's enough of a barrier to justify keeping plpgsql out of the default
> install is certainly debatable.

"the average script kiddie" doesn't write their own exploits. They tend
to use proofs of concept created by very, very smart people involved in
active security research (be it malicious or not) that've been wrapped
up in easy-to-use "click to exploit" tools. They'll no more learn SQL to
explot Pg than they learn x86 asm to write their payload injectors, or
learn details about the Linux kernel to use a local root exploit.

Just making it a bit harder doesn't stop determined attackers, such as
security researchers, criminals seeking confidential information (credit
card databases etc) or commercially-motivated black hats seeking to
build botnets. Once the determined attackers find a way, the script
kiddies and the botnet builders tend to follow.

Any attack relying on the presence of PL/PgSQL will have to be an attack
by an already-authenticated user* with an established connection. Mass
botnet- or worm- style exploits are out. That said, PL/PgSQL does
undeniably increase the available attack surface for a rogue authorized
user, simply by being more code that has to be free of security issues.
An authenticated user might seek to escalate to DB superuser priveleges,
gain access to other DBs, or execute code as the "postgres" user to
trigger a local root exploit on the hosting machine. So there is some
concern, since not all authenticated users are necessarily fully trusted.

It's for that reason that I proposed it being made available by default
only to superusers and to the owner of the current database. If either
of those are executing malicious code, you're already well and truly
screwed. Thinking about it some more, though, there's nothing that'll
let the DB owner (unlike the superuser) execute supplied code as the
"postgres" user or break into other DBs, so an exploit against PL/PgSQL
might still give them a way to escalate priveleges.

I don't see making PL/PgSQL available by default only to a superuser as
particularly useful. Anything else does increase the attack surface
available for potential exploit. So the question becomes "is that
increase a sufficient worry to block the installation of PL/PgSQL by
default" ?

Personally, I think not. Default installing PL/PgSQL doesn't increase
the risk of a worm, which would be my main worry about enabling a
feature by default. PL/PgSQL is so widely used that any security issue
with it is already about as critical as it can be. Those few with
significant databases who do NOT use it can drop it if they are
concerned, but I sincerely doubt there are many significant production
DBs out there without it installed and in use, as it's effectively a
requirement for the use of triggers.

So - I say go ahead and install it by default, available to all users.
It's meant to be a trusted language, it's basically required for
triggers, it's nearly universal anyway, and it's a pain not having it
installed.

If it's not to be installed by default, then a cleaner way to ensure
it's installed it would be strongly desirable.



(I'm also honestly not sure what relevance performance has here. If it
takes an attacker 10 minutes to exploit a server rather than 1 minute,
is it any less cracked? Performance is only an issue if it renders an
attack impossible due to memory/storage requirements, or non-linear
computation time growth. Anyway, I frequently seek to avoid PL/PgSQL,
using pure SQL loops etc instead, because it's *faster* that way.)

* It could, I guess, be a hijacked connection, but if you have
connection hijacking going on you've already lost and have bigger things
to worry about. Otherwise it's going to be a stolen username/password,
or an authorized user gone rogue.

Re: [HACKERS] Updating column on row update

From
Hannu Krosing
Date:
On Sun, 2009-11-22 at 18:51 -0500, Tom Lane wrote:
> Craig Ringer <craig@postnewspapers.com.au> writes:
> > I do think this comes up often enough that a built-in trigger "update
> > named column with result of expression on insert" trigger might be
> > desirable.
>
> There's something of the sort in contrib already, I believe, though
> it's so old it still uses abstime :-(

What's wrong with abstime ?

it is valid for timestamps up to 2038-01-19 and it's on-disk size
smaller than other timestamp options


--
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
   Services, Consulting and Training



Re: [HACKERS] Updating column on row update

From
Thom Brown
Date:
2009/11/24 Hannu Krosing <hannu@2ndquadrant.com>
On Sun, 2009-11-22 at 18:51 -0500, Tom Lane wrote:
> Craig Ringer <craig@postnewspapers.com.au> writes:
> > I do think this comes up often enough that a built-in trigger "update
> > named column with result of expression on insert" trigger might be
> > desirable.
>
> There's something of the sort in contrib already, I believe, though
> it's so old it still uses abstime :-(

What's wrong with abstime ?

it is valid for timestamps up to 2038-01-19 and it's on-disk size
smaller than other timestamp options


But it's very very deprecated and could be removed at any time.  It's been so for years now, and I wouldn't want to *start* using something which is deprecated.

Thom

Re: [HACKERS] Updating column on row update

From
Hannu Krosing
Date:
On Tue, 2009-11-24 at 09:46 +0000, Thom Brown wrote:
> 2009/11/24 Hannu Krosing <hannu@2ndquadrant.com>
>         On Sun, 2009-11-22 at 18:51 -0500, Tom Lane wrote:
>         > Craig Ringer <craig@postnewspapers.com.au> writes:
>         > > I do think this comes up often enough that a built-in
>         trigger "update
>         > > named column with result of expression on insert" trigger
>         might be
>         > > desirable.
>         >
>         > There's something of the sort in contrib already, I believe,
>         though
>         > it's so old it still uses abstime :-(
>
>
>         What's wrong with abstime ?
>
>         it is valid for timestamps up to 2038-01-19 and it's on-disk
>         size
>         smaller than other timestamp options
>
>
> But it's very very deprecated and could be removed at any time.  It's
> been so for years now, and I wouldn't want to *start* using something
> which is deprecated.
>
> Thom

I'd expect it to have an afterlife as a separately maintained type
somewhere for those who care about data sizes, similar other space
savers like ip4 type.


--
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability
   Services, Consulting and Training



Re: [HACKERS] Updating column on row update

From
"Kevin Grittner"
Date:
Andrew Dunstan <andrew@dunslane.net> wrote:

> Part of the motivation for allowing inline blocks was to allow for
> conditional logic. So you can do things like:
>
>   DO $$
>
>   begin
>       if not exists (select 1 from pg_tables
>                      where schemaname = 'foo'
>                      and tablename = 'bar') then
>            create table foo.bar (x int, y text);
>       end if;
>   end;
>
>   $$;
>
>
> It's a bit more verbose (maybe someone can streamline it) but it
> does give you CINE (for whatever flavor of CINE you want), as well
> as lots more complex possibilities than we can conceivably build
> into SQL.

So we're conceding that this is a valid need and people will now have
a way to meet it.  Is the argument against having CINE syntax that it
would be more prone to error than the above, or that the code would be
so large and complex as to create a maintenance burden?  (Is there
some other reason I'm missing?)

-Kevin

Re: [HACKERS] Updating column on row update

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> So we're conceding that this is a valid need and people will now have
> a way to meet it.  Is the argument against having CINE syntax that it
> would be more prone to error than the above, or that the code would be
> so large and complex as to create a maintenance burden?

The argument against CINE is that it's unsafe.  The fragment proposed
by Andrew is no safer, of course, but it could be made safe by adding
additional checks that the properties of the existing object are what
the script expects.  So in principle that's an acceptable approach,
whereas CINE will never be safe.

But actually I thought we had more or less concluded that CREATE OR
REPLACE LANGUAGE would be acceptable (perhaps only if it's given
without any extra args?).  Or for that matter there seems to be enough
opinion on the side of just installing plpgsql by default.  CINE is
a markedly inferior alternative to either of those.

            regards, tom lane

Re: [HACKERS] Updating column on row update

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> The argument against CINE is that it's unsafe.

By no means rhetorically, is that based on the assumption that the
statement would not validate that the existing object (if any) matches
the supplied definition?

> The fragment proposed by Andrew is no safer, of course, but it could
> be made safe by adding additional checks that the properties of the
> existing object are what the script expects.

Again, not rhetorically, is that assuming an error-free mapping of the
CREATE statement to all the related system tables -- each time it is
written by every user, individually?

> So in principle that's an acceptable approach,
> whereas CINE will never be safe.

Only with the most simplistic implementation of CINE.  I really don't
see how that assertion holds up if there is checking of the supplied
definition against the existing object.  Even the most simplistic
definition is arguably safer than CREATE OR REPLACE, since that can
destroy existing data.  An implementation which does the checking that
you suggest, reviewed by this community to confirm that it is correct,
would seem to beat out most people's home-grown attempts to write what
you suggest.

> But actually I thought we had more or less concluded that CREATE OR
> REPLACE LANGUAGE would be acceptable (perhaps only if it's given
> without any extra args?).  Or for that matter there seems to be
> enough opinion on the side of just installing plpgsql by default.
> CINE is a markedly inferior alternative to either of those.

It sounded pretty much like a consensus on installing by default to
me; however, that doesn't seem like it has anything to do with
Andrew's example or my reply to it.

-Kevin

Re: [HACKERS] Updating column on row update

From
Robert Haas
Date:
On Tue, Nov 24, 2009 at 12:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> So we're conceding that this is a valid need and people will now have
>> a way to meet it.  Is the argument against having CINE syntax that it
>> would be more prone to error than the above, or that the code would be
>> so large and complex as to create a maintenance burden?
>
> The argument against CINE is that it's unsafe.  The fragment proposed
> by Andrew is no safer, of course, but it could be made safe by adding
> additional checks that the properties of the existing object are what
> the script expects.  So in principle that's an acceptable approach,
> whereas CINE will never be safe.

Well, there can be methods extrinsic to the system for controlling
this sort of thing.  For example, I can provide a script, using CINE,
that will either install version 2 of my app into some database or
that will upgrade an existing version 1 installation to version 2.
It's true that if someone has taken the version-1 schema and made
manual modifications to it, then things might blow up.  But, I can
tell people that they shouldn't do that, or the upgrade script might
break.  If they do and it does then they get to keep both pieces.
Even if I do the whole thing in PL/pgsql, I'm still not going to check
for every stupid thing someone might have done to break the schema...
I think the cat is already out of the bag on this one, and it's just a
matter of whether we're willing to provide some convenient syntax or
leave people to hand-code it.

> But actually I thought we had more or less concluded that CREATE OR
> REPLACE LANGUAGE would be acceptable (perhaps only if it's given
> without any extra args?).

I'm not sure there's any value in that restriction - seems more
confusing than helpful.

> Or for that matter there seems to be enough
> opinion on the side of just installing plpgsql by default.  CINE is
> a markedly inferior alternative to either of those.

For languages, yes.

...Robert

Re: [HACKERS] Updating column on row update

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The argument against CINE is that it's unsafe.

> By no means rhetorically, is that based on the assumption that the
> statement would not validate that the existing object (if any) matches
> the supplied definition?

If it did so, that would be outside the apparent meaning of the
command, which is to do nothing if an object of that name exists.
That's why we've gone with CREATE OR REPLACE instead.

>> The fragment proposed by Andrew is no safer, of course, but it could
>> be made safe by adding additional checks that the properties of the
>> existing object are what the script expects.

> Again, not rhetorically, is that assuming an error-free mapping of the
> CREATE statement to all the related system tables -- each time it is
> written by every user, individually?

Yes, I'd expect the user to custom-code it, because it's not clear
exactly which properties the script would be depending on and which ones
it's okay to allow to vary.  To take just one example, is it okay if the
object ownership is different from current user?  That might be fine,
or it might be catastrophic (suppose the script is going to issue GRANT
commands that presuppose particular ownership; if it's different you
could be left with security holes).

> Only with the most simplistic implementation of CINE.  I really don't
> see how that assertion holds up if there is checking of the supplied
> definition against the existing object.  Even the most simplistic
> definition is arguably safer than CREATE OR REPLACE, since that can
> destroy existing data.

How exactly would it do that?  You seem to be postulating non-obvious
or not-as-currently-implemented semantics for both variants of the
command, so you had better explain exactly what you think they'd be.

(I agree that CREATE OR REPLACE on a table might be expected to destroy
existing data, but we don't have such a command and there is no proposal
to make one.)

            regards, tom lane

Re: [HACKERS] Updating column on row update

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Nov 24, 2009 at 12:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> But actually I thought we had more or less concluded that CREATE OR
>> REPLACE LANGUAGE would be acceptable (perhaps only if it's given
>> without any extra args?).

> I'm not sure there's any value in that restriction - seems more
> confusing than helpful.

The point would be to reduce the risk that you're changing the language
definition in a surprising way.  Extra args would imply that you're
trying to install a non-default definition of the language.

            regards, tom lane

Re: [HACKERS] Updating column on row update

From
Scott Marlowe
Date:
On Tue, Nov 24, 2009 at 11:34 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Nov 24, 2009 at 12:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> But actually I thought we had more or less concluded that CREATE OR
>>> REPLACE LANGUAGE would be acceptable (perhaps only if it's given
>>> without any extra args?).
>
>> I'm not sure there's any value in that restriction - seems more
>> confusing than helpful.
>
> The point would be to reduce the risk that you're changing the language
> definition in a surprising way.  Extra args would imply that you're
> trying to install a non-default definition of the language.

But if you'd installed it that way before, wouldn't you then need the
arguments this time to have them match?

Re: [HACKERS] Updating column on row update

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> If it did so, that would be outside the apparent meaning of the
> command, which is to do nothing if an object of that name exists.
> That's why we've gone with CREATE OR REPLACE instead.

I think that "fail on existence of an object conflicting with given
definition" is behavior which could be documented and rates fairly
low on my astonishment scale.  (I can't speak for anyone else.)

I am skeptical that, in the absence of built-in support for checking
the existing object against the supplied definition, people would
generally go any further than Andrew's example.  When they did, I'm
skeptical about how often they would get the details exactly right.

> Yes, I'd expect the user to custom-code it, because it's not clear
> exactly which properties the script would be depending on and which
> ones it's okay to allow to vary.  To take just one example, is it
> okay if the object ownership is different from current user?  That
> might be fine, or it might be catastrophic (suppose the script is
> going to issue GRANT commands that presuppose particular ownership;
> if it's different you could be left with security holes).

Yeah, that's an area which I figured would require some discussion.
The best behavior isn't immediately clear to me in that regard.  I
didn't figure that arriving at some decision on that was necessarily
an insurmountable obstacle.  Similar issue with indexes, although the
answer there seems clearer (at least to me).

> (I agree that CREATE OR REPLACE on a table might be expected to
> destroy existing data, but we don't have such a command and there is
> no proposal to make one.)

There was, up-thread, discussion by multiple people of the desire to
have CINE for tables.  Andrew's example was specifically about an
alternative way of spelling that.  This branch of the thread has been
all about exactly that.  (Well, at least in my head.)  You asserted
that CREATE OR REPLACE was superior to CINE; I took it to be in
response to the discussion of CINE for tables, but I guess it was
just in the scope of languages.  Sorry for misinterpreting.

-Kevin

Re: [HACKERS] Updating column on row update

From
Tom Lane
Date:
Scott Marlowe <scott.marlowe@gmail.com> writes:
> On Tue, Nov 24, 2009 at 11:34 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The point would be to reduce the risk that you're changing the language
>> definition in a surprising way. �Extra args would imply that you're
>> trying to install a non-default definition of the language.

> But if you'd installed it that way before, wouldn't you then need the
> arguments this time to have them match?

If you knew you'd installed it that way before, you wouldn't be
executing this command at all.  The use-case for commands like this
IMO is scripts that don't know exactly what the database state is.
The use-case for a script that is installing non-default language
parameters into somebody else's database seems pretty darn thin.

I'm not dead set on this by any means.  But it seems like it would
help reduce the risk of bad consequences from CREATE OR REPLACE
LANGUAGE.

            regards, tom lane

Re: [HACKERS] Updating column on row update

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yes, I'd expect the user to custom-code it, because it's not clear
>> exactly which properties the script would be depending on and which
>> ones it's okay to allow to vary.  To take just one example, is it
>> okay if the object ownership is different from current user?

> Yeah, that's an area which I figured would require some discussion.
> The best behavior isn't immediately clear to me in that regard.  I
> didn't figure that arriving at some decision on that was necessarily
> an insurmountable obstacle.

The reason a script-driven solution seems attractive is exactly that
there doesn't seem to be a good one-size-fits-all behavior for complex
objects.

> There was, up-thread, discussion by multiple people of the desire to
> have CINE for tables.  Andrew's example was specifically about an
> alternative way of spelling that.  This branch of the thread has been
> all about exactly that.  (Well, at least in my head.)

I thought the thread was about CREATE LANGUAGE.  If you want to discuss
CINE in general it would probably be appropriate to start a different
thread about that.

            regards, tom lane

Re: [HACKERS] Updating column on row update

From
Robert Haas
Date:
On Tue, Nov 24, 2009 at 2:07 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> If it did so, that would be outside the apparent meaning of the
>> command, which is to do nothing if an object of that name exists.
>> That's why we've gone with CREATE OR REPLACE instead.
>
> I think that "fail on existence of an object conflicting with given
> definition" is behavior which could be documented and rates fairly
> low on my astonishment scale.  (I can't speak for anyone else.)

I think CINE should create the object if it does not exist and
otherwise do nothing.  It might be useful to have some kind of
consistency-checking behavior, but it would probably be more useful if
decoupled from CINE, and in any case, that's not what "CREATE IF NOT
EXISTS" means to me.

> I am skeptical that, in the absence of built-in support for checking
> the existing object against the supplied definition, people would
> generally go any further than Andrew's example.  When they did, I'm
> skeptical about how often they would get the details exactly right.

Bingo.

...Robert

Installing PL/pgSQL by default

From
Bruce Momjian
Date:
Tom Lane wrote:
> But actually I thought we had more or less concluded that CREATE OR
> REPLACE LANGUAGE would be acceptable (perhaps only if it's given
> without any extra args?).  Or for that matter there seems to be enough
> opinion on the side of just installing plpgsql by default.  CINE is
> a markedly inferior alternative to either of those.

Based on research done as part of this thread, it seems plpgsql has
similar risks to recursive queries, so the idea of installing plpgsql by
default now makes more sense.

The attached patch installs plpgsql language by default, as well as the
three plpgsql helper functions.  The language is installed just like it
was before, but now automatically, e.g. still a separate shared object.
One problem is that because system oids are used, it isn't possible to
drop the language:

    $ droplang plpgsql test
    droplang: language removal failed: ERROR:  cannot drop language plpgsql
    because it is required by the database system

I assume we still want to allow the language to be uninstalled, for
security purposes.  Right?  Any suggestions?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/installation.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/installation.sgml,v
retrieving revision 1.328
diff -c -c -r1.328 installation.sgml
*** doc/src/sgml/installation.sgml    2 Dec 2009 14:07:25 -0000    1.328
--- doc/src/sgml/installation.sgml    3 Dec 2009 23:09:59 -0000
***************
*** 2257,2270 ****
       is <command>createlang</command> failing with unusual errors.
       For example, running as the owner of the PostgreSQL installation:
  <screen>
! -bash-3.00$ createlang plpgsql template1
! createlang: language installation failed: ERROR:  could not load library "/opt/dbs/pgsql748/lib/plpgsql.so": A memory
addressis not in the address space for the process. 
  </screen>
      Running as a non-owner in the group posessing the PostgreSQL
      installation:
  <screen>
! -bash-3.00$ createlang plpgsql template1
! createlang: language installation failed: ERROR:  could not load library "/opt/dbs/pgsql748/lib/plpgsql.so": Bad
address
  </screen>
       Another example is out of memory errors in the PostgreSQL server
       logs, with every memory allocation near or greater than 256 MB
--- 2257,2270 ----
       is <command>createlang</command> failing with unusual errors.
       For example, running as the owner of the PostgreSQL installation:
  <screen>
! -bash-3.00$ createlang plperl template1
! createlang: language installation failed: ERROR:  could not load library "/opt/dbs/pgsql748/lib/plperl.so": A memory
addressis not in the address space for the process. 
  </screen>
      Running as a non-owner in the group posessing the PostgreSQL
      installation:
  <screen>
! -bash-3.00$ createlang plperl template1
! createlang: language installation failed: ERROR:  could not load library "/opt/dbs/pgsql748/lib/plperl.so": Bad
address
  </screen>
       Another example is out of memory errors in the PostgreSQL server
       logs, with every memory allocation near or greater than 256 MB
Index: src/include/catalog/pg_language.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/catalog/pg_language.h,v
retrieving revision 1.35
diff -c -c -r1.35 pg_language.h
*** src/include/catalog/pg_language.h    22 Sep 2009 23:43:41 -0000    1.35
--- src/include/catalog/pg_language.h    3 Dec 2009 23:09:59 -0000
***************
*** 75,79 ****
--- 75,82 ----
  DATA(insert OID = 14 ( "sql"        PGUID f t 0 0 2248 _null_ ));
  DESCR("SQL-language functions");
  #define SQLlanguageId 14
+ DATA(insert OID = 9 ( "plpgsql"        PGUID t t 2995 2996 2997 _null_ ));
+ DESCR("SQL-language functions");
+

  #endif   /* PG_LANGUAGE_H */
Index: src/include/catalog/pg_proc.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/catalog/pg_proc.h,v
retrieving revision 1.554
diff -c -c -r1.554 pg_proc.h
*** src/include/catalog/pg_proc.h    29 Nov 2009 18:14:30 -0000    1.554
--- src/include/catalog/pg_proc.h    3 Dec 2009 23:10:03 -0000
***************
*** 4722,4727 ****
--- 4722,4734 ----
  DATA(insert OID = 3114 (  nth_value        PGNSP PGUID 12 1 0 0 f t f t f i 2 0 2283 "2283 23" _null_ _null_ _null_
_null_window_nth_value _null_ _null_ _null_ )); 
  DESCR("fetch the Nth row value");

+ /* PL/pgSQL support functions */
+ DATA(insert OID = 2995 (  plpgsql_call_handler    PGNSP PGUID 13 1 0 0 f f f f f v 0 0 2280 "" _null_ _null_ _null_
_null_plpgsql_call_handler "$libdir/plpgsql" _null_ _null_ )); 
+ DESCR("PL/pgSQL function/trigger manager");
+ DATA(insert OID = 2996 (  plpgsql_inline_handler    PGNSP PGUID 13 1 0 0 f f f t f v 1 0 2278 2281 _null_ _null_
_null__null_ plpgsql_inline_handler "$libdir/plpgsql" _null_ _null_ )); 
+ DESCR("PL/pgSQL anonymous code block executor");
+ DATA(insert OID = 2997 (  plpgsql_validator    PGNSP PGUID 13 1 0 0 f f f t f v 1 0 2278 26 _null_ _null_ _null_
_null_plpgsql_validator "$libdir/plpgsql" _null_ _null_ )); 
+ DESCR("PL/pgSQL function validator");

  /*
   * Symbolic values for provolatile column: these indicate whether the result
Index: src/test/regress/GNUmakefile
===================================================================
RCS file: /cvsroot/pgsql/src/test/regress/GNUmakefile,v
retrieving revision 1.79
diff -c -c -r1.79 GNUmakefile
*** src/test/regress/GNUmakefile    26 Oct 2009 21:11:22 -0000    1.79
--- src/test/regress/GNUmakefile    3 Dec 2009 23:10:03 -0000
***************
*** 138,144 ****
  ## Run tests
  ##

! pg_regress_call = ./pg_regress --inputdir=$(srcdir) --dlpath=. --multibyte=$(MULTIBYTE) --load-language=plpgsql
$(NOLOCALE)

  check: all
      $(pg_regress_call) --temp-install=./tmp_check --top-builddir=$(top_builddir)
--schedule=$(srcdir)/parallel_schedule$(MAXCONNOPT) $(TEMP_CONF) 
--- 138,144 ----
  ## Run tests
  ##

! pg_regress_call = ./pg_regress --inputdir=$(srcdir) --dlpath=. --multibyte=$(MULTIBYTE) $(NOLOCALE)

  check: all
      $(pg_regress_call) --temp-install=./tmp_check --top-builddir=$(top_builddir)
--schedule=$(srcdir)/parallel_schedule$(MAXCONNOPT) $(TEMP_CONF) 

Re: Installing PL/pgSQL by default

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> One problem is that because system oids are used, it isn't possible to
> drop the language:
> I assume we still want to allow the language to be uninstalled, for
> security purposes.

Yes.  That behavior is not acceptable.  Why aren't you just adding
a CREATE LANGUAGE call in one of the initdb scripts?

            regards, tom lane

Re: Installing PL/pgSQL by default

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > One problem is that because system oids are used, it isn't possible to
> > drop the language:
> > I assume we still want to allow the language to be uninstalled, for
> > security purposes.
>
> Yes.  That behavior is not acceptable.  Why aren't you just adding
> a CREATE LANGUAGE call in one of the initdb scripts?

Which scripts?  initdb.c?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Installing PL/pgSQL by default

From
Andrew Dunstan
Date:

Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
>
>> One problem is that because system oids are used, it isn't possible to
>> drop the language:
>> I assume we still want to allow the language to be uninstalled, for
>> security purposes.
>>
>
> Yes.  That behavior is not acceptable.  Why aren't you just adding
> a CREATE LANGUAGE call in one of the initdb scripts?
>
>
>

Before we go too far with this, I'd like to know how we will handle the
problems outlined here:
<http://archives.postgresql.org/pgsql-hackers/2008-02/msg00916.php>

cheers

andrew

Re: Installing PL/pgSQL by default

From
Bruce Momjian
Date:
Andrew Dunstan wrote:
>
>
> Tom Lane wrote:
> > Bruce Momjian <bruce@momjian.us> writes:
> >
> >> One problem is that because system oids are used, it isn't possible to
> >> drop the language:
> >> I assume we still want to allow the language to be uninstalled, for
> >> security purposes.
> >>
> >
> > Yes.  That behavior is not acceptable.  Why aren't you just adding
> > a CREATE LANGUAGE call in one of the initdb scripts?
> >
> >
> >
>
> Before we go too far with this, I'd like to know how we will handle the
> problems outlined here:
> <http://archives.postgresql.org/pgsql-hackers/2008-02/msg00916.php>

Oh, I forgot about that issue.  FYI, I believe several packages of
Postgres already pre-install plpgsql, or at least allow it as an option.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Installing PL/pgSQL by default

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Before we go too far with this, I'd like to know how we will handle the
> problems outlined here:
> <http://archives.postgresql.org/pgsql-hackers/2008-02/msg00916.php>

Hm, I think that's only a problem if we define it to be a problem,
and I'm not sure it's necessary to do so.  Currently, access to PL
languages is controlled by superusers.  You are suggesting that if
plpgsql is installed by default, then access to it should be controlled
by non-superuser DB owners instead.  Why do we have to move the
goalposts in that direction?  It's not like we expect that DB owners
should control access to other built-in facilities, like int8 or
pg_stat_activity for example.  The argument against having plpgsql
always available is essentially one of security risks, and I would
expect that most installations think that security risks are to be
managed by superusers.

            regards, tom lane

Re: [HACKERS] Installing PL/pgSQL by default

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 > Andrew Dunstan <andrew@dunslane.net> writes:
 >> Before we go too far with this, I'd like to know how we will handle the
 >> problems outlined here:
 >> <http://archives.postgresql.org/pgsql-hackers/2008-02/msg00916.php>

 Tom> Hm, I think that's only a problem if we define it to be a
 Tom> problem, and I'm not sure it's necessary to do so.  Currently,
 Tom> access to PL languages is controlled by superusers.  You are
 Tom> suggesting that if plpgsql is installed by default, then access
 Tom> to it should be controlled by non-superuser DB owners instead.

Currently, a non-superuser db owner can install plpgsql, and having
installed it, can DROP it or grant/revoke access to it:

test=> create language plpgsql;
CREATE LANGUAGE
test=> revoke usage on language plpgsql from public;
REVOKE
test=> drop language plpgsql;
DROP LANGUAGE

The complaint is that if plpgsql is installed by default, then it will
be owned by postgres rather than by the db owner, who will then not be
able to drop it or use grant/revoke on it.

(This only became an issue since the ability for non-superuser DB owners
to install languages from pltemplate was added.)

--
Andrew (irc:RhodiumToad)

Re: Installing PL/pgSQL by default

From
Jasen Betts
Date:
On 2009-12-04, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
>>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>
> > Andrew Dunstan <andrew@dunslane.net> writes:
> >> Before we go too far with this, I'd like to know how we will handle the
> >> problems outlined here:
> >> <http://archives.postgresql.org/pgsql-hackers/2008-02/msg00916.php>
>
> Tom> Hm, I think that's only a problem if we define it to be a
> Tom> problem, and I'm not sure it's necessary to do so.  Currently,
> Tom> access to PL languages is controlled by superusers.  You are
> Tom> suggesting that if plpgsql is installed by default, then access
> Tom> to it should be controlled by non-superuser DB owners instead.
>
> Currently, a non-superuser db owner can install plpgsql, and having
> installed it, can DROP it or grant/revoke access to it:
>
> test=> create language plpgsql;
> CREATE LANGUAGE
> test=> revoke usage on language plpgsql from public;
> REVOKE
> test=> drop language plpgsql;
> DROP LANGUAGE
>
> The complaint is that if plpgsql is installed by default, then it will
> be owned by postgres rather than by the db owner, who will then not be
> able to drop it or use grant/revoke on it.

The same problem is had with schema public...



Re: Installing PL/pgSQL by default

From
Alvaro Herrera
Date:
Jasen Betts wrote:
> On 2009-12-04, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:

> > The complaint is that if plpgsql is installed by default, then it will
> > be owned by postgres rather than by the db owner, who will then not be
> > able to drop it or use grant/revoke on it.
>
> The same problem is had with schema public...

It seems to me that the solution to both problems is to be able to run
some script after database creation to fix permissions of such things.
This was proposed eons ago by Fabien Coelho IIRC but rejected because no
way to implement it was found.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: [HACKERS] Installing PL/pgSQL by default

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>  Tom> Hm, I think that's only a problem if we define it to be a
>  Tom> problem, and I'm not sure it's necessary to do so.

> The complaint is that if plpgsql is installed by default, then it will
> be owned by postgres rather than by the db owner, who will then not be
> able to drop it or use grant/revoke on it.

Right, just like every other thing that's pre-installed.  If a
particular installation wishes to let individual DB owners control this,
the superuser can drop plpgsql from template1.  It's not apparent to me
why we need to allow non-superusers to override the project's decisions
about what should be installed by default.

            regards, tom lane

Re: [HACKERS] Installing PL/pgSQL by default

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Right, just like every other thing that's pre-installed.  If a
> particular installation wishes to let individual DB owners control this,
> the superuser can drop plpgsql from template1.  It's not apparent to me
> why we need to allow non-superusers to override the project's decisions
> about what should be installed by default.

I guess it boils down to being nice to hosting platforms, where they
will want to give as much power as can be given to database owners
without having those hosted people be superusers.

So should the decision to remove plpgsql be on the hosting platform
hands or the hosted database owner?

Regards,
--
dim

Re: [HACKERS] Installing PL/pgSQL by default

From
Tom Lane
Date:
Dimitri Fontaine <dfontaine@hi-media.com> writes:
> So should the decision to remove plpgsql be on the hosting platform
> hands or the hosted database owner?

Why not?  If they really want to prohibit use of a feature the upstream
project has decided should be standard, that's their privilege.
The argument against seems to be basically "this should work exactly
like it did before", but if that's the standard then we can never
have plpgsql installed by default at all.

            regards, tom lane

Re: [HACKERS] Installing PL/pgSQL by default

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Why not?  If they really want to prohibit use of a feature the upstream
> project has decided should be standard, that's their privilege.

Well, I guess they could also automate their database creation to fix
the privileges and assign the ownership of the language to the owner of
the database. Then whether or not to have plpgsql there is up to the
owner.

For non-hosted environments, you always want to tweak some things, like
installing plpgsql in the first place. So...

> The argument against seems to be basically "this should work exactly
> like it did before", but if that's the standard then we can never
> have plpgsql installed by default at all.

Don't get me wrong, I'm all for having plpgsql installed by default.

I though we were talking about how to provide that and trying to decide
if having to be superuser to drop plpgsql after having created the
database is blocking the way forward, knowing than installing the
language only requires being database owner.

Regards,
--
dim

Re: Installing PL/pgSQL by default

From
Bruce Momjian
Date:
Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> > Before we go too far with this, I'd like to know how we will handle the
> > problems outlined here:
> > <http://archives.postgresql.org/pgsql-hackers/2008-02/msg00916.php>
>
> Hm, I think that's only a problem if we define it to be a problem,
> and I'm not sure it's necessary to do so.  Currently, access to PL
> languages is controlled by superusers.  You are suggesting that if
> plpgsql is installed by default, then access to it should be controlled
> by non-superuser DB owners instead.  Why do we have to move the
> goalposts in that direction?  It's not like we expect that DB owners
> should control access to other built-in facilities, like int8 or
> pg_stat_activity for example.  The argument against having plpgsql
> always available is essentially one of security risks, and I would
> expect that most installations think that security risks are to be
> managed by superusers.

I installed PL/pgSQL by default via initdb with the attached patch.  The
only problem is that pg_dump still dumps out the language creation:

    CREATE PROCEDURAL LANGUAGE plpgsql;
    ALTER PROCEDURAL LANGUAGE plpgsql OWNER TO postgres;

What is odd is that I used the same process that initdb uses to create
other objects.  Does anyone know why this is happening?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/installation.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/installation.sgml,v
retrieving revision 1.331
diff -c -c -r1.331 installation.sgml
*** doc/src/sgml/installation.sgml    8 Dec 2009 20:08:30 -0000    1.331
--- doc/src/sgml/installation.sgml    9 Dec 2009 02:02:33 -0000
***************
*** 2262,2275 ****
       is <command>createlang</command> failing with unusual errors.
       For example, running as the owner of the PostgreSQL installation:
  <screen>
! -bash-3.00$ createlang plpgsql template1
! createlang: language installation failed: ERROR:  could not load library "/opt/dbs/pgsql748/lib/plpgsql.so": A memory
addressis not in the address space for the process. 
  </screen>
      Running as a non-owner in the group posessing the PostgreSQL
      installation:
  <screen>
! -bash-3.00$ createlang plpgsql template1
! createlang: language installation failed: ERROR:  could not load library "/opt/dbs/pgsql748/lib/plpgsql.so": Bad
address
  </screen>
       Another example is out of memory errors in the PostgreSQL server
       logs, with every memory allocation near or greater than 256 MB
--- 2262,2275 ----
       is <command>createlang</command> failing with unusual errors.
       For example, running as the owner of the PostgreSQL installation:
  <screen>
! -bash-3.00$ createlang plperl template1
! createlang: language installation failed: ERROR:  could not load library "/opt/dbs/pgsql748/lib/plperl.so": A memory
addressis not in the address space for the process. 
  </screen>
      Running as a non-owner in the group posessing the PostgreSQL
      installation:
  <screen>
! -bash-3.00$ createlang plperl template1
! createlang: language installation failed: ERROR:  could not load library "/opt/dbs/pgsql748/lib/plperl.so": Bad
address
  </screen>
       Another example is out of memory errors in the PostgreSQL server
       logs, with every memory allocation near or greater than 256 MB
Index: src/bin/initdb/initdb.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/initdb/initdb.c,v
retrieving revision 1.177
diff -c -c -r1.177 initdb.c
*** src/bin/initdb/initdb.c    14 Nov 2009 15:39:36 -0000    1.177
--- src/bin/initdb/initdb.c    9 Dec 2009 02:02:36 -0000
***************
*** 176,181 ****
--- 176,182 ----
  static void setup_privileges(void);
  static void set_info_version(void);
  static void setup_schema(void);
+ static void load_plpgsql(void);
  static void vacuum_db(void);
  static void make_template0(void);
  static void make_postgres(void);
***************
*** 1893,1898 ****
--- 1894,1924 ----
  }

  /*
+  * load PL/pgsql server-side language
+  */
+ static void
+ load_plpgsql(void)
+ {
+     PG_CMD_DECL;
+
+     fputs(_("loading PL/pgSQL server-side language ... "), stdout);
+     fflush(stdout);
+
+     snprintf(cmd, sizeof(cmd),
+              "\"%s\" %s template1 >%s",
+              backend_exec, backend_options,
+              DEVNULL);
+
+     PG_CMD_OPEN;
+
+     PG_CMD_PUTS("CREATE LANGUAGE plpgsql;\n");
+
+     PG_CMD_CLOSE;
+
+     check_ok();
+ }
+
+ /*
   * clean everything up in template1
   */
  static void
***************
*** 3125,3130 ****
--- 3151,3158 ----

      setup_schema();

+     load_plpgsql();
+
      vacuum_db();

      make_template0();
Index: src/test/regress/GNUmakefile
===================================================================
RCS file: /cvsroot/pgsql/src/test/regress/GNUmakefile,v
retrieving revision 1.79
diff -c -c -r1.79 GNUmakefile
*** src/test/regress/GNUmakefile    26 Oct 2009 21:11:22 -0000    1.79
--- src/test/regress/GNUmakefile    9 Dec 2009 02:02:37 -0000
***************
*** 138,144 ****
  ## Run tests
  ##

! pg_regress_call = ./pg_regress --inputdir=$(srcdir) --dlpath=. --multibyte=$(MULTIBYTE) --load-language=plpgsql
$(NOLOCALE)

  check: all
      $(pg_regress_call) --temp-install=./tmp_check --top-builddir=$(top_builddir)
--schedule=$(srcdir)/parallel_schedule$(MAXCONNOPT) $(TEMP_CONF) 
--- 138,144 ----
  ## Run tests
  ##

! pg_regress_call = ./pg_regress --inputdir=$(srcdir) --dlpath=. --multibyte=$(MULTIBYTE) $(NOLOCALE)

  check: all
      $(pg_regress_call) --temp-install=./tmp_check --top-builddir=$(top_builddir)
--schedule=$(srcdir)/parallel_schedule$(MAXCONNOPT) $(TEMP_CONF) 

Re: Installing PL/pgSQL by default

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> I installed PL/pgSQL by default via initdb with the attached patch.  The
> only problem is that pg_dump still dumps out the language creation:
>     CREATE PROCEDURAL LANGUAGE plpgsql;
>     ALTER PROCEDURAL LANGUAGE plpgsql OWNER TO postgres;
> What is odd is that I used the same process that initdb uses to create
> other objects.  Does anyone know why this is happening?

I think pg_dump pays attention to what schema the objects are in,
and that's most likely creating them in PUBLIC.  Try adding
"set search_path = pg_catalog".

It's not impossible that we'll have to tweak pg_dump a bit; it's
never had to deal with languages that shouldn't be dumped ...

            regards, tom lane

Re: [HACKERS] Installing PL/pgSQL by default

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> It's not impossible that we'll have to tweak pg_dump a bit; it's
> never had to deal with languages that shouldn't be dumped ...

Ah, the best would be to have extensions maybe. Then you could do this
in initdb, filling in template0:
  CREATE EXTENSION plpgsql ...;

Then at createdb time, what would become automatic is:
  INSTALL EXTENSION plpgsql;

And that's it. pg_dump would now about extensions and only issues this
latter statement in its dump.

Bruce, there are some mails in the archive with quite advanced design
proposal that has been discussed and not objected to, and I even
provided a rough sketch of how I wanted to attack the problem.

  http://archives.postgresql.org/pgsql-hackers/2009-06/msg01281.php
  http://archives.postgresql.org/pgsql-hackers/2009-07/msg01425.php
  http://archives.postgresql.org/pgsql-hackers/2009-07/msg01468.php

  The major version dependant SQL code is now much less of a problem
  than before because we have inline DO statements. So you don't need to
  create a function for this anymore.

Real life kept me away from having the time to prepare the code patch,
and I don't think that will change a bit in the 8.5 release cycle,
whatever my hopes were earlier this year.

But having everyone talk about the feature and come to an agreement as
to what it should provide and how was the hard part of it, I think, and
is now about done.

Would you be up for writing the extension facility?

Regards,
--
dim

Re: [HACKERS] Installing PL/pgSQL by default

From
Bruce Momjian
Date:
Dimitri Fontaine wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> > It's not impossible that we'll have to tweak pg_dump a bit; it's
> > never had to deal with languages that shouldn't be dumped ...
>
> Ah, the best would be to have extensions maybe. Then you could do this
> in initdb, filling in template0:
>   CREATE EXTENSION plpgsql ...;
>
> Then at createdb time, what would become automatic is:
>   INSTALL EXTENSION plpgsql;
>
> And that's it. pg_dump would now about extensions and only issues this
> latter statement in its dump.
>
> Bruce, there are some mails in the archive with quite advanced design
> proposal that has been discussed and not objected to, and I even
> provided a rough sketch of how I wanted to attack the problem.
>
>   http://archives.postgresql.org/pgsql-hackers/2009-06/msg01281.php
>   http://archives.postgresql.org/pgsql-hackers/2009-07/msg01425.php
>   http://archives.postgresql.org/pgsql-hackers/2009-07/msg01468.php
>
>   The major version dependant SQL code is now much less of a problem
>   than before because we have inline DO statements. So you don't need to
>   create a function for this anymore.
>
> Real life kept me away from having the time to prepare the code patch,
> and I don't think that will change a bit in the 8.5 release cycle,
> whatever my hopes were earlier this year.
>
> But having everyone talk about the feature and come to an agreement as
> to what it should provide and how was the hard part of it, I think, and
> is now about done.
>
> Would you be up for writing the extension facility?

Uh, well, I need to help with the patch commit process at this point ---
if I find I have extra time, I could do it.   I will keep this in mind.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Installing PL/pgSQL by default

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > I installed PL/pgSQL by default via initdb with the attached patch.  The
> > only problem is that pg_dump still dumps out the language creation:
> >     CREATE PROCEDURAL LANGUAGE plpgsql;
> >     ALTER PROCEDURAL LANGUAGE plpgsql OWNER TO postgres;
> > What is odd is that I used the same process that initdb uses to create
> > other objects.  Does anyone know why this is happening?
>
> I think pg_dump pays attention to what schema the objects are in,
> and that's most likely creating them in PUBLIC.  Try adding
> "set search_path = pg_catalog".
>
> It's not impossible that we'll have to tweak pg_dump a bit; it's
> never had to deal with languages that shouldn't be dumped ...

I found that pg_dump tests for pg_language.lanispl == true, which is
true for all the stored procedure languages.  I can easily special case
plpgsql, or check for FirstNormalObjectId, though I don't see that used
in pg_dump currently.

A more difficult issue is whether we should preserve the fact that
plpgsql was _removed_ in the pg_dump output, i.e, if someone removes
plpgsql from a database, do we issue a DROP LANGUAGE in pg_dump?  I
don't remember us having to deal with anything like this before.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: Installing PL/pgSQL by default

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> A more difficult issue is whether we should preserve the fact that
> plpgsql was _removed_ in the pg_dump output, i.e, if someone removes
> plpgsql from a database, do we issue a DROP LANGUAGE in pg_dump?  I
> don't remember us having to deal with anything like this before.

The public schema is a precedent.

            regards, tom lane

Re: [HACKERS] Installing PL/pgSQL by default

From
Dimitri Fontaine
Date:
Hi,

Le 11 déc. 2009 à 01:43, Bruce Momjian a écrit :
>> Would you be up for writing the extension facility?
>
> Uh, well, I need to help with the patch commit process at this point ---
> if I find I have extra time, I could do it.   I will keep this in mind.

If you ever find the time to do it, that would be excellent!
The extension facility is on the top list of Josh Berkus missing things we'll have to provide soon (or something) and a
veryannoying missing feature, the last stone of the high praised extensibility of PostgreSQL. 
  http://it.toolbox.com/blogs/database-soup/postgresql-development-priorities-31886

It could also means that pg_migrator would have an easier time handling user data types etc, if extension authors are
ableto implement the hooks to call to migrate their data from previous to next major version ondisk format. 

Anyway, thanks for considering it!
--
dim

PS: of course I've developed some ideas of how I'd like this to work and some use cases too, so bug me on IRC or
whateverfor details etc :) 

Re: Installing PL/pgSQL by default

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> Tom Lane wrote:
> > Bruce Momjian <bruce@momjian.us> writes:
> > > I installed PL/pgSQL by default via initdb with the attached patch.  The
> > > only problem is that pg_dump still dumps out the language creation:
> > >     CREATE PROCEDURAL LANGUAGE plpgsql;
> > >     ALTER PROCEDURAL LANGUAGE plpgsql OWNER TO postgres;
> > > What is odd is that I used the same process that initdb uses to create
> > > other objects.  Does anyone know why this is happening?
> >
> > I think pg_dump pays attention to what schema the objects are in,
> > and that's most likely creating them in PUBLIC.  Try adding
> > "set search_path = pg_catalog".
> >
> > It's not impossible that we'll have to tweak pg_dump a bit; it's
> > never had to deal with languages that shouldn't be dumped ...
>
> I found that pg_dump tests for pg_language.lanispl == true, which is
> true for all the stored procedure languages.  I can easily special case
> plpgsql, or check for FirstNormalObjectId, though I don't see that used
> in pg_dump currently.
>
> A more difficult issue is whether we should preserve the fact that
> plpgsql was _removed_ in the pg_dump output, i.e, if someone removes
> plpgsql from a database, do we issue a DROP LANGUAGE in pg_dump?  I
> don't remember us having to deal with anything like this before.

OK, the attached patch installs plpgsql by default from initdb, and
supresses the dumping of CREATE LANGUAGE in 8.5 and in 8.3/8.4 if binary
upgrade is used (because you know you are upgrading to a release that
has plpgsql installed by default).  The 8.3/8.4 is necessary so the
schema load doesn't generate any errors and cause pg_migrator to exit.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/installation.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/installation.sgml,v
retrieving revision 1.333
diff -c -c -r1.333 installation.sgml
*** doc/src/sgml/installation.sgml    15 Dec 2009 22:59:53 -0000    1.333
--- doc/src/sgml/installation.sgml    17 Dec 2009 23:35:36 -0000
***************
*** 2266,2279 ****
       is <command>createlang</command> failing with unusual errors.
       For example, running as the owner of the PostgreSQL installation:
  <screen>
! -bash-3.00$ createlang plpgsql template1
! createlang: language installation failed: ERROR:  could not load library "/opt/dbs/pgsql748/lib/plpgsql.so": A memory
addressis not in the address space for the process. 
  </screen>
      Running as a non-owner in the group posessing the PostgreSQL
      installation:
  <screen>
! -bash-3.00$ createlang plpgsql template1
! createlang: language installation failed: ERROR:  could not load library "/opt/dbs/pgsql748/lib/plpgsql.so": Bad
address
  </screen>
       Another example is out of memory errors in the PostgreSQL server
       logs, with every memory allocation near or greater than 256 MB
--- 2266,2279 ----
       is <command>createlang</command> failing with unusual errors.
       For example, running as the owner of the PostgreSQL installation:
  <screen>
! -bash-3.00$ createlang plperl template1
! createlang: language installation failed: ERROR:  could not load library "/opt/dbs/pgsql748/lib/plperl.so": A memory
addressis not in the address space for the process. 
  </screen>
      Running as a non-owner in the group posessing the PostgreSQL
      installation:
  <screen>
! -bash-3.00$ createlang plperl template1
! createlang: language installation failed: ERROR:  could not load library "/opt/dbs/pgsql748/lib/plperl.so": Bad
address
  </screen>
       Another example is out of memory errors in the PostgreSQL server
       logs, with every memory allocation near or greater than 256 MB
Index: src/bin/initdb/initdb.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/initdb/initdb.c,v
retrieving revision 1.178
diff -c -c -r1.178 initdb.c
*** src/bin/initdb/initdb.c    11 Dec 2009 03:34:56 -0000    1.178
--- src/bin/initdb/initdb.c    17 Dec 2009 23:35:36 -0000
***************
*** 176,181 ****
--- 176,182 ----
  static void setup_privileges(void);
  static void set_info_version(void);
  static void setup_schema(void);
+ static void load_plpgsql(void);
  static void vacuum_db(void);
  static void make_template0(void);
  static void make_postgres(void);
***************
*** 1894,1899 ****
--- 1895,1925 ----
  }

  /*
+  * load PL/pgsql server-side language
+  */
+ static void
+ load_plpgsql(void)
+ {
+     PG_CMD_DECL;
+
+     fputs(_("loading PL/pgSQL server-side language ... "), stdout);
+     fflush(stdout);
+
+     snprintf(cmd, sizeof(cmd),
+              "\"%s\" %s template1 >%s",
+              backend_exec, backend_options,
+              DEVNULL);
+
+     PG_CMD_OPEN;
+
+     PG_CMD_PUTS("CREATE LANGUAGE plpgsql;\n");
+
+     PG_CMD_CLOSE;
+
+     check_ok();
+ }
+
+ /*
   * clean everything up in template1
   */
  static void
***************
*** 3126,3131 ****
--- 3152,3159 ----

      setup_schema();

+     load_plpgsql();
+
      vacuum_db();

      make_template0();
Index: src/bin/pg_dump/pg_dump.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_dump.c,v
retrieving revision 1.556
diff -c -c -r1.556 pg_dump.c
*** src/bin/pg_dump/pg_dump.c    14 Dec 2009 00:39:11 -0000    1.556
--- src/bin/pg_dump/pg_dump.c    17 Dec 2009 23:35:36 -0000
***************
*** 32,37 ****
--- 32,38 ----

  #include "access/attnum.h"
  #include "access/sysattr.h"
+ #include "access/transam.h"
  #include "catalog/pg_cast.h"
  #include "catalog/pg_class.h"
  #include "catalog/pg_default_acl.h"
***************
*** 4599,4606 ****
                            "(%s lanowner) AS lanowner "
                            "FROM pg_language "
                            "WHERE lanispl "
                            "ORDER BY oid",
!                           username_subquery);
      }
      else if (g_fout->remoteVersion >= 80300)
      {
--- 4600,4609 ----
                            "(%s lanowner) AS lanowner "
                            "FROM pg_language "
                            "WHERE lanispl "
+                           /* do not dump initdb-installed languages */
+                           "AND oid >= %u "
                            "ORDER BY oid",
!                           username_subquery, FirstNormalObjectId);
      }
      else if (g_fout->remoteVersion >= 80300)
      {
***************
*** 4610,4618 ****
                            "lanvalidator,  lanacl, "
                            "(%s lanowner) AS lanowner "
                            "FROM pg_language "
!                           "WHERE lanispl "
                            "ORDER BY oid",
!                           username_subquery);
      }
      else if (g_fout->remoteVersion >= 80100)
      {
--- 4613,4622 ----
                            "lanvalidator,  lanacl, "
                            "(%s lanowner) AS lanowner "
                            "FROM pg_language "
!                           "WHERE lanispl%s"
                            "ORDER BY oid",
!                           username_subquery,
!                           binary_upgrade ? "\nAND lanname != 'plpgsql'" : "");
      }
      else if (g_fout->remoteVersion >= 80100)
      {
Index: src/test/regress/GNUmakefile
===================================================================
RCS file: /cvsroot/pgsql/src/test/regress/GNUmakefile,v
retrieving revision 1.79
diff -c -c -r1.79 GNUmakefile
*** src/test/regress/GNUmakefile    26 Oct 2009 21:11:22 -0000    1.79
--- src/test/regress/GNUmakefile    17 Dec 2009 23:35:38 -0000
***************
*** 138,144 ****
  ## Run tests
  ##

! pg_regress_call = ./pg_regress --inputdir=$(srcdir) --dlpath=. --multibyte=$(MULTIBYTE) --load-language=plpgsql
$(NOLOCALE)

  check: all
      $(pg_regress_call) --temp-install=./tmp_check --top-builddir=$(top_builddir)
--schedule=$(srcdir)/parallel_schedule$(MAXCONNOPT) $(TEMP_CONF) 
--- 138,144 ----
  ## Run tests
  ##

! pg_regress_call = ./pg_regress --inputdir=$(srcdir) --dlpath=. --multibyte=$(MULTIBYTE) $(NOLOCALE)

  check: all
      $(pg_regress_call) --temp-install=./tmp_check --top-builddir=$(top_builddir)
--schedule=$(srcdir)/parallel_schedule$(MAXCONNOPT) $(TEMP_CONF)