Re: PL/pgSQL 2 - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: PL/pgSQL 2
Date
Msg-id 1409670509.90792.YahooMailNeo@web122302.mail.ne1.yahoo.com
Whole thread Raw
In response to Re: PL/pgSQL 2  (Marko Tiikkaja <marko@joh.to>)
Responses Re: PL/pgSQL 2
Re: PL/pgSQL 2
List pgsql-hackers
Marko Tiikkaja <marko@joh.to> wrote:
> On 9/2/14 4:26 PM, Kevin Grittner wrote:
>> Joel Jacobson <joel@trustly.com> wrote:
>>> The common use-case I have in mind is when you have a function
>>> which takes some kind of ID as an input param, which maps to a
>>> primary key in some table, which you want to update.
>>
>> In that case FOUND works just fine.  A primary key value can't have
>> more than one matching row.
>
> No, but your code can have a bug.

So the main use case is to allow buggy functions which are deployed
to production without adequate testing to be detected?  Bugs like
not getting the primary key column(s) right?  I think it would be
great to have some way to generate an error if a given statement
doesn't affect exactly one row, but the above is a pretty weak
argument for making it a default behavior.

> INTO rejecting any queries returning more than one row helps,
> though, but having to write  RETURNING TRUE INTO _OK;  is not
> pretty either.

No, that sure would not be.

>>> If the where-clause would be incorrect and the update would
>>> update all rows in the table, that would be a disaster, which is
>>> what I want to prevent.
>>
>> By the time you find out that the number of rows affected is every
>> row in the table, you have horribly bloated the table and all its
>> indexes.  Causing a DML statement to abort when it sees a second
>> row is a completely different issue than what I (and I suspect most
>> others on the list) thought we were talking about, and would need
>> to affect far more than the PL.
>
> Updating even two rows instead of one can have catastrophic effects.

That's a different problem than Joel just said was his main
concern.  I was pointing out that the solution he was proposing was
a very poor solution to the problem he said he was trying to solve.
Can you imagine the damage if a function that updated every row in
a table whenever anyone tried to update a single row by primary key
made it past testing and staging phases into production?  Depending
on the table, it might not need to run more than a few times before
the bloat ate all disk space and your production environment was
totally hosed to the point of needing to delete everything from
$PGDATA and restore from your last known good backup.

Accidentally updating a single unintended row is a whole different
class of problem, with potentially completely different solutions.
We can talk about both, but let's not conflate them.  The proposed
new behavior seems like it would only detect a small percentage of
ways you can accidentally update unintended rows, but I agree it
would catch enough of them to be a potentially useful option.  If
it were a new option on the DML statement syntax, once could
certainly have code review or some sort of "lint" software to look
for omissions.  If you don't have a code review process before
things hit production, well, mechanical solutions like this can
only be expected to catch a small percentage of the damage from
application bugs deployed to production.

>>> It's the same type of mistake I want to prevent from in a
>>> convenient way, and there is nothing more convenient than the
>>> default behavour.  That also means *all* users will get that
>>> behaviour even if they don't explicitly request it, which is a
>>> good thing, because then they are protected against the danger of
>>> not knowing how to make sure it updated/deleted only one row.
>>
>> I think that changing the default behavior of SQL from set oriented
>> to something else is a horrible idea.  I absolutely, unequivocally
>> oppose that at the SQL or plpgsql level as harmful.  I understand
>> the need to check for this in various cases, and in fact the
>> application framework I designed at my previous job had Java
>> methods for doing DML with such a check included, named
>> InsertOneRow(), UpdateOneRow(), and DeleteOneRow().  Very useful.
>> If we can agree on a way to allow users to do the same in plpgsql,
>> fine -- but certainly not as the default default (word
>> intentionally repeated).
>
> Yeah, it doesn't necessarily need to be the default default (and I see a
> lot of people saying it shouldn't be).  Even having a per-query modifier
> would be better than the current behaviour.

There we seem to agree.  I definitely think it is a useful option
if we can sort out a good way to allow it.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Christoph Berg
Date:
Subject: Re: ALTER SYSTEM RESET?
Next
From: Neil Tiffin
Date:
Subject: Re: PL/pgSQL 2