Thread: Resolution for "ERROR: cannot handle whole-row reference" ?

Resolution for "ERROR: cannot handle whole-row reference" ?

From
Sean Chittenden
Date:
Hopefully I'm being blind and missing something obvious.  Here's the
scenario:

-- I have a large table/view with a large-ish number of columns
(routinely
-- more than 32 cols) and need to pass every col through to a function,
and
-- am not inclined to recompile with INDEX_MAX_KEYS/FUNC_MAX_ARGS set
-- to 64 unless I *absolutely* have to, and even then, I don't want to
impose
-- an arbitrary limitation that I should be able to get around with the
RECORD
-- type... "in theory."
CREATE TABLE t1 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 int, c7
int, c8 int, c9 int, c10 int, c11 int, c12 int, c13 int, c14 int, c15
int, c16 int, c17 int, c18 int, c19 int, c20 int, c21 int, c22 int, c23
int, c24 int, c25 int, c26 int, c27 int, c28 int, c29 int, c30 int, c31
int, c32 int, c33 int, c34 int, c35 int, c36 int, c37 int);

CREATE FUNCTION f1(RECORD) RETURNS BOOL LANGUAGE 'plpgsql' AS 'BEGIN
RETURN TRUE; END;';

CREATE RULE t1_ins AS ON INSERT TO t1 DO INSTEAD SELECT f1(NEW);
ERROR:  function f1(t1) does not exist
HINT:  No function matches the given name and argument types. You may
need to add explicit type casts.

-- You can't cast NEW to a RECORD type and there's no member of the
NEW/OLD
-- pseudorelation that'll return a RECORD, so, instead, let me try a
whole row
-- as the argument:
CREATE FUNCTION f1(t1) RETURNS BOOL LANGUAGE 'plpgsql' AS 'BEGIN RETURN
TRUE; END;';
CREATE RULE r_ins AS ON INSERT TO t1 DO INSTEAD SELECT f1(NEW);

INSERT INTO t1 (c1) VALUES (1);
ERROR:  cannot handle whole-row reference


Doh!  Am I missing a different way of handling this in a
flexible/scalable way?  The comment in
src/backend/rewrite/rewriteManip.c:886 leads me to believe that this is
a limitation.  I'd use a trigger, but t1 is actually a VIEW that I'm
trying to intercept commands to with RULEs.  Any guidance in handling
this case is greatly appreciated.  -sc

--
Sean Chittenden


Re: Resolution for "ERROR: cannot handle whole-row reference" ?

From
Tom Lane
Date:
Sean Chittenden <sean@chittenden.org> writes:
> CREATE FUNCTION f1(RECORD) RETURNS BOOL LANGUAGE 'plpgsql' AS 'BEGIN
> RETURN TRUE; END;';

You can't use RECORD as the declaration of a plpgsql function parameter,
only as its result type.  (Although I wonder whether we couldn't treat
it as a sort of anyarray-like pseudo-type ... but that's for the future.)

What you want is to declare the function as taking the table rowtype:

regression=# create table t1(f1 int, f2 text, f3 int);
CREATE TABLE
regression=# insert into t1 values(42,'z',44);
INSERT 1259013 1
regression=# create function foo(t1) returns int as '
regression'# begin
regression'#   return $1.f1;
regression'# end' language plpgsql;
CREATE FUNCTION
regression=# select foo(t1.*) from t1;
 foo
-----
  42
(1 row)

BTW, if you try

regression=# create function f1(t1) returns int as '
regression'# begin
regression'#   return $1.f1;
regression'# end' language plpgsql;
ERROR:  "f1" is already an attribute of type t1

The reason for this is a historical behavior inherited from PostQUEL:
a function invocation like "f1(t1.*)" can also be written "t1.f1", so
the function name can't conflict with any column name of the table.
However this only applies to single-argument functions, so if you have
other things to pass in besides the row, there's no problem.

> CREATE RULE t1_ins AS ON INSERT TO t1 DO INSTEAD SELECT f1(NEW);
> ERROR:  function f1(t1) does not exist

I am not sure any of this works for NEW or OLD references in rules,
though, because they're not really tables.

            regards, tom lane

Re: Resolution for "ERROR: cannot handle whole-row reference" ?

From
Sean Chittenden
Date:
[snip function bits]

The first part I knew, but the historical behavior mentioned is
interesting... I haven't run into a naming conflict yet, but will
probably change things to preemptively thwart such problems.  This
oddity seems pretty unknown and certainly not something I recall having
read about... what are the odds that this will be changed in the
future?  Low, very low, never, or someday if there's time/effort?

>> CREATE RULE t1_ins AS ON INSERT TO t1 DO INSTEAD SELECT f1(NEW);
>> ERROR:  function f1(t1) does not exist
>
> I am not sure any of this works for NEW or OLD references in rules,
> though, because they're not really tables.

This is my real problem though.  For example, I was doing something
similar to:

-- v1 is a view of the table t1. I've got a few cases now where I've got
-- around 40 columns.
CREATE VIEW v1 AS SELECT c1, c2, c3, c4 FROM t1;
CREATE RULE v1_ins AS ON INSERT TO v1 DO INSTEAD SELECT t1_ins(NEW.c1,
NEW.c2, NEW.c3, NEW.c4);

But then I've run into a few legitimate tables that have more than
32cols and ran into the max function arg limit.  I was able to test and
change t1_ins() to accept a table rowtype, but am now hung because I
can't pass the NEW/OLD pseudorelational as a rowtype.  It's clearly a
complex problem to have the rewrite engine handle this correctly in
that I don't know how the database could resolve the NEW
pseudorelational for an insert into v1 as a table rowtype for t1.  I'm
lucky right now in that v1 is essentially defined as SELECT * FROM t1
so the mapping seems intuitively easy, but that's deceptive.  I would,
however, think that the NEW pseudorelational is structurally identical
to a RECORD type in that a RECORD type has an unknown/dynamic form...
which is ideally suited for my needs, but doesn't work.

Here's my creative - *ugly as sin*, should be shot for doing this,
gagged I was laughing so hard while writing this evilness - workaround.

CREATE RULE v1_ins AS ON INSERT TO v1 DO INSTEAD (
     SELECT check_or_create_tmp_table('t1');
     SELECT t1_ins_1(NEW.c1, NEW.c2, NEW.c3, /* ... */ NEW.c32);
     SELECT t1_ins_2(NEW.c33, NEW.c34, /* ... */ NEW.c42);
     SELECT t1_ins_final());

check_or_create_tmp_table(TEXT) finds the table in $1 and creates a
TEMP table mirror of $1, except the mirror accepts NULL values.  If it
does exist, check_or_create_tmp_table() TRUNCATEs the TEMP TABLE.
Then, t1_ins1_1() dumps all 32 of its args into the temp table and
t1_ins_2() dumps does the same using an unqualified UPDATE (there's
only one row in there anyway) to fill in the missing rows.  Repeat
(#cols in table / FUNC_MAX_ARGS) + 1 times.  Finally, in
t1_ins_final(), I scoop up the row in the temp table and 'INSERT INTO
t1 SELECT * FROM t1_tmp'.

Scrub and repeat for UPDATE and DELETE RULEs.  Evil?  Absolutely.
Inefficient?  Yup.  Does it work?  I'm embarrassed to say, but yes...
sadly it does.  Is it brittle?  You bet!

Ideally I'd rather do CREATE FUNCTION t1_ins(RECORD) or whatever a RULE
pseudorelational NEW/OLD could be passed as, then blindly do CREATE
RULE v1_ins AS ON INSERT TO v1 DO INSTEAD SELECT t1_ins(NEW); and not
have to worry about t1 or v1's structure changing.

Is there a less evil way of doing what I ended up doing for views with
more than FUNC_MAX_ARGS cols?  -sc

--
Sean Chittenden


Re: Resolution for "ERROR: cannot handle whole-row reference" ?

From
Tom Lane
Date:
Sean Chittenden <sean@chittenden.org> writes:
> The first part I knew, but the historical behavior mentioned is
> interesting... I haven't run into a naming conflict yet, but will
> probably change things to preemptively thwart such problems.  This
> oddity seems pretty unknown and certainly not something I recall having
> read about... what are the odds that this will be changed in the
> future?  Low, very low, never, or someday if there's time/effort?

It is documented, see "33.4.2. SQL Functions on Composite Types"
about halfway down this page:
http://www.postgresql.org/docs/7.4/static/xfunc-sql.html#AEN28791

I'm not really inclined to rip it out unless we get complaints, which
AFAIR there have been hardly any of.  One could argue that this is a
useful feature, since it essentially allows you to build columns that
are computed on-the-fly from other columns.  I believe some other DBMSes
tout such things as features ;-)

> It's clearly a complex problem to have the rewrite engine handle this
> correctly in that I don't know how the database could resolve the NEW
> pseudorelational for an insert into v1 as a table rowtype for t1.

Well, it wouldn't; you'd need to declare the function parameter as v1's
rowtype not t1's.  RECORD might be handy as a means of only having to
write one function for several similar problems --- it'd be exactly a
polymorphic-function facility.  But it's not essential.

What we do need is a cleaner way of handling whole-row variables inside
the execution engine.  The present coding is crufty, restrictive, and
leaks memory :-(.

What would also be needed to solve the particular problem you are
hitting is a "row constructor" runtime construct, comparable to the
ARRAY[] construct that Joe Conway created recently for arrays.  Then
the rule rewriter could expand an insert rule's "NEW.*" into a ROW[]
construct with the actual expressions from the rewritten query inside.
The SQL spec has something sort of like this in its VALUES() construct,
but it doesn't allow for associating a particular named type with the
row, which means it's not quite what we need.

            regards, tom lane