Re: Re: [SQL] MAX() of 0 records. - Mailing list pgsql-hackers

From Chris Bitmead
Subject Re: Re: [SQL] MAX() of 0 records.
Date
Msg-id 3965F8A6.3322C001@bitmead.com
Whole thread Raw
In response to Re: [SQL] MAX() of 0 records.  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [BUGS] Re: Re: [SQL] MAX() of 0 records.  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Another observation is that if the WHERE clause is successful, it seems
to update the first record in the target relation that it finds which is
a pretty random result.

pghack=# create table e(ee text, eee integer);
CREATE
pghack=# create table f(ff text, fff integer);
CREATE
pghack=# insert into e values('e', 1);
INSERT 18871 1
pghack=# insert into e values('ee', 2);
INSERT 18872 1
pghack=# insert into e values('eee', 3);
INSERT 18873 1
pghack=# insert into f values('fff', 3);
INSERT 18874 1
pghack=# insert into f values('ff', 2);
INSERT 18875 1
pghack=# insert into f values('f', 1);
INSERT 18876 1
pghack=# update e set eee=min(f.fff) from f;
UPDATE 1
pghack=# select * from e;ee  | eee 
-----+-----ee  |   2eee |   3e   |   1
(3 rows)

pghack=# select min(f.fff) from f;min 
-----  1
(1 row)

pghack=# update e set eee=min(f.fff) from f;
UPDATE 1
pghack=# select min(f.fff) from f;min 
-----  1
(1 row)

pghack=# select * from e;ee  | eee 
-----+-----eee |   3e   |   1ee  |   1
(3 rows)

pghack=# update e set eee=min(f.fff) from f;
UPDATE 1
pghack=# select * from e;ee  | eee 
-----+-----e   |   1ee  |   1eee |   1
(3 rows)




Tom Lane wrote:
> 
> Paul McGarry <paulm@opentec.com.au> writes:
> > CREATE TABLE entry_stats
> > (
> >     entry_id INT4 NOT NULL REFERENCES entry ON DELETE CASCADE,
> >     entry_minprice INT4 NOT NULL DEFAULT 0
> > );
> >
> > CREATE TABLE item(
> >     item_id INT4 PRIMARY KEY,
> >     item_entry_id INT4 NOT NULL REFERENCES entry ON DELETE NO ACTION,
> >     item_price INT4 NOT NULL,
> >     item_live bool NOT NULL DEFAULT 'n'
> > );
> >
> > [trigger using]
> >
> >     UPDATE entry_stats
> >     SET entry_minprice=min(item_price)
> >     FROM item where item_entry_id=NEW.item_entry_id AND item_live='t';
> >
> > ERROR:  ExecutePlan: (junk) `ctid' is NULL!
> 
> Hmm.  There are several things going on here, but one thing that needs
> clarification is whether this UPDATE is written correctly.  Since it
> has no constraint on entry_stats, it seems to me that *every* row of
> entry_stats will have entry_minprice set to the same value, namely
> the minimum item_price over those item rows that satisfy the WHERE
> condition.  Surely that wasn't what you wanted?  Shouldn't there be an
> additional WHERE clause like entry_id = item_entry_id?
> 
> Anyway, the proximate cause of the error message is as follows.
> A cross-table UPDATE like this is actually implemented as if it were
> a SELECT:
>         SELECT entry_stats.ctid, min(item_price)
>         FROM entry_stats, item WHERE ...;
> For each row emitted by this underlying SELECT, the executor takes
> the ctid result column (which identifies the particular target tuple
> in the target table) and updates that tuple by stuffing the additional
> SELECT result column(s) into the specified fields of that tuple.
> 
> Now, if you try a SELECT like the above in a situation where there are
> no tuples matching the WHERE clause, what you get out is a row of all
> NULLs --- because that's what you get from SELECT if there's an
> aggregate function with no GROUP BY and no input rows.  The executor
> gets this dummy row, tries to do a tuple update using it, and chokes
> because the ctid is NULL.  So that explains why the error message is
> what it is.  Next question is what if anything should be done
> differently.  We could just have the executor ignore result rows where
> ctid is NULL, but that seems like patching around the problem not fixing
> it.
> 
> The thing that jumps out at me is that if you actually try the SELECT
> illustrated above, you do not get any row, null or otherwise; you get
> ERROR:  Attribute entry_stats.ctid must be GROUPed or used in an
>         aggregate function
> which is a mighty valid complaint.  If you are aggregating rows to get
> the MIN() then you don't have a unique ctid to deliver, so which row
> ought to be updated?  This is the system's way of expressing the same
> concern I started with: this query doesn't seem to be well-posed.
> 
> You don't see this complaint when you try the UPDATE, because ctid
> is added to the implicit select result in a back-door way that doesn't
> get checked for GROUP-BY validity.  I wonder whether that is the bug.
> If so, we'd basically be saying that no query like this is valid
> (since UPDATE doesn't have a GROUP BY option, there'd be no way to
> pass the grouping check).
> 
> Another way to look at it is that perhaps an UPDATE involving aggregate
> functions ought to be implicitly treated as GROUP BY targetTable.ctid.
> In other words, the MIN() or other aggregate function is implicitly
> evaluated over only those join tuples that are formed for a single
> target tuple.  Intuitively that seems to make sense, and it solves the
> problem you're complaining of, because no matching tuples = no groups =
> no result tuples = update does nothing = no problem.  But I have a
> sneaking suspicion that I'm missing some nasty problem with this idea
> too.
> 
> Comments anyone?  What exactly *should* be the behavior of an UPDATE
> that uses an aggregate function and a join to another table?  Over what
> set of tuples should the aggregate be evaluated?
> 
>                         regards, tom lane


pgsql-hackers by date:

Previous
From: Thomas Lockhart
Date:
Subject: Patches coming...
Next
From: Tom Lane
Date:
Subject: Re: Re: [SQL] Re: [GENERAL] lztext and compression ratios...