Re: ALTER TABLE ADD COLUMN fast default - Mailing list pgsql-hackers

From Andres Freund
Subject Re: ALTER TABLE ADD COLUMN fast default
Date
Msg-id 20180329183050.u5trol4gtgtqe77h@alap3.anarazel.de
Whole thread Raw
In response to Re: ALTER TABLE ADD COLUMN fast default  (Andrew Dunstan <andrew.dunstan@2ndquadrant.com>)
Responses Re: ALTER TABLE ADD COLUMN fast default  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: ALTER TABLE ADD COLUMN fast default  (Andrew Dunstan <andrew.dunstan@2ndquadrant.com>)
List pgsql-hackers
Hi,

On 2018-03-29 10:16:23 +1030, Andrew Dunstan wrote:
> On Wed, Mar 28, 2018 at 5:30 PM, Andres Freund <andres@anarazel.de> wrote:
> > Hi,
> >
> > On 2018-03-26 09:02:10 +1030, Andrew Dunstan wrote:
> >> Thanks for this, all looks good. Here is the consolidate patch
> >> rebased. If there are no further comments I propose to commit this in
> >> a few days time.
> >
> > Here's a bit of post commit review:
> >
> > @@ -1310,13 +1679,11 @@ slot_getsomeattrs(TupleTableSlot *slot, int attnum)
> >
> >         /*
> >          * If tuple doesn't have all the atts indicated by tupleDesc, read the
> > -        * rest as null
> > +        * rest as NULLs or missing values
> >          */
> > -       for (; attno < attnum; attno++)
> > -       {
> > -               slot->tts_values[attno] = (Datum) 0;
> > -               slot->tts_isnull[attno] = true;
> > -       }
> > +       if (attno < attnum)
> > +               slot_getmissingattrs(slot, attno, attnum);
> > +
> >         slot->tts_nvalid = attnum;
> >  }
> >
> > It's worthwhile to note that this'll re-process *all* missing values,
> > even if they've already been deformed. I.e. if
> > slot_getmissingattrs(.., first-missing)
> > slot_getmissingattrs(.., first-missing + 1)
> > slot_getmissingattrs(.., first-missing + 2)
> > is called, all three missing values will be copied every time. That's
> > because tts_nvalid isn't taken into account.  I wonder if slot_getmissingattrs
> > could take tts_nvalid into account?
> >
> > I also wonder if this doesn't deserve an unlikely(), to avoid the cost
> > of spilling registers in the hot branch of not missing any values.

> One of us at least is very confused about this function.
> slot_getmissingattrs() gets called at most once by slot_getsomeattrs
> and never for any attribute that's already been deformed.

Imagine the same slot being used in two different expressions. The
typical case for that is e.g. something like
  SELECT a,b,c,d FROM foo WHERE b = 1;

this'll trigger one slot_getsomeattrs(slot, 2) call from within qual
evaluation, and then a slot_getsomeattrs(slot, 4) from within the
projection code.  If you then imagine a tuple where only the first
column exists physically, we'd copy b twice, because attno is only going
to be one 1.  I think we might just want to check tts nvalid in
getmissingattrs?


> > I'm still strongly object to do doing this unconditionally for queries
> > that never need any of this.  We're can end up with a significant number
> > of large tuples in memory here, and then copy that through dozens of
> > tupledesc copies in queries.

> We're just doing the same thing we do for default values.

That's really not a defense. It's hurting us there too.


> None of the tests I did with large numbers of missing values seemed to
> show performance impacts of the kind you describe. Now, none of the
> queries were particularly complex, but the worst case was from
> actually using very large numbers of attributes with missing values,
> where we'd need this data anyway. If we just selected a few attributes
> performance went up rather than down.

Now imagine a partitioned workload with a few thousand partitions over a
few tables. The additional memory is going to start being noticeable.


> pg_attrdef isn't really a good place for it (what if they drop the
> default?). So the only alternative then would be a completely new
> catalog. I'd need a deal of convincing that that was justified.

There's plenty databases with pg_attribute being many gigabytes large,
and this is going to make that even worse.

Greetings,

Andres Freund


pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Proposal: http2 wire format
Next
From: Bruce Momjian
Date:
Subject: Re: Typo in src/backend/access/hash/README