Thread: WIP: CREATE TABLE AS / WITH DATA

WIP: CREATE TABLE AS / WITH DATA

From
Neil Conway
Date:
This patch makes CREATE TABLE AS conform more closely to SQL:2003 by
adding support for the WITH [ NO ] DATA clause (per section 11.3). The
standard says that this clause is mandatory, but I think it should be
optional in PG (partly for backward compatibility, and partly because I
think that is sane behavior). I also added regression tests and updated
the docs. Gavin Sherry helped me solve a parser problem (elaborated on
below) -- much thanks, Gavin.

This is 8.1 material so this patch won't be applied any time soon, but I
thought I'd post it here to get some feedback. Issues/notes on the
patch:

(1) The standard specifies that WITH [ NO ] DATA should follow the
subquery in CREATE TABLE AS. This causes problems for bison: if we make
the modifications to the grammar naively, it results in about 16
shift/reduce conflicts because the "WITH" keyword can appear in various
places in SelectStmt (e.g. WITH TIME ZONE). Gavin fixed that by doing
some manual lookahead in parser.c, like was previously being done for
UNION JOIN. Is this the best solution?

(2) I've modified the parser to transform the SELECT query to have a
"LIMIT 0" clause if WITH NO DATA is specified. This is ugly and
inefficient, but it was the cleanest way I could see to implement it --
we represent CREATE TABLE AS internally as a SelectStmt, and it seemed
cleaner to do this than to add more fields to SelectStmt. Is there a
cleaner way to do this? Perhaps the better solution is to make CREATE
TABLE AS its own Node (as I think has been raised in the past).

(3) A related question is: if the subquery's SelectStmt already has a
LIMIT, should I pfree() the limitCount field before overwriting it? My
guess was that parse trees are sometimes constructed in sufficiently
long-lived memory contexts that the pfree() is wise, but I'm not sure if
that's correct.

(Speaking of which, do we have any docs about when it's necessary to
pfree() each allocation and when it's not? IMHO this can sometimes be
difficult to know...)

(4) I haven't implemented support for WITH [ NO ] DATA in CREATE TABLE
AS / EXECUTE (which is depressingly implemented completely separately
from CREATE TABLE AS / SELECT). I think if we restructure CREATE TABLE
AS to be its own Node, we can clean this up as well.

(5) I called the DATA keyword "DATA_P" because I guessed Win32 might
have issues with it otherwise (see OBJECT_P and so on). I'll double
check whether that's necessary before committing anything -- if anyone
happens to know, please speak up.

Comments?

-Neil

P.S. I'm beginning to think that rather than applying this patch as-is
when we branch for 8.1, it might be a better idea to just bite the
bullet and restructure CREATE TABLE AS as suggested above. Thoughts?

Attachment

Re: WIP: CREATE TABLE AS / WITH DATA

From
Alvaro Herrera
Date:
On Wed, Sep 22, 2004 at 06:48:12PM +1000, Neil Conway wrote:

> P.S. I'm beginning to think that rather than applying this patch as-is
> when we branch for 8.1, it might be a better idea to just bite the
> bullet and restructure CREATE TABLE AS as suggested above. Thoughts?

Could that include supporting SELECT INTO as well as both types of
CREATE TABLE AS?  I remember being annoyed by some limitation of one of
the forms ...

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Hoy es el primer día del resto de mi vida"


Re: WIP: CREATE TABLE AS / WITH DATA

From
Neil Conway
Date:
On Wed, 2004-09-22 at 23:00, Alvaro Herrera wrote:
> Could that include supporting SELECT INTO as well as both types of
> CREATE TABLE AS?

Right; my thinking is to have the parser construct SELECT INTO as a
CreateTableAsStmt. That way all the code for creating the "into"
relation is centralized in one place.

-Neil



Re: WIP: CREATE TABLE AS / WITH DATA

From
Tom Lane
Date:
Neil Conway <neilc@samurai.com> writes:
> On Wed, 2004-09-22 at 23:00, Alvaro Herrera wrote:
>> Could that include supporting SELECT INTO as well as both types of
>> CREATE TABLE AS?

> Right; my thinking is to have the parser construct SELECT INTO as a
> CreateTableAsStmt. That way all the code for creating the "into"
> relation is centralized in one place.

Another thing that would be really nice would be to get rid of all the
warty special cases for SELECT INTO in executor/execMain.c.  I have
thought about handling this stuff in a new tuple receiver type (cf
tcop/dest.h, dest.c) but haven't really pursued it.  I also have some
old notes suggesting that we should merge this into INSERT/SELECT,
but that stuff has enough warts of its own that I'm not sure it's a
good path to pursue.

            regards, tom lane

Re: WIP: CREATE TABLE AS / WITH DATA

From
Gavin Sherry
Date:

On Thu, 23 Sep 2004, Tom Lane wrote:

> Neil Conway <neilc@samurai.com> writes:
> > On Wed, 2004-09-22 at 23:00, Alvaro Herrera wrote:
> >> Could that include supporting SELECT INTO as well as both types of
> >> CREATE TABLE AS?
>
> > Right; my thinking is to have the parser construct SELECT INTO as a
> > CreateTableAsStmt. That way all the code for creating the "into"
> > relation is centralized in one place.
>
> Another thing that would be really nice would be to get rid of all the
> warty special cases for SELECT INTO in executor/execMain.c.  I have
> thought about handling this stuff in a new tuple receiver type (cf
> tcop/dest.h, dest.c) but haven't really pursued it.  I also have some

That's a pretty good idea. I was also thinking about this and wondering
about the optimisations we could make. Skip WAL and just fsync() the file
at the end (or dump whole blocks into WAL if we're archiving). Its a bit
of a niche case though.

Gavin

Re: WIP: CREATE TABLE AS / WITH DATA

From
Neil Conway
Date:
On Wed, 2004-09-22 at 23:00, Alvaro Herrera wrote:
> Could that include supporting SELECT INTO as well as both types of
> CREATE TABLE AS?  I remember being annoyed by some limitation of one of
> the forms ...

I'm working on implementing this (and Tom's suggestion of creating a new
DestReceiver), and I should have something worth sending to -patches in
a few days.

One note on the implementation: I had originally wanted to distinguish
between SELECT and SELECT INTO in the parser, so that the SelectStmt
parse node could be kept entirely free of SELECT INTO-related detritus.
I wasn't able to figure out how to do this, though: we want to allow
only the left-most SELECT in a set operation tree to contain an INTO
clause. For now, I've settled for keeping an "into" field in SelectStmt,
and transforming SelectStmt -> CreateTableAsStmt in parser/analyze.c if
appropriate. That means you could send a SelectStmt into the analysis
phase and get a utility statement back, but I don't see that as a
problem. Can anyone see a cleaner way to implement this?

-Neil