Thread: Re: [HACKERS] 8.2 features?

Re: [HACKERS] 8.2 features?

From
Joe Conway
Date:
Joe Conway wrote:
>>>
>>>>> . multiple values clauses for INSERT
>
> The best way might be to fabricate a selectStmt equiv to
> "SELECT <targetlist> UNION ALL SELECT <targetlist>...",
> but that still feels like a hack.

Here is a patch pursuant to my earlier post. It has the advantage of
being fairly simple and noninvasive.

The major downside is that somewhere between 9000 and 10000
VALUES-targetlists produces "ERROR:  stack depth limit exceeded".
Perhaps for the typical use-case this is sufficient though.

I'm open to better ideas, comments, objections...

Thanks,

Joe
Index: src/backend/parser/gram.y
===================================================================
RCS file: /cvsroot/pgsql/src/backend/parser/gram.y,v
retrieving revision 2.551
diff -c -r2.551 gram.y
*** src/backend/parser/gram.y    3 Jul 2006 22:45:39 -0000    2.551
--- src/backend/parser/gram.y    18 Jul 2006 04:19:45 -0000
***************
*** 238,251 ****
                  qualified_name_list any_name any_name_list
                  any_operator expr_list attrs
                  target_list update_target_list insert_column_list
!                 insert_target_list def_list indirection opt_indirection
!                 group_clause TriggerFuncArgs select_limit
!                 opt_select_limit opclass_item_list
!                 transaction_mode_list_or_empty
                  TableFuncElementList
                  prep_type_clause prep_type_list
                  execute_param_clause using_clause

  %type <range>    into_clause OptTempTableName

  %type <defelt>    createfunc_opt_item common_func_opt_item
--- 238,253 ----
                  qualified_name_list any_name any_name_list
                  any_operator expr_list attrs
                  target_list update_target_list insert_column_list
!                 insert_target_els
!                 def_list indirection opt_indirection group_clause
!                 TriggerFuncArgs select_limit opt_select_limit
!                 opclass_item_list transaction_mode_list_or_empty
                  TableFuncElementList
                  prep_type_clause prep_type_list
                  execute_param_clause using_clause

+ %type <node>    insert_target_list insert_target_lists
+
  %type <range>    into_clause OptTempTableName

  %type <defelt>    createfunc_opt_item common_func_opt_item
***************
*** 5349,5360 ****
          ;

  insert_rest:
!             VALUES '(' insert_target_list ')'
                  {
                      $$ = makeNode(InsertStmt);
                      $$->cols = NIL;
!                     $$->targetList = $3;
!                     $$->selectStmt = NULL;
                  }
              | DEFAULT VALUES
                  {
--- 5351,5370 ----
          ;

  insert_rest:
!             VALUES insert_target_lists
                  {
                      $$ = makeNode(InsertStmt);
                      $$->cols = NIL;
!                     if (((SelectStmt *) $2)->op == SETOP_UNION)
!                     {
!                         $$->targetList = NIL;
!                         $$->selectStmt = $2;
!                     }
!                     else
!                     {
!                         $$->targetList = ((SelectStmt *) $2)->targetList;
!                         $$->selectStmt = NULL;
!                     }
                  }
              | DEFAULT VALUES
                  {
***************
*** 5370,5381 ****
                      $$->targetList = NIL;
                      $$->selectStmt = $1;
                  }
!             | '(' insert_column_list ')' VALUES '(' insert_target_list ')'
                  {
                      $$ = makeNode(InsertStmt);
                      $$->cols = $2;
!                     $$->targetList = $6;
!                     $$->selectStmt = NULL;
                  }
              | '(' insert_column_list ')' SelectStmt
                  {
--- 5380,5399 ----
                      $$->targetList = NIL;
                      $$->selectStmt = $1;
                  }
!             | '(' insert_column_list ')' VALUES insert_target_lists
                  {
                      $$ = makeNode(InsertStmt);
                      $$->cols = $2;
!                     if (((SelectStmt *) $5)->op == SETOP_UNION)
!                     {
!                         $$->targetList = NIL;
!                         $$->selectStmt = $5;
!                     }
!                     else
!                     {
!                         $$->targetList = ((SelectStmt *) $5)->targetList;
!                         $$->selectStmt = NULL;
!                     }
                  }
              | '(' insert_column_list ')' SelectStmt
                  {
***************
*** 8189,8197 ****

          ;

  insert_target_list:
!             insert_target_el                        { $$ = list_make1($1); }
!             | insert_target_list ',' insert_target_el { $$ = lappend($1, $3); }
          ;

  insert_target_el:
--- 8207,8235 ----

          ;

+ insert_target_lists:
+             insert_target_list
+                 {
+                     $$ = $1;
+                 }
+             | insert_target_lists ',' insert_target_list
+                 {
+                     $$ = makeSetOp(SETOP_UNION, TRUE, $1, $3);
+                 }
+         ;
+
  insert_target_list:
!             '(' insert_target_els ')'
!                 {
!                     SelectStmt *n = makeNode(SelectStmt);
!                     n->targetList = $2;
!                     $$ = (Node *) n;
!                 }
!         ;
!
! insert_target_els:
!             insert_target_el                         { $$ = list_make1($1); }
!             | insert_target_els ',' insert_target_el { $$ = lappend($1, $3); }
          ;

  insert_target_el:

Re: [HACKERS] 8.2 features?

From
Christopher Kings-Lynne
Date:
> The major downside is that somewhere between 9000 and 10000
> VALUES-targetlists produces "ERROR:  stack depth limit exceeded".
> Perhaps for the typical use-case this is sufficient though.
>
> I'm open to better ideas, comments, objections...

If the use case is people running MySQL dumps, then there will be
millions of values-targetlists in MySQL dumps.

Chris

Re: [HACKERS] 8.2 features?

From
Joe Conway
Date:
Andrew Dunstan wrote:
> Christopher Kings-Lynne wrote:
>
>>> The major downside is that somewhere between 9000 and 10000
>>> VALUES-targetlists produces "ERROR:  stack depth limit exceeded".
>>> Perhaps for the typical use-case this is sufficient though.
>>>
>>> I'm open to better ideas, comments, objections...
>>
>> If the use case is people running MySQL dumps, then there will be
>> millions of values-targetlists in MySQL dumps.
>
> Yeah.  The fabricated select hack does feel wrong to me. Taking a quick
> 2 minute look at the grammar it looks like a better bet would be to make
> InsertStmt.targetList a list of lists of values rather than just a list
> of values. Of course, that would make the changes more invasive. Even
> with that we'd still be reading the whole thing into memory ... is there
> a sane way to cache the inline data before statement execution?

I started down the path of making InsertStmt.targetList a list of
targetlists. The problem is finding a reasonable way to make that
available to the executor. Back to the drawing board I guess.

I have similar concerns with the millions of values-targetlists comment
that Chris made. But I don't see how we can cache the data easily short
of inventing a List alternative that spills to disk.

> I guess we can just say that for true bulk load our supported mechanism
> is still just COPY, but it would be a pity to restrict a feature that is
> in the standard that way.

True

Joe

Re: [HACKERS] 8.2 features?

From
Andrew Dunstan
Date:
Christopher Kings-Lynne wrote:

>> The major downside is that somewhere between 9000 and 10000
>> VALUES-targetlists produces "ERROR:  stack depth limit exceeded".
>> Perhaps for the typical use-case this is sufficient though.
>>
>> I'm open to better ideas, comments, objections...
>
>
> If the use case is people running MySQL dumps, then there will be
> millions of values-targetlists in MySQL dumps.
>
>

Yeah.  The fabricated select hack does feel wrong to me. Taking a quick
2 minute look at the grammar it looks like a better bet would be to make
InsertStmt.targetList a list of lists of values rather than just a list
of values. Of course, that would make the changes more invasive. Even
with that we'd still be reading the whole thing into memory ... is there
a sane way to cache the inline data before statement execution?

I guess we can just say that for true bulk load our supported mechanism
is still just COPY, but it would be a pity to restrict a feature that is
in the standard that way.

cheers

andrew


Re: [HACKERS] 8.2 features?

From
Tom Lane
Date:
>> If the use case is people running MySQL dumps, then there will be
>> millions of values-targetlists in MySQL dumps.

I did some experimentation just now, and could not get mysql to accept a
command longer than about 1 million bytes.  It complains about
    Got a packet bigger than 'max_allowed_packet' bytes
which seems a bit odd because max_allowed_packet is allegedly set to
16 million, but anyway I don't think people are going to be loading any
million-row tables using single INSERT commands in mysql either.

            regards, tom lane

Re: [HACKERS] 8.2 features?

From
Chris Browne
Date:
chris.kings-lynne@calorieking.com (Christopher Kings-Lynne) writes:
>> The major downside is that somewhere between 9000 and 10000
>> VALUES-targetlists produces "ERROR:  stack depth limit
>> exceeded". Perhaps for the typical use-case this is sufficient
>> though.
>> I'm open to better ideas, comments, objections...
>
> If the use case is people running MySQL dumps, then there will be
> millions of values-targetlists in MySQL dumps.

Curiosity: How do *does* TheirSQL parse that, and not have the One
Gigantic Query blow up their query parser?
--
output = reverse("gro.gultn" "@" "enworbbc")
http://www.ntlug.org/~cbbrowne/unix.html
JOHN CAGE (strapped to table): Do you really expect me to conduct this
 antiquated tonal system?
LEONARD BERNSTEIN: No, Mr. Cage, I expect  you to die!
[With apologies to music and James Bond fans the world over...]

Re: [HACKERS] 8.2 features?

From
Andrew Dunstan
Date:
Chris Browne wrote:

>chris.kings-lynne@calorieking.com (Christopher Kings-Lynne) writes:
>
>
>>>The major downside is that somewhere between 9000 and 10000
>>>VALUES-targetlists produces "ERROR:  stack depth limit
>>>exceeded". Perhaps for the typical use-case this is sufficient
>>>though.
>>>I'm open to better ideas, comments, objections...
>>>
>>>
>>If the use case is people running MySQL dumps, then there will be
>>millions of values-targetlists in MySQL dumps.
>>
>>
>
>Curiosity: How do *does* TheirSQL parse that, and not have the One
>Gigantic Query blow up their query parser?
>
>

Experimentation shows that mysqldump breaks up the insert into chunks.

Example with 10m rows:

[ad@wired-219 ~]# perl -e 'print "drop table if exists foo; create table
foo (x int);\n"; foreach my $i (0..9_9999) { print "insert into foo
values \n"; foreach my $j (0..99) { print "," if $j; print
"(",100*$i+$j+1,")"; } print ";\n"; } ' > gggggg
[ad@wired-219 ~]# mysql test < gggggg
[ad@wired-219 ~]# mysqldump test foo > aaaaaa
[ad@wired-219 ~]# mysql test < aaaaaa
[ad@wired-219 ~]# grep INSERT aaaaaa | wc -l
104


cheers

andrew





Re: [HACKERS] 8.2 features?

From
Christopher Kings-Lynne
Date:
> I did some experimentation just now, and could not get mysql to accept a
> command longer than about 1 million bytes.  It complains about
>     Got a packet bigger than 'max_allowed_packet' bytes
> which seems a bit odd because max_allowed_packet is allegedly set to
> 16 million, but anyway I don't think people are going to be loading any
> million-row tables using single INSERT commands in mysql either.

Strange.  Last time I checked I thought MySQL dump used 'multivalue
lists in inserts' for dumps, for the same reason that we use COPY


Re: [HACKERS] 8.2 features?

From
Christopher Kings-Lynne
Date:
> I did some experimentation just now, and could not get mysql to accept a
> command longer than about 1 million bytes.  It complains about
>     Got a packet bigger than 'max_allowed_packet' bytes
> which seems a bit odd because max_allowed_packet is allegedly set to
> 16 million, but anyway I don't think people are going to be loading any
> million-row tables using single INSERT commands in mysql either.

Ah no, I'm mistaken.  It's not by default in mysqldump, but it does seem
"recommended".  This is from "man mysqldump":

        -e|--extended-insert
               Allows utilization of the new, much faster INSERT syntax.


Re: [HACKERS] 8.2 features?

From
Tom Lane
Date:
Christopher Kings-Lynne <chris.kings-lynne@calorieking.com> writes:
> Strange.  Last time I checked I thought MySQL dump used 'multivalue
> lists in inserts' for dumps, for the same reason that we use COPY

I think Andrew identified the critical point upthread: they don't try
to put an unlimited number of rows into one INSERT, only a megabyte
or so's worth.  Typical klugy-but-effective mysql design approach ...

            regards, tom lane

Re: [HACKERS] 8.2 features?

From
Joe Conway
Date:
Tom Lane wrote:
> Christopher Kings-Lynne <chris.kings-lynne@calorieking.com> writes:
>
>>Strange.  Last time I checked I thought MySQL dump used 'multivalue
>>lists in inserts' for dumps, for the same reason that we use COPY
>
> I think Andrew identified the critical point upthread: they don't try
> to put an unlimited number of rows into one INSERT, only a megabyte
> or so's worth.  Typical klugy-but-effective mysql design approach ...


OK, so given that we don't need to be able to do 1 million
multi-targetlist insert statements, here is rev 2 of the patch.

It is just slightly more invasive, but performs *much* better. In fact,
it can handle as many targetlists as you have memory to deal with. It
also deals with DEFAULT values in the targetlist.

I've attached a php script that I used to do crude testing. Basically I
tested 3 cases in this order:

single-INSERT-multi-statement:
------------------------------
   "INSERT INTO foo2a (f1,f2) VALUES (1,2);"
   -- repeat statement $loopcount times

single-INSERT-at-once:
----------------------
   "INSERT INTO foo2b (f1,f2) VALUES (1,2);INSERT INTO foo2a (f1,f2)
   VALUES (1,2);INSERT INTO foo2a (f1,f2) VALUES (1,2)..."
   -- build a single SQL string by looping $loopcount times,
   -- and execute it all at once

multi-INSERT-at-once:
---------------------
   "INSERT INTO foo2c (f1,f2) VALUES (1,2),(1,2),(1,2)..."
   -- build a single SQL string by looping $loopcount times,
   -- and execute it all at once

Here are the results:
$loopcount = 100000;
single-INSERT-multi-statement Elapsed time is 34 seconds
single-INSERT-at-once Elapsed time is 7 seconds
multi-INSERT-at-once Elapsed time is 4 seconds
about 370MB peak memory usage

$loopcount = 200000;
single-INSERT-multi-statement Elapsed time is 67 seconds
single-INSERT-at-once Elapsed time is 12 seconds
multi-INSERT-at-once Elapsed time is 9 seconds
about 750MB peak memory usage

$loopcount = 300000;
single-INSERT-multi-statement Elapsed time is 101 seconds
single-INSERT-at-once Elapsed time is 18 seconds
multi-INSERT-at-once Elapsed time is 13 seconds
about 1.1GB  peak memory usage

Somewhere beyond this, my machine goes into swap hell, and I didn't have
the patience to wait for it to complete :-)

It would be interesting to see a side-by-side comparison with MySQL
since that seems to be our benchmark on this feature. I'll try to do
that tomorrow if no one beats me to it.

There is only one downside to the current approach that I'm aware of.
The command-result tag is only set by the "original" query, meaning that
even if you insert 300,000 rows using this method, the command-result
tag looks like "INSERT 0 1"; e.g.:

regression=# create table foo2(f1 int default 42,f2 int default 6);
CREATE TABLE
regression=# insert into foo2 (f1,f2) values
(default,12),(default,10),(115,21);
INSERT 0 1
regression=# select * from foo2;
  f1  | f2
-----+----
   42 | 12
   42 | 10
  115 | 21
(3 rows)

Any thoughts on how to fix that?

Thanks,

Joe


Index: src/backend/parser/analyze.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/parser/analyze.c,v
retrieving revision 1.340
diff -c -r1.340 analyze.c
*** src/backend/parser/analyze.c    14 Jul 2006 14:52:21 -0000    1.340
--- src/backend/parser/analyze.c    19 Jul 2006 03:53:35 -0000
***************
*** 657,667 ****
      }
      else
      {
          /*
           * For INSERT ... VALUES, transform the given list of values to form a
!          * targetlist for the INSERT.
           */
!         qry->targetList = transformTargetList(pstate, stmt->targetList);
      }

      /*
--- 657,699 ----
      }
      else
      {
+         ListCell   *tlr;
+
          /*
           * For INSERT ... VALUES, transform the given list of values to form a
!          * targetlist for the INSERT. In a multi-targetlist INSERT, append all
!          * but the first targetlist to extras_after to be processed later by
!          * do_parse_analyze
           */
!         qry->targetList = NIL;
!         foreach(tlr, stmt->targetList)
!         {
!             List *tgtlist = (List *) lfirst(tlr);
!
!             if (qry->targetList == NIL)
!             {
!                 /* transform the first targetlist */
!                 qry->targetList = transformTargetList(pstate, tgtlist);
!             }
!             else
!             {
!                 /*
!                  * Create an InsertStmt node for each additional targetlist
!                  * and append to extras_after
!                  */
!                 InsertStmt *insnode = makeNode(InsertStmt);
!
!                 insnode->cols = NIL;
!                 insnode->targetList = list_make1(tgtlist);
!                 insnode->selectStmt = NULL;
!                 insnode->relation = stmt->relation;
!
!                 if (*extras_after == NIL)
!                     *extras_after = list_make1(insnode);
!                 else
!                     *extras_after = lappend(*extras_after, insnode);
!             }
!         }
      }

      /*
Index: src/backend/parser/gram.y
===================================================================
RCS file: /cvsroot/pgsql/src/backend/parser/gram.y,v
retrieving revision 2.551
diff -c -r2.551 gram.y
*** src/backend/parser/gram.y    3 Jul 2006 22:45:39 -0000    2.551
--- src/backend/parser/gram.y    19 Jul 2006 03:53:40 -0000
***************
*** 238,247 ****
                  qualified_name_list any_name any_name_list
                  any_operator expr_list attrs
                  target_list update_target_list insert_column_list
!                 insert_target_list def_list indirection opt_indirection
!                 group_clause TriggerFuncArgs select_limit
!                 opt_select_limit opclass_item_list
!                 transaction_mode_list_or_empty
                  TableFuncElementList
                  prep_type_clause prep_type_list
                  execute_param_clause using_clause
--- 238,247 ----
                  qualified_name_list any_name any_name_list
                  any_operator expr_list attrs
                  target_list update_target_list insert_column_list
!                 insert_target_els insert_target_list insert_target_lists
!                 def_list indirection opt_indirection group_clause
!                 TriggerFuncArgs select_limit opt_select_limit
!                 opclass_item_list transaction_mode_list_or_empty
                  TableFuncElementList
                  prep_type_clause prep_type_list
                  execute_param_clause using_clause
***************
*** 5349,5359 ****
          ;

  insert_rest:
!             VALUES '(' insert_target_list ')'
                  {
                      $$ = makeNode(InsertStmt);
                      $$->cols = NIL;
!                     $$->targetList = $3;
                      $$->selectStmt = NULL;
                  }
              | DEFAULT VALUES
--- 5349,5359 ----
          ;

  insert_rest:
!             VALUES insert_target_lists
                  {
                      $$ = makeNode(InsertStmt);
                      $$->cols = NIL;
!                     $$->targetList = $2;
                      $$->selectStmt = NULL;
                  }
              | DEFAULT VALUES
***************
*** 5370,5380 ****
                      $$->targetList = NIL;
                      $$->selectStmt = $1;
                  }
!             | '(' insert_column_list ')' VALUES '(' insert_target_list ')'
                  {
                      $$ = makeNode(InsertStmt);
                      $$->cols = $2;
!                     $$->targetList = $6;
                      $$->selectStmt = NULL;
                  }
              | '(' insert_column_list ')' SelectStmt
--- 5370,5380 ----
                      $$->targetList = NIL;
                      $$->selectStmt = $1;
                  }
!             | '(' insert_column_list ')' VALUES insert_target_lists
                  {
                      $$ = makeNode(InsertStmt);
                      $$->cols = $2;
!                     $$->targetList = $5;
                      $$->selectStmt = NULL;
                  }
              | '(' insert_column_list ')' SelectStmt
***************
*** 8189,8197 ****

          ;

  insert_target_list:
!             insert_target_el                        { $$ = list_make1($1); }
!             | insert_target_list ',' insert_target_el { $$ = lappend($1, $3); }
          ;

  insert_target_el:
--- 8189,8215 ----

          ;

+ insert_target_lists:
+             insert_target_list
+                 {
+                     $$ = list_make1($1);
+                 }
+             | insert_target_lists ',' insert_target_list
+                 {
+                     $$ = lappend($1, $3);
+                 }
+         ;
+
  insert_target_list:
!             '(' insert_target_els ')'
!                 {
!                     $$ = $2;
!                 }
!         ;
!
! insert_target_els:
!             insert_target_el                         { $$ = list_make1($1); }
!             | insert_target_els ',' insert_target_el { $$ = lappend($1, $3); }
          ;

  insert_target_el:

Attachment
Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>
>>I'm liking this too. But when you say "jointree node", are you saying to
>>model the new node type after NestLoop/MergeJoin/HashJoin nodes? These
>>are referred to as "join nodes" in ExecInitNode. Or as you mentioned a
>>couple of times, should this look more like an Append node?
>
>
> No, I guess I confused you by talking about the executor representation
> at the same time.  This is really unrelated to the executor.  The join
> tree I'm thinking of here is the data structure that dangles off
> Query.jointree --- it's a representation of the query's FROM clause,
> and (at present) can contain RangeTblRef, FromExpr, and JoinExpr nodes.
> See the last hundred or so lines of primnodes.h for some details.
> The jointree is used by the planner to compute the plan node tree that
> the executor will run, but it's not the same thing.
>
> There are basically two ways you could go about this:
> 1. Make a new jointree leaf node type to represent a VALUES construct,
>    and dangle the list of lists of expressions off that.
> 2. Make a new RangeTblEntry type to represent a VALUES construct, and
>    just put a RangeTblRef to it into the jointree.  The expressions
>    dangle off the RangeTblEntry.
>
> Offhand I'm not certain which of these would be cleanest.  The second
> way has some similarities to the way we handle set operation trees
> (UNION et al), so it might be worth looking at that stuff.  However,
> being a RangeTblEntry has a lot of baggage (eg, various routines expect
> to find an RTE alias, column names, column types, etc) and maybe we
> don't need all that for VALUES.

Since the feature freeze is only about a week off, I wanted to post this
patch even though it is not yet ready to be applied.

Executive summary:
==================
1. The patch is now large and invasive based on adding new node
    types and associated infrastructure. I modelled the nodes largely
    on RangeFunction and FunctionScan.
2. Performance is close enough to mysql to not be a big issue (I think,
    more data below) as long as the machine does not get into a memory
    swapping regime. Memory usage is now better, but not as good as
    mysql.
3. I specifically coded with the intent of preserving current insert
    statement behavior and code paths for current functionality. So there
    *should* be no performance degradation or subtle semantics changes
    for "INSERT DEFAULT VALUES", "INSERT ... VALUES (with one target
    list)", "INSERT ... SELECT ...". Even Tom's recently discovered
    "insert into foo values (tenk1.*)" still works ;-)

Performance:
============
On my development machine (dual core amd64, 2GB RAM) I get the following
results using the php script posted earlier:

Postgres:
---------
$loopcount = 100000;
multi-INSERT-at-once Elapsed time is 1 second

$loopcount = 300000;
multi-INSERT-at-once Elapsed time is 5 seconds

$loopcount = 500000;
multi-INSERT-at-once Elapsed time is 9 seconds

$loopcount = 800000;
multi-INSERT-at-once Elapsed time is 14 seconds

$loopcount = 900000;
multi-INSERT-at-once Elapsed time is 17 seconds

$loopcount = 1000000;
multi-INSERT-at-once Elapsed time is 42 seconds

$loopcount = 2000000;
killed after 5 minutes due to swapping

MySQL:
------
$loopcount = 100000;
multi-INSERT-at-once Elapsed time is 2 seconds

$loopcount = 300000;
INSERT failed:Got a packet bigger than 'max_allowed_packet' bytes
changed max_allowed_packet=64M
multi-INSERT-at-once Elapsed time is 5 seconds

$loopcount = 500000;
multi-INSERT-at-once Elapsed time is 8 seconds

$loopcount = 800000;
multi-INSERT-at-once Elapsed time is 13 seconds

$loopcount = 900000;
multi-INSERT-at-once Elapsed time is 15 seconds

$loopcount = 1000000;
multi-INSERT-at-once Elapsed time is 17 seconds

$loopcount = 2000000;
multi-INSERT-at-once Elapsed time is 36 seconds

$loopcount = 3000000;
multi-INSERT-at-once Elapsed time is 54 seconds

$loopcount = 4000000;
multi-INSERT-at-once Elapsed time is 134 seconds

<table value constructor>:
==========================
Included in this patch is support for <table value constructor> in the
FROM clause, e.g.:

regression=# select * from {values (1,array[1,2]),(2,array[3,4])};
  ?column? | array
----------+-------
         1 | {1,2}
         2 | {3,4}
(2 rows)

The strange syntax is a temporary hack to eliminate shift/reduce
conflicts. I'm not entirely sure we want to try to support this (or
something like it) for 8.2, but much of what is needed is now readily
available. More on known issues next.

Known Issues:
=============

General:
--------
1. Several comments in the patch are marked "FIXME". These are areas
    where I was uncertain what was the "right thing to do". Any advice
    on these specific spots would be very much appreciated.
2. I broke the rules regression test -- still need to look at what I
    did to mess that up. Somewhere in the reconstruction of "VALUES ..."
    according to the diff.

VALUES multi-targetlist INSERTS:
--------------------------------
3. Not yet quite sure how to get DEFAULT to work for "INSERT ...
    multi-values". As noted above, works fine if there is only
    one targetlist.

<table value constructor>:
--------------------------
4. I'm getting shift/reduce conflicts that are not easily eliminated.
    Making VALUES fully reserved only made it 1 shift/reduce conflict.
5. Column aliases are still not working correctly. Haven't really looked
    closely at this yet.
6. Data types are being deduced currently based on the first row,
    and not currently getting checked on subsequent rows. So it is
    easy to induce a crash:

regression=# select * from {values (1,array[1,2]),(2,3)};
server closed the connection unexpectedly
         This probably means the server terminated abnormally
         before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

7. In general, <table value constructor> in the FROM clause needs
    more discussion -- among other things, how should we determine and
    enforce column types? I think this could be a very useful feature,
    but I'm not comfortable I understand it yet.

=================
As usual, review, advise, comments, flames, etc. requested

Joe

Attachment

Re: Values list-of-targetlists patch for comments (was Re:

From
Joe Conway
Date:
Joe Conway wrote:
> Since the feature freeze is only about a week off, I wanted to post this
> patch even though it is not yet ready to be applied.
>

Sorry -- I just realized that two new files for ValuesScan didn't make
it into the patch posted earlier. Here they are now -- please untar in
your postgres sourcetree root in addition to applying the patch.

(I thought "cvs diff -cN" should have included the new files, since I
had earlier done "cvs add" on them, but it didn't work. I could swear
that worked for me in the past...)

Thanks,

Joe

Attachment
Joe Conway <mail@joeconway.com> writes:
> Tom Lane wrote:
>> There are basically two ways you could go about this:
>> 1. Make a new jointree leaf node type to represent a VALUES construct,
>> and dangle the list of lists of expressions off that.
>> 2. Make a new RangeTblEntry type to represent a VALUES construct, and
>> just put a RangeTblRef to it into the jointree.  The expressions
>> dangle off the RangeTblEntry.

You seem to have done *both*, which is certainly not what I had in mind.
I'd drop the RangeTblEntry changes, I think.

Shoving all the tuples into a tuplestore is not doing anything for you
from a performance point of view.  I was thinking more of evaluating the
targetlists on-the-fly.  Basically what I foresaw as the executor
mechanism was something like a Result node, except with a list of
targetlists instead of just one, and part of its runtime state would be
an index saying which one to evaluate next.  (The update logic for the
index would be just like Append's logic for which subplan to eval next.)

Result as it currently stands is a pretty queer beast because it can
have a child plan or not.  I'm tempted to suggest splitting it into
two node types, perhaps call the one with a child "Filter" and reserve
the name "Result" for the one with no child.  The reason for doing this
in this context is that we could just make the no-child case be
multi-targetlist-capable (rather than having separate nearly identical
node types with single and multi tlists).  AFAICS multi tlists don't
make any sense for the filter-a-child-plan scenario, so that's why I
want to push that case off to a different node type.

            regards, tom lane

Re: Values list-of-targetlists patch for comments (was Re:

From
Joe Conway
Date:
Tom Lane wrote:
>>
>>>There are basically two ways you could go about this:
>>>1. Make a new jointree leaf node type to represent a VALUES construct,
>>>and dangle the list of lists of expressions off that.
>>>2. Make a new RangeTblEntry type to represent a VALUES construct, and
>>>just put a RangeTblRef to it into the jointree.  The expressions
>>>dangle off the RangeTblEntry.
>
> You seem to have done *both*, which is certainly not what I had in mind.
> I'd drop the RangeTblEntry changes, I think.

Good feedback -- thanks! But without the RTE, how would VALUES in the
FROM clause work? Or should I just drop that part and focus on just the
InsertStmt case?

Joe

Joe Conway <mail@joeconway.com> writes:
> Good feedback -- thanks! But without the RTE, how would VALUES in the
> FROM clause work?

Is it different from INSERT?  I'm just imagining a Values node in
the jointree and nothing in the rangetable.

If I'm reading the spec correctly, VALUES is exactly parallel to SELECT
in the grammar, which means that to use it in FROM you would need
parentheses and an alias:

    SELECT ... FROM (SELECT ...) AS foo

    SELECT ... FROM (VALUES ...) AS foo

ISTM that this should be represented using an RTE_SUBQUERY node in the
outer query; the alias attaches to that node, not to the VALUES itself.
So I don't think you need that alias field in the jointree entry either.

If we stick with the plan of representing VALUES as if it were SELECT *
FROM (valuesnode), then this approach would make the second query above
have a structure like

    Query
      .rtable ->    RTE_SUBQUERY
              .subquery ->    Query
                      .jointree ->    Values

(leaving out a ton of detail of course, but those are the key nodes).

To get this to reverse-list in the expected form, we'd need a small
kluge in ruleutils.c that short-circuits the display of "SELECT
... FROM" etc when it sees a Values node at the top of the jointree.
This seems like a fairly small price to pay for keeping Query in
approximately its present form, though.

One thought is that we might allow Query.jointree to point to either
a FromExpr or a Values node, and disallow Values from appearing further
down in the jointree (except perhaps after flattening of subqueries
in the planner).  The alternative is that there's a FromExpr atop
the Values node in the jointree even in the simple case; which seems
uglier but it might avoid breaking some code that expects the top level
to always be FromExpr.

            regards, tom lane

Re: Values list-of-targetlists patch for comments (was Re:

From
Joe Conway
Date:
Tom Lane wrote:
> ISTM that this should be represented using an RTE_SUBQUERY node in the
> outer query; the alias attaches to that node, not to the VALUES itself.
> So I don't think you need that alias field in the jointree entry either.
>
> If we stick with the plan of representing VALUES as if it were SELECT *
> FROM (valuesnode), then this approach would make the second query above
> have a structure like
>
>     Query
>       .rtable ->    RTE_SUBQUERY
>               .subquery ->    Query
>                       .jointree ->    Values
>
> (leaving out a ton of detail of course, but those are the key nodes).
>

OK, I'll go try to wrap my mind around that this evening and see where
it takes me.

Thanks,

Joe

Re: [HACKERS] 8.2 features?

From
Bruce Momjian
Date:
Are you going to apply this?  Seems it is ready.

---------------------------------------------------------------------------

Joe Conway wrote:
> Tom Lane wrote:
> > Christopher Kings-Lynne <chris.kings-lynne@calorieking.com> writes:
> >
> >>Strange.  Last time I checked I thought MySQL dump used 'multivalue
> >>lists in inserts' for dumps, for the same reason that we use COPY
> >
> > I think Andrew identified the critical point upthread: they don't try
> > to put an unlimited number of rows into one INSERT, only a megabyte
> > or so's worth.  Typical klugy-but-effective mysql design approach ...
>
>
> OK, so given that we don't need to be able to do 1 million
> multi-targetlist insert statements, here is rev 2 of the patch.
>
> It is just slightly more invasive, but performs *much* better. In fact,
> it can handle as many targetlists as you have memory to deal with. It
> also deals with DEFAULT values in the targetlist.
>
> I've attached a php script that I used to do crude testing. Basically I
> tested 3 cases in this order:
>
> single-INSERT-multi-statement:
> ------------------------------
>    "INSERT INTO foo2a (f1,f2) VALUES (1,2);"
>    -- repeat statement $loopcount times
>
> single-INSERT-at-once:
> ----------------------
>    "INSERT INTO foo2b (f1,f2) VALUES (1,2);INSERT INTO foo2a (f1,f2)
>    VALUES (1,2);INSERT INTO foo2a (f1,f2) VALUES (1,2)..."
>    -- build a single SQL string by looping $loopcount times,
>    -- and execute it all at once
>
> multi-INSERT-at-once:
> ---------------------
>    "INSERT INTO foo2c (f1,f2) VALUES (1,2),(1,2),(1,2)..."
>    -- build a single SQL string by looping $loopcount times,
>    -- and execute it all at once
>
> Here are the results:
> $loopcount = 100000;
> single-INSERT-multi-statement Elapsed time is 34 seconds
> single-INSERT-at-once Elapsed time is 7 seconds
> multi-INSERT-at-once Elapsed time is 4 seconds
> about 370MB peak memory usage
>
> $loopcount = 200000;
> single-INSERT-multi-statement Elapsed time is 67 seconds
> single-INSERT-at-once Elapsed time is 12 seconds
> multi-INSERT-at-once Elapsed time is 9 seconds
> about 750MB peak memory usage
>
> $loopcount = 300000;
> single-INSERT-multi-statement Elapsed time is 101 seconds
> single-INSERT-at-once Elapsed time is 18 seconds
> multi-INSERT-at-once Elapsed time is 13 seconds
> about 1.1GB  peak memory usage
>
> Somewhere beyond this, my machine goes into swap hell, and I didn't have
> the patience to wait for it to complete :-)
>
> It would be interesting to see a side-by-side comparison with MySQL
> since that seems to be our benchmark on this feature. I'll try to do
> that tomorrow if no one beats me to it.
>
> There is only one downside to the current approach that I'm aware of.
> The command-result tag is only set by the "original" query, meaning that
> even if you insert 300,000 rows using this method, the command-result
> tag looks like "INSERT 0 1"; e.g.:
>
> regression=# create table foo2(f1 int default 42,f2 int default 6);
> CREATE TABLE
> regression=# insert into foo2 (f1,f2) values
> (default,12),(default,10),(115,21);
> INSERT 0 1
> regression=# select * from foo2;
>   f1  | f2
> -----+----
>    42 | 12
>    42 | 10
>   115 | 21
> (3 rows)
>
> Any thoughts on how to fix that?
>
> Thanks,
>
> Joe
>
>


[ application/x-php is not supported, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [HACKERS] 8.2 features?

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Are you going to apply this?  Seems it is ready.

I thought Joe was off in a corner doing a whole new version.
(I'm willing to help if he needs help...)

            regards, tom lane

Re: [HACKERS] 8.2 features?

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Are you going to apply this?  Seems it is ready.
>
> I thought Joe was off in a corner doing a whole new version.
> (I'm willing to help if he needs help...)

OK, just checking.

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [HACKERS] 8.2 features?

From
Joe Conway
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
>
>>Are you going to apply this?  Seems it is ready.
>
> I thought Joe was off in a corner doing a whole new version.
> (I'm willing to help if he needs help...)
>

Yeah, I was going to post the latest tonight.

I'm afraid though that after 2 or so days heading down the last path you
suggested (namely making a new jointree leaf node) I was having trouble,
and at the same time came to the conclusion that adding a new RTE was
alot cleaner and made more sense to me. So I'm hoping you won't want to
send me back to the drawing board again. I believe I have cleaned up the
things you objected to:

1. Now I'm not doing both alternative -- the targetlists are only
    attached to the RTE from the point of parse analysis onward.
2. I've eliminated the tuplestore in favor of runtime evaluation
    of the targetlists which are in an array (allowing forward or
    backward scanning -- although I haven't tested the latter yet).

I've also solved the INSERT related issues that I had earlier:

1. Fixed the rules regression test -- now all regression tests pass
2. Fixed evaluation of DEFAULT values
3. Improved memory consumption and speed some more -- basically
    we are approximately equal to mysql as long as we don't swap,
    and we consume about twice the RAM as mysql instead of several
    times as much. I have more analysis of memory use I'd also like
    to share later.
4. I think the INSERT part of this is ready to go basically, but
    I need a bit more time to test corner cases.

I've made some progress on "SELECT ... FROM (VALUES ...) AS ..."

1. No more shift/reduce issues
2. The ValuesScan work and memory improvements mentioned above
    applies here too.
3. This part still needs the most work though.

I'll post a patch in a few hours -- there is some debug code in there
currently that I should clean up before I send it to the list.

BTW, I'm reserving Saturday, Sunday, and Monday (taking Monday off from
my day job) to work on outstanding issues. I can continue to work
through the end of next Friday, 4 August. After that I'm heading to
Germany on a business trip and my "spare" time will evaporate for a few
weeks.

Joe



Re: [HACKERS] 8.2 features?

From
Joe Conway
Date:
Joe Conway wrote:
> Tom Lane wrote:
>> I thought Joe was off in a corner doing a whole new version.
>> (I'm willing to help if he needs help...)
>
> Yeah, I was going to post the latest tonight.

Sorry for the delay. Ever see the movie "The Money Pit"? This afternoon
I started to think I lived in that house :-(

Anyway, as mentioned below, I think the attached works well for the
"INSERT ... VALUES (...), (...), ..." and related cases. There are still
things wrong that I have not even tried to fix with respect to FROM
clause VALUES lists. Namely column aliases have no effect, and neither
does "ORDER BY" clause (I'm pretty sure addRangeTableEntryForValues
needs work among other places).

 From a memory usage standpoint, I got the following using 1,000,000
values targetlists:

sql length = 6000032

NOTICE:  enter transformInsertStmt
MessageContext: 478142520 total in 66 blocks; 5750400 free (3 chunks);
472392120 used

NOTICE:  enter transformRangeValues
MessageContext: 478142520 total in 66 blocks; 5749480 free (6 chunks);
472393040 used

NOTICE:  enter updateTargetListEntry
MessageContext: 629137464 total in 84 blocks; 44742464 free (999991
chunks); 584395000 used

NOTICE:  exit transformInsertStmt
MessageContext: 629137464 total in 84 blocks; 44742408 free (999991
chunks); 584395056 used

NOTICE:  start ExecInitValuesScan
MessageContext: 1015013432 total in 130 blocks; 6614008 free (8 chunks);
1008399424 used

NOTICE:  end ExecInitValuesScan
MessageContext: 1015013432 total in 130 blocks; 6614008 free (8 chunks);
1008399424 used
ExecutorState: 8024632 total in 3 blocks; 21256 free (8 chunks); 8003376
used

This shows original SQL statement is about 6MB, by the time we get to
parse analysis we're at almost 500 MB, and that memory is never
recovered. Transforming from ResTarget to TargetEntry chews up about
100MB. Then between exiting transformInsertStmt and entering
ExecInitValuesScan we double in memory usage to about 1 GB. It isn't
shown here, but we add another 200 MB or so during tuple projection. So
we top out at about 1.2 GB. Note that mysql tops out at about 600 MB for
this same SQL.

I'm not sure what if anything can be done to improve the above -- I'm
open to suggestions.

Please note that this patch requires an initdb, although I have not yet
bothered to bump CATVERSION.

Thanks for help, comments, suggestions, etc...

Joe


>
> I'm afraid though that after 2 or so days heading down the last path you
> suggested (namely making a new jointree leaf node) I was having trouble,
> and at the same time came to the conclusion that adding a new RTE was
> alot cleaner and made more sense to me. So I'm hoping you won't want to
> send me back to the drawing board again. I believe I have cleaned up the
> things you objected to:
>
> 1. Now I'm not doing both alternative -- the targetlists are only
>    attached to the RTE from the point of parse analysis onward.
> 2. I've eliminated the tuplestore in favor of runtime evaluation
>    of the targetlists which are in an array (allowing forward or
>    backward scanning -- although I haven't tested the latter yet).
>
> I've also solved the INSERT related issues that I had earlier:
>
> 1. Fixed the rules regression test -- now all regression tests pass
> 2. Fixed evaluation of DEFAULT values
> 3. Improved memory consumption and speed some more -- basically
>    we are approximately equal to mysql as long as we don't swap,
>    and we consume about twice the RAM as mysql instead of several
>    times as much. I have more analysis of memory use I'd also like
>    to share later.
> 4. I think the INSERT part of this is ready to go basically, but
>    I need a bit more time to test corner cases.
>
> I've made some progress on "SELECT ... FROM (VALUES ...) AS ..."
>
> 1. No more shift/reduce issues
> 2. The ValuesScan work and memory improvements mentioned above
>    applies here too.
> 3. This part still needs the most work though.
>
> I'll post a patch in a few hours -- there is some debug code in there
> currently that I should clean up before I send it to the list.
>
> BTW, I'm reserving Saturday, Sunday, and Monday (taking Monday off from
> my day job) to work on outstanding issues. I can continue to work
> through the end of next Friday, 4 August. After that I'm heading to
> Germany on a business trip and my "spare" time will evaporate for a few
> weeks.
>

Attachment

Re: [HACKERS] 8.2 features?

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
>> I'm afraid though that after 2 or so days heading down the last path you
>> suggested (namely making a new jointree leaf node) I was having trouble,
>> and at the same time came to the conclusion that adding a new RTE was
>> alot cleaner and made more sense to me. So I'm hoping you won't want to
>> send me back to the drawing board again. I believe I have cleaned up the
>> things you objected to:

I was just objecting to having both a new RTE type and a new jointree
node type --- you only need one or the other.  Opting for the new RTE
type is fine with me, and it probably is a bit cleaner at the end of
the day.

I still dislike the way you're doing things in the executor though.
I don't see the point of using the execScan.c machinery; most of the
time that'll be useless overhead.  As I said before, I think the right
direction here is to split Result into two single-purpose node types
and make the non-filter version capable of taking a list of targetlists.

As far as reducing memory use goes, it seems to me that there's no need
for the individual "targetlists" to have ResTarget/TargetEntry
decoration.  For the simple case where the expressions are just Const
nodes, this could save something like a third of the space (there's also
a List node per item, which we can't do much about).  I think we'd have
to gin up a fake targetlist to attach to the Plan node, but there'd be
only one.

Since the result-node split is my hot button, I'm willing to volunteer
to make it happen.  Do you want to concentrate on the remaining
parser-area issues and leave the executor part to me?

            regards, tom lane

Re: [HACKERS] 8.2 features?

From
Joe Conway
Date:
Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>
>>>I'm afraid though that after 2 or so days heading down the last path you
>>>suggested (namely making a new jointree leaf node) I was having trouble,
>>>and at the same time came to the conclusion that adding a new RTE was
>>>alot cleaner and made more sense to me. So I'm hoping you won't want to
>>>send me back to the drawing board again. I believe I have cleaned up the
>>>things you objected to:
>
>
> I was just objecting to having both a new RTE type and a new jointree
> node type --- you only need one or the other.  Opting for the new RTE
> type is fine with me, and it probably is a bit cleaner at the end of
> the day.

Great!

> I still dislike the way you're doing things in the executor though.
> I don't see the point of using the execScan.c machinery; most of the
> time that'll be useless overhead.  As I said before, I think the right
> direction here is to split Result into two single-purpose node types
> and make the non-filter version capable of taking a list of targetlists.

OK.

> As far as reducing memory use goes, it seems to me that there's no need
> for the individual "targetlists" to have ResTarget/TargetEntry
> decoration.  For the simple case where the expressions are just Const
> nodes, this could save something like a third of the space (there's also
> a List node per item, which we can't do much about).  I think we'd have
> to gin up a fake targetlist to attach to the Plan node, but there'd be
> only one.

OK, I'll take a look at that (actually I was just in that general
vicinity anyway).

> Since the result-node split is my hot button, I'm willing to volunteer
> to make it happen.  Do you want to concentrate on the remaining
> parser-area issues and leave the executor part to me?
>

Sure, sounds good to me.

Joe

Re: [HACKERS] 8.2 features?

From
Joe Conway
Date:
Tom Lane wrote:
> As far as avoiding overhead goes, here's what I'm thinking:
>
> * The Values RTE node should contain a list of lists of bare
> expressions, without TargetEntry decoration (you probably do not
> need ResTarget in the raw parse tree for VALUES, either).
>
> * The ValuesScan plan node will just reference this list-of-lists
> (avoiding making a copy).  It will need to contain a targetlist
> because all plan nodes do, but the base version of that will just
> be a trivial "Var 1", "Var 2", etc.  (The planner might replace that
> with a nontrivial targetlist in cases such as the example above.)

I wanted to post an updated patch even though there are still things not
working again after conversion to bare expressions. Note that I hacked
enough of the executor stuff so I could test my changes on the parser
area. The basic "INSERT ... VALUES (...), (...), ..." does work, but
without DEFAULT again :-(.

The good news is that from a memory and perfomance standpoint, my simple
test now shows us outperforming mysql:

$loopcount = 1000000;
Postgres:
   multi-INSERT-at-once Elapsed time is 12 seconds
   ~420MB
MySQL:
   multi-INSERT-at-once Elapsed time is 17 seconds
   ~600MB

$loopcount = 2000000;
Postgres:
   multi-INSERT-at-once Elapsed time is 29 seconds
   ~730MB
MySQL:
   multi-INSERT-at-once Elapsed time is 37 seconds
   ~1.2GB (this one is from memory -- I didn't write it in my notes)

Joe

Attachment

Re: [HACKERS] 8.2 features?

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> The good news is that from a memory and perfomance standpoint, my simple
> test now shows us outperforming mysql:

Sweet ;-)

I'm up to my *ss in fixing relation locking, but will get back to your
thing as soon as that's done.  I think you're close enough to qualify
as having made the feature freeze deadline, in any case.

            regards, tom lane

Re: [HACKERS] 8.2 features?

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
> > The good news is that from a memory and perfomance standpoint, my simple
> > test now shows us outperforming mysql:
>
> Sweet ;-)

I love this team.  Kudos!

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: [HACKERS] 8.2 features?

From
Michael Fuhr
Date:
On Mon, Jul 31, 2006 at 04:19:43PM -0400, Alvaro Herrera wrote:
> Tom Lane wrote:
> > Joe Conway <mail@joeconway.com> writes:
> > > The good news is that from a memory and perfomance standpoint, my simple
> > > test now shows us outperforming mysql:
> >
> > Sweet ;-)
>
> I love this team.  Kudos!

So now it's MySQL users' turn to say, "Sure, but speed isn't
everything...." :-)

--
Michael Fuhr

Re: [HACKERS] 8.2 features?

From
"Joshua D. Drake"
Date:
Michael Fuhr wrote:
> On Mon, Jul 31, 2006 at 04:19:43PM -0400, Alvaro Herrera wrote:
>> Tom Lane wrote:
>>> Joe Conway <mail@joeconway.com> writes:
>>>> The good news is that from a memory and perfomance standpoint, my simple
>>>> test now shows us outperforming mysql:
>>> Sweet ;-)
>> I love this team.  Kudos!
>
> So now it's MySQL users' turn to say, "Sure, but speed isn't
> everything...." :-)

"Sure, but speed isn't everything... We can accept 02/31/2006 as a valid
date. Let's see PostgreSQL do that!"

Joshua D. Drake

>


--

    === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
    Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/



Re: [HACKERS] 8.2 features?

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> I wanted to post an updated patch even though there are still things not
> working again after conversion to bare expressions.

I've been through the planner part of this and it looks OK (one or two
small errors).  I'm currently messing with a revised version of the
grammar that supports putting VALUES everyplace that the spec allows,
and is a bit simpler than the old one to boot: it folds VALUES and
SELECT together, so we need fewer cases in the INSERT production.
Of course this breaks most of what you did in the parser :-( ...
I'm working on fixing that.

I'm about to go out to dinner but thought I'd post the gram.y and
parsenodes.h files so you could see where I'm headed.  These are
diffs from CVS tip, not from your patch.

            regards, tom lane


Attachment

Re: [HACKERS] 8.2 features?

From
Joe Conway
Date:
Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>
>>I wanted to post an updated patch even though there are still things not
>>working again after conversion to bare expressions.
>
> I've been through the planner part of this and it looks OK (one or two
> small errors).  I'm currently messing with a revised version of the
> grammar that supports putting VALUES everyplace that the spec allows,
> and is a bit simpler than the old one to boot: it folds VALUES and
> SELECT together, so we need fewer cases in the INSERT production.
> Of course this breaks most of what you did in the parser :-( ...
> I'm working on fixing that.
>
> I'm about to go out to dinner but thought I'd post the gram.y and
> parsenodes.h files so you could see where I'm headed.  These are
> diffs from CVS tip, not from your patch.
>

Yup, I can see where you're headed. Looks nice!

In case you can make use of it, here's my latest. I found that I was
being too aggressive at freeing the input nodes to transformExpr() in
transformRangeValues() after using them. In many cases the returned node
is a new palloc'd node, but in some cases it is not.

The other issue I found was that I had neglected to fixup/coerce the raw
expressions ala updateTargetListEntry(). I ended up creating a somewhat
simpler updateValuesExprListEntry() to use on values expression lists.

I have yet to get to the similar/more general issue of coercing values
expression lists to common datatypes (i.e. using select_common_type()).

FWIW, here's a list of non-working cases at the moment:

8<-------------------------------------
create table inserttest (col1 int4, col2 int4 NOT NULL, col3 text
default 'testing');

--doesn't work
---------------
--wrong result
insert into inserttest (col2, col3) values (23, DEFAULT), (24, DEFAULT),
(25, 'hello'), (26, DEFAULT);
select * from (values (3,4),(2,3)) as t1(f1,f2) join (values
(3,8),(2,6)) as t2(f1,f2) using (f1);
select * from (values (3,4),(2,3)) as t1(f1,f2) join (values
(3,8),(2,6)) as t2(f1,f2) using (f1) where t2.f2 = 8;
select * from (values (3,4),(2,3)) as t1(f1,f2) join (values
(3,8),(2,6)) as t2(f1,f2) on t1.f1 = t2.f2 where t1.f1 = 3;

--corrupt result but no crash
select f1,f2 from (values (11,2),(26,'a'),(6,4)) as t(f1,f2) order by 1
desc;

--crash
select f1 from (values (1,2),(2,3)) as t(f1,f2) order by 1 desc;
select f1,f2 from (values (11,'a'),(26,13),(6,'c')) as t(f1,f2) order by
1 desc;
8<-------------------------------------

Joe

Attachment

Re: [HACKERS] 8.2 features?

From
"Harald Armin Massa"
Date:
Joshua,

> So now it's MySQL users' turn to say, "Sure, but speed isn't
> everything...." :-)
"Sure, but speed isn't everything... We can accept 02/31/2006 as a valid
date. Let's see PostgreSQL do that!"

I got the joke :)

But: it is still a problem when converting. As accepting 2006-02-31 as a valid date would require brainwashing at least the entire core team, we should find a "recommended path of date migration from different universes".

My idea would be to:

a) declare date fields as text
b) load the dump of the other db
c) add another column for the date fields, type timestampe (w/wo tz)
d) try to update the column of c) with the converted field from a)
e) replace the failing ones manually

is this really best practice? especially finding the invalid ones would be challenging :(
idea: sort after the textual date fields; look at hot spots (0000-00-00, xxxx-02-31)

Are there better ideas? shall we document the best practice somewhere?

Harald
--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Reinsburgstraße 202b
70197 Stuttgart
0173/9409607
-
Let's set so double the killer delete select all.

Re: [HACKERS] 8.2 features?

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> In case you can make use of it, here's my latest. I found that I was
> being too aggressive at freeing the input nodes to transformExpr() in
> transformRangeValues() after using them. In many cases the returned node
> is a new palloc'd node, but in some cases it is not.

Great, I'll incorporate these updates and keep plugging --- should be
done today barring problems.  If you have some spare cycles today,
want to work on regression tests and docs?

            regards, tom lane

Here's what I've got so far.  I think there's probably more gold to be
mined in terms of reducing runtime memory consumption (I don't like the
list_free_deep bit, we should use a context), but functionally it seems
complete.  I'm off to dinner again, it's in your court to look over some
more if you want.

(PS: if you want to apply, go ahead, don't forget catversion bump.)

            regards, tom lane


Attachment

Re: Values list-of-targetlists patch for comments (was Re:

From
Gavin Sherry
Date:
Tom,

Is this intentional:

template1=# values(1), (2);
 column1
---------
       1
       2
(2 rows)

This is legal because of:

simple_select:
        /* ... */
            | values_clause                         { $$ = $2; }

Also, I am working out some docs and regression tests.

Gavin


Re: Values list-of-targetlists patch for comments (was Re:

From
Joe Conway
Date:
Tom Lane wrote:
> Here's what I've got so far.  I think there's probably more gold to be
> mined in terms of reducing runtime memory consumption (I don't like the
> list_free_deep bit, we should use a context), but functionally it seems
> complete.  I'm off to dinner again, it's in your court to look over some
> more if you want.

OK, I'll continue to look at it this week.

> (PS: if you want to apply, go ahead, don't forget catversion bump.)
>

Sure, I'll commit shortly.

Thanks,

Joe


Re: Values list-of-targetlists patch for comments (was Re:

From
Joe Conway
Date:
Gavin Sherry wrote:
> Is this intentional:
>
> template1=# values(1), (2);
>  column1
> ---------
>        1
>        2
> (2 rows)
>
> This is legal because of:
>
> simple_select:
>         /* ... */
>             | values_clause                         { $$ = $2; }

hmm, not sure about that...

>
> Also, I am working out some docs and regression tests.
>

Oh, cool. I was going to start working on that myself tonight, but if
you're already working on it, don't let me stand in the way ;-)

Actually, if you want me to finish up whatever you have started, I'm
happy to do that too.

Joe


Re: Values list-of-targetlists patch for comments (was Re:

From
Gavin Sherry
Date:
On Tue, 1 Aug 2006, Joe Conway wrote:

> Gavin Sherry wrote:
> > Is this intentional:
> >
> > template1=# values(1), (2);
> >  column1
> > ---------
> >        1
> >        2
> > (2 rows)
> >
> > This is legal because of:
> >
> > simple_select:
> >         /* ... */
> >             | values_clause                         { $$ = $2; }
>
> hmm, not sure about that...
>
> >
> > Also, I am working out some docs and regression tests.
> >
>
> Oh, cool. I was going to start working on that myself tonight, but if
> you're already working on it, don't let me stand in the way ;-)
>
> Actually, if you want me to finish up whatever you have started, I'm
> happy to do that too.

I've got to go out but I'll send a complete patch when I get back.

Gavin

Re: Values list-of-targetlists patch for comments (was Re:

From
Joe Conway
Date:
Tom Lane wrote:
> Here's what I've got so far.  I think there's probably more gold to be
> mined in terms of reducing runtime memory consumption (I don't like the
> list_free_deep bit, we should use a context), but functionally it seems
> complete.  I'm off to dinner again, it's in your court to look over some
> more if you want.
>
> (PS: if you want to apply, go ahead, don't forget catversion bump.)

Committed, with catversion bump.

Joe


Gavin Sherry <swm@linuxworld.com.au> writes:
> Is this intentional:

> template1=# values(1), (2);
>  column1
> ---------
>        1
>        2
> (2 rows)

You bet.  VALUES is parallel to SELECT in the SQL grammar, so AFAICS
it should be legal anywhere you can write SELECT.

The basic productions in the spec's grammar are respectively

         <query specification> ::=
              SELECT [ <set quantifier> ] <select list>
                <table expression>

and

         <table value constructor> ::=
              VALUES <row value expression list>

and both of them link into the rest of the grammar here:

         <simple table> ::=
                <query specification>
              | <table value constructor>
              | <explicit table>

There is no construct I can find in the spec grammar that allows
<query specification> but not <table value constructor>.  QED.

Try some stuff like
    DECLARE c CURSOR FOR VALUES ...
    WHERE foo IN (VALUES ...


            regards, tom lane

Re: Values list-of-targetlists patch for comments (was Re:

From
Joe Conway
Date:
Tom Lane wrote:
> Here's what I've got so far.  I think there's probably more gold to be
> mined in terms of reducing runtime memory consumption (I don't like the
> list_free_deep bit, we should use a context), but functionally it seems
> complete.

I checked out memory usage, and it had regressed to about 1.4 GB (from
730 MB as reported yesterday) for 2 million inserts of 2 integers (i.e.
with the php script I've been using).

I know you're not too happy with the attached approach to solving this,
but I'm not sure how creating a memory context is going to help. Part of
the problem is that the various transformXXX functions sometimes return
freshly palloc'd memory, and sometimes return the pointer they are given.

Anyway, with the attached diff, the 2 million inserts case is back to
about 730 MB memory use, and speed is pretty much the same as reported
yesterday (i.e both memory use and performance better than mysql with
innodb tables).

Thoughts?

Thanks,

Joe
Index: src/backend/parser/analyze.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/parser/analyze.c,v
retrieving revision 1.341
diff -c -r1.341 analyze.c
*** src/backend/parser/analyze.c    2 Aug 2006 01:59:46 -0000    1.341
--- src/backend/parser/analyze.c    2 Aug 2006 05:13:20 -0000
***************
*** 872,877 ****
--- 872,878 ----
      foreach(lc, exprlist)
      {
          Expr *expr = (Expr *) lfirst(lc);
+         Expr *p = expr;
          ResTarget  *col;

          col = (ResTarget *) lfirst(icols);
***************
*** 885,893 ****
--- 886,898 ----

          result = lappend(result, expr);

+         if (expr != p)
+             pfree(p);
+
          icols = lnext(icols);
          attnos = lnext(attnos);
      }
+     list_free(exprlist);

      return result;
  }
***************
*** 2191,2196 ****
--- 2196,2202 ----
      for (i = 0; i < sublist_length; i++)
      {
          coltypes[i] = select_common_type(coltype_lists[i], "VALUES");
+         list_free(coltype_lists[i]);
      }

      newExprsLists = NIL;
***************
*** 2203,2216 ****
          foreach(lc2, sublist)
          {
              Node  *col = (Node *) lfirst(lc2);

-             col = coerce_to_common_type(pstate, col, coltypes[i], "VALUES");
-             newsublist = lappend(newsublist, col);
              i++;
          }

          newExprsLists = lappend(newExprsLists, newsublist);
      }

      /*
       * Generate the VALUES RTE
--- 2209,2228 ----
          foreach(lc2, sublist)
          {
              Node  *col = (Node *) lfirst(lc2);
+             Node  *new_col;
+
+             new_col = coerce_to_common_type(pstate, col, coltypes[i], "VALUES");
+             newsublist = lappend(newsublist, new_col);
+             if (new_col != col)
+                 pfree(col);

              i++;
          }

          newExprsLists = lappend(newExprsLists, newsublist);
+         list_free(sublist);
      }
+     list_free(exprsLists);

      /*
       * Generate the VALUES RTE
Index: src/backend/parser/parse_target.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/parser/parse_target.c,v
retrieving revision 1.147
diff -c -r1.147 parse_target.c
*** src/backend/parser/parse_target.c    2 Aug 2006 01:59:47 -0000    1.147
--- src/backend/parser/parse_target.c    2 Aug 2006 05:13:21 -0000
***************
*** 172,177 ****
--- 172,178 ----
      foreach(lc, exprlist)
      {
          Node       *e = (Node *) lfirst(lc);
+         Node       *p = e;

          /*
           * Check for "something.*".  Depending on the complexity of the
***************
*** 188,193 ****
--- 189,195 ----
                  result = list_concat(result,
                                       ExpandColumnRefStar(pstate, cref,
                                                           false));
+                 pfree(e);
                  continue;
              }
          }
***************
*** 203,208 ****
--- 205,211 ----
                  result = list_concat(result,
                                       ExpandIndirectionStar(pstate, ind,
                                                             false));
+                 pfree(e);
                  continue;
              }
          }
***************
*** 210,218 ****
          /*
           * Not "something.*", so transform as a single expression
           */
!         result = lappend(result,
!                          transformExpr(pstate, e));
      }

      return result;
  }
--- 213,224 ----
          /*
           * Not "something.*", so transform as a single expression
           */
!         p = transformExpr(pstate, e);
!         result = lappend(result, p);
!         if (e != p)
!             pfree(e);
      }
+     list_free(exprlist);

      return result;
  }

Re: Values list-of-targetlists patch for comments (was Re:

From
"Joshua D. Drake"
Date:
>
> Anyway, with the attached diff, the 2 million inserts case is back to
> about 730 MB memory use, and speed is pretty much the same as reported
> yesterday (i.e both memory use and performance better than mysql with
> innodb tables).

That's all that matters ;)

Joshua D. Drake

--

    === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
    Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/



Re: Values list-of-targetlists patch for comments (was Re:

From
Joe Conway
Date:
Joe Conway wrote:
> Tom Lane wrote:
>
>> Here's what I've got so far.  I think there's probably more gold to be
>> mined in terms of reducing runtime memory consumption (I don't like the
>> list_free_deep bit, we should use a context), but functionally it seems
>> complete.
>
> I checked out memory usage, and it had regressed to about 1.4 GB (from
> 730 MB as reported yesterday) for 2 million inserts of 2 integers (i.e.
> with the php script I've been using).
>
> I know you're not too happy with the attached approach to solving this,
> but I'm not sure how creating a memory context is going to help. Part of
> the problem is that the various transformXXX functions sometimes return
> freshly palloc'd memory, and sometimes return the pointer they are given.
>
> Anyway, with the attached diff, the 2 million inserts case is back to
> about 730 MB memory use, and speed is pretty much the same as reported
> yesterday (i.e both memory use and performance better than mysql with
> innodb tables).

Of course it also breaks a bunch of regression tests -- I guess that
just points to the fragility of this approach.

This patch retains the memory consumption savings but doesn't break any
regression tests...

Joe
? src/test/regress/sql/insert.sql.new
Index: src/backend/parser/analyze.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/parser/analyze.c,v
retrieving revision 1.341
diff -c -r1.341 analyze.c
*** src/backend/parser/analyze.c    2 Aug 2006 01:59:46 -0000    1.341
--- src/backend/parser/analyze.c    2 Aug 2006 05:48:18 -0000
***************
*** 888,893 ****
--- 888,894 ----
          icols = lnext(icols);
          attnos = lnext(attnos);
      }
+     list_free(exprlist);

      return result;
  }
***************
*** 2191,2196 ****
--- 2192,2198 ----
      for (i = 0; i < sublist_length; i++)
      {
          coltypes[i] = select_common_type(coltype_lists[i], "VALUES");
+         list_free(coltype_lists[i]);
      }

      newExprsLists = NIL;
***************
*** 2203,2216 ****
          foreach(lc2, sublist)
          {
              Node  *col = (Node *) lfirst(lc2);

-             col = coerce_to_common_type(pstate, col, coltypes[i], "VALUES");
-             newsublist = lappend(newsublist, col);
              i++;
          }

          newExprsLists = lappend(newExprsLists, newsublist);
      }

      /*
       * Generate the VALUES RTE
--- 2205,2224 ----
          foreach(lc2, sublist)
          {
              Node  *col = (Node *) lfirst(lc2);
+             Node  *new_col;
+
+             new_col = coerce_to_common_type(pstate, col, coltypes[i], "VALUES");
+             newsublist = lappend(newsublist, new_col);
+             if (new_col != col)
+                 pfree(col);

              i++;
          }

          newExprsLists = lappend(newExprsLists, newsublist);
+         list_free(sublist);
      }
+     list_free(exprsLists);

      /*
       * Generate the VALUES RTE
Index: src/backend/parser/parse_target.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/parser/parse_target.c,v
retrieving revision 1.147
diff -c -r1.147 parse_target.c
*** src/backend/parser/parse_target.c    2 Aug 2006 01:59:47 -0000    1.147
--- src/backend/parser/parse_target.c    2 Aug 2006 05:48:18 -0000
***************
*** 172,177 ****
--- 172,178 ----
      foreach(lc, exprlist)
      {
          Node       *e = (Node *) lfirst(lc);
+         Node       *p = e;

          /*
           * Check for "something.*".  Depending on the complexity of the
***************
*** 188,193 ****
--- 189,195 ----
                  result = list_concat(result,
                                       ExpandColumnRefStar(pstate, cref,
                                                           false));
+                 pfree(e);
                  continue;
              }
          }
***************
*** 203,208 ****
--- 205,211 ----
                  result = list_concat(result,
                                       ExpandIndirectionStar(pstate, ind,
                                                             false));
+                 pfree(e);
                  continue;
              }
          }
***************
*** 210,218 ****
          /*
           * Not "something.*", so transform as a single expression
           */
!         result = lappend(result,
!                          transformExpr(pstate, e));
      }

      return result;
  }
--- 213,224 ----
          /*
           * Not "something.*", so transform as a single expression
           */
!         p = transformExpr(pstate, e);
!         result = lappend(result, p);
!         if (e != p)
!             pfree(e);
      }
+     list_free(exprlist);

      return result;
  }

Re: Values list-of-targetlists patch for comments (was Re:

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> This patch retains the memory consumption savings but doesn't break any
> regression tests...

I'm unconvinced that retail pfree's are the way to go.  I just did some
profiling of this test case:

    insert into foo values
    (0,0,0),
    (1,1,1),
    (2,2,2),
    (3,3,3),
    ... one million rows ...
    (999997,999997,999997),
    (999998,999998,999998),
    (999999,999999,999999);

using CVS tip, and got these oprofile results:

samples  %        symbol name
39742    10.1656  base_yyparse
38338     9.8065  XLogInsert
28247     7.2253  AllocSetAlloc
20490     5.2411  expression_tree_walker
16822     4.3029  ExecInitExpr
16469     4.2126  base_yylex
14789     3.7829  PageAddItem
11174     2.8582  LWLockAcquire
11167     2.8564  LWLockRelease
9195      2.3520  RewriteQuery
9120      2.3328  AllocSetFree
7788      1.9921  ExecInitValuesScan
7596      1.9430  ExecEvalConst
7586      1.9404  lappend
6860      1.7547  ValuesNext
6261      1.6015  heap_fill_tuple
6141      1.5708  MemoryContextAllocZeroAligned
5619      1.4373  fix_expr_references_walker
5613      1.4357  transformExpressionList
5269      1.3478  heap_insert
5177      1.3242  contain_vars_of_level_walker
4601      1.1769  heap_form_tuple
4345      1.1114  ExecutorRun
4299      1.0996  hash_any
4201      1.0746  MemoryContextAlloc
4061      1.0388  check_stack_depth

It's slightly depressing that there's not more time being spent in
places we can easily tweak, but anyway the salient point to me is
that AllocSetFree is already chewing a nontrivial part of the runtime.

            regards, tom lane

Re: Values list-of-targetlists patch for comments (was Re:

From
Joe Conway
Date:
Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>
>>This patch retains the memory consumption savings but doesn't break any
>>regression tests...
>
>
> I'm unconvinced that retail pfree's are the way to go.  I just did some
> profiling of this test case:

<snip>

> It's slightly depressing that there's not more time being spent in
> places we can easily tweak, but anyway the salient point to me is
> that AllocSetFree is already chewing a nontrivial part of the runtime.

That's undoubtedly true, and important for the case that isn't memory
constrained (but where I'm already seeing us perform relatively well).
But once we start the machine swapping, runtime goes in the toilet. And
without addressing the memory leak somehow, we will start a machine
swapping significantly earlier than mysql.

Joe

Re: Values list-of-targetlists patch for comments (was Re:

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> That's undoubtedly true, and important for the case that isn't memory
> constrained (but where I'm already seeing us perform relatively well).
> But once we start the machine swapping, runtime goes in the toilet. And
> without addressing the memory leak somehow, we will start a machine
> swapping significantly earlier than mysql.

I'm not arguing that we don't need to work on the memory usage ... just
that I'm not very happy with that particular approach.

I wonder whether there is any reasonable way to determine which data
structures are responsible for how much space ... in my test I'm seeing

MessageContext: 822075440 total in 104 blocks; 4510280 free (1 chunks); 817565160 used
ExecutorState: 8024624 total in 3 blocks; 20592 free (12 chunks); 8004032 used

so it seems mostly not the executor's fault, but that's not much to go
on.

            regards, tom lane

Re: Values list-of-targetlists patch for comments (was Re:

From
Joe Conway
Date:
Tom Lane wrote:
>
> I wonder whether there is any reasonable way to determine which data
> structures are responsible for how much space ... in my test I'm seeing
>
> MessageContext: 822075440 total in 104 blocks; 4510280 free (1 chunks); 817565160 used
> ExecutorState: 8024624 total in 3 blocks; 20592 free (12 chunks); 8004032 used
>
> so it seems mostly not the executor's fault, but that's not much to go
> on.

I was doing it by sprinkling MemoryContextStats() in various places.
I'll spend some time again later today and see if I can narrow it down
to specific data structures using that. It shouldn't be too hard -- the
patch I sent last night only pfrees a few structures, and they represent
the bulk of what we need to clean up.

Joe

Re: Values list-of-targetlists patch for comments (was Re:

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> Tom Lane wrote:
>> I wonder whether there is any reasonable way to determine which data
>> structures are responsible for how much space ... in my test I'm seeing

> I was doing it by sprinkling MemoryContextStats() in various places.

Yeah, I've just been doing that and some hand analysis too.  What I get
(on a 64-bit machine) is that essentially all the space goes into

lists of A_Const lists:        320000000
lists of Const lists:        320000000
transformInsertRow extra lists: 144000000

I think we could safely list_free the input list in transformInsertRow
as your patch suggests, which would buy back the 144M part.  But I don't
believe it's safe at all to free the raw_parser output --- the grammar
sometimes makes multiple links to the same subtree, eg in BETWEEN.
In any case the patch as proposed wouldn't catch all the detritus for
any case more complicated than a simple integer constant.

The way that the list memory usage works (again, 64-bit machine) is

sizeof(List) = 24
sizeof(ListCell) = 16
sizeof(A_Const) = 32

Each of these nodes will have 16 bytes palloc overhead, and the List
header will be rounded up to 32 bytes as well, so we have total space
for a 3-element integer list of
    32+16 + (16+16 + 32+16) * 3
Add in 16+16 for the associated ListCell of the top list-of-lists,
and you come to 320 bytes per sublist.  Const happens to also be
32 bytes so the transformed lists are the same size.

It's interesting to reflect on the fact that this comes to 184 bytes
of useful data and 136 bytes of palloc overhead per row ... not sure
if we can do much about the overhead though.

            regards, tom lane

Re: Values list-of-targetlists patch for comments (was Re:

From
Joe Conway
Date:
Tom Lane wrote:
> Yeah, I've just been doing that and some hand analysis too.  What I get
> (on a 64-bit machine) is that essentially all the space goes into
>
> lists of A_Const lists:        320000000
> lists of Const lists:        320000000
> transformInsertRow extra lists: 144000000
>
> I think we could safely list_free the input list in transformInsertRow
> as your patch suggests, which would buy back the 144M part.  But I don't
> believe it's safe at all to free the raw_parser output --- the grammar
> sometimes makes multiple links to the same subtree, eg in BETWEEN.
> In any case the patch as proposed wouldn't catch all the detritus for
> any case more complicated than a simple integer constant.

:-(

> The way that the list memory usage works (again, 64-bit machine) is
>
> sizeof(List) = 24
> sizeof(ListCell) = 16
> sizeof(A_Const) = 32
>
> Each of these nodes will have 16 bytes palloc overhead, and the List
> header will be rounded up to 32 bytes as well, so we have total space
> for a 3-element integer list of
>     32+16 + (16+16 + 32+16) * 3
> Add in 16+16 for the associated ListCell of the top list-of-lists,
> and you come to 320 bytes per sublist.  Const happens to also be
> 32 bytes so the transformed lists are the same size.

What if we built an array of A_Const nodes instead of a List? Maybe we
could use something akin to appendStringInfo()/enlargeStringInfo() to
build the array of nodes and enlarge it in chunks.

Joe

Re: Values list-of-targetlists patch for comments (was Re:

From
Alvaro Herrera
Date:
Joe Conway wrote:

> What if we built an array of A_Const nodes instead of a List? Maybe we
> could use something akin to appendStringInfo()/enlargeStringInfo() to
> build the array of nodes and enlarge it in chunks.

In inval.c you find this:

/*
 * To minimize palloc traffic, we keep pending requests in successively-
 * larger chunks (a slightly more sophisticated version of an expansible
 * array).  All request types can be stored as SharedInvalidationMessage
 * records.  The ordering of requests within a list is never significant.
 */
typedef struct InvalidationChunk
{
    struct InvalidationChunk *next;     /* list link */
    int         nitems;         /* # items currently stored in chunk */
    int         maxitems;       /* size of allocated array in this chunk */
    SharedInvalidationMessage msgs[1];  /* VARIABLE LENGTH ARRAY */
} InvalidationChunk;            /* VARIABLE LENGTH STRUCTURE */

Which might give you an idea ...

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Values list-of-targetlists patch for comments (was Re:

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> What if we built an array of A_Const nodes instead of a List? Maybe we
> could use something akin to appendStringInfo()/enlargeStringInfo() to
> build the array of nodes and enlarge it in chunks.

For lists this short I'm not sure how much of a win it'd be.  It's
interesting though to think about doing something like that within the
List abstraction itself.  We did a "fastlist" hack once before and it
was a crock ... don't want to do that again.  But now that we've got a
distinction between List and ListCell you could imagine that a List
header has a small private array of ListCells ... tuning the size might
be tricky though.

Another thing we could consider is flattening the double-level list into
a single list ... probably be a pain notationally, but it'd save one
List header and one ListCell per VALUES sublist.  And it would offer
more traction for an array-inside-Lists optimization.

            regards, tom lane

Re: Values list-of-targetlists patch for comments (was Re:

From
Gavin Sherry
Date:
Docs and regression tests attached.

One slightly annoying thing is this:

---
regression=# declare foo cursor with hold for VALUES(1,2), (3, 4);
DECLARE CURSOR
regression=# declare foo2 cursor with hold for (VALUES(1,2), (3, 4)) as
foo(i, j);
ERROR:  syntax error at or near "as"
LINE 1: ...e foo2 cursor with hold for (VALUES(1,2), (3, 4)) as foo(i, ...
---

Now, we can just rewrite the second query as:

---
declare foo2 cursor with hold for select * from (VALUES(1,2), (3, 4)) as
foo(i, j);
---

but it's not immediately obvious. Not worth busting up the grammar for it,
though. And, it's not spec.

Gavin

Attachment

Re: Values list-of-targetlists patch for comments (was Re:

From
Tom Lane
Date:
Gavin Sherry <swm@linuxworld.com.au> writes:
> Docs and regression tests attached.

I've applied the regression tests (with a few additions), but I'm
feeling dissatisfied with this approach to documenting VALUES.
It seems to be mostly missing the point about VALUES being usable
whereever SELECT is.  I'm not at all sure what I'd do instead though.
Should we give VALUES its own reference page?  That doesn't quite
seem helpful either.  cc'ing to pgsql-docs for ideas.

            regards, tom lane

Re: Values list-of-targetlists patch for comments (was Re:

From
Michael Glaesemann
Date:
On Aug 3, 2006, at 23:58 , Tom Lane wrote:


> Should we give VALUES its own reference page?  That doesn't quite
> seem helpful either.
>

I think we should go for a separate reference page, as VALUES appears
to be expanding quite a bit. Up till now I've thought of VALUES only
in conjunction with UPDATE, so perhaps a useful alternative would be
to keep all of the information regarding VALUES and its syntax would
be as a large part of the UPDATE reference page, though that would
imply by placement (even if explained otherwise) that VALUES is only
a part of the UPDATE syntax, which it no longer (?) is. That brings
me back to the idea of VALUES deserving its own reference page.

I wonder how soon pretty much the entire SQL spec will be duplicated
in the PostgreSQL documentation. :)

Michael Glaesemann
grzm seespotcode net


Re: Values list-of-targetlists patch for comments (was Re:

From
Gavin Sherry
Date:
On Thu, 3 Aug 2006, Tom Lane wrote:

> Gavin Sherry <swm@linuxworld.com.au> writes:
> > Docs and regression tests attached.
>
> I've applied the regression tests (with a few additions), but I'm
> feeling dissatisfied with this approach to documenting VALUES.
> It seems to be mostly missing the point about VALUES being usable
> whereever SELECT is.  I'm not at all sure what I'd do instead though.
> Should we give VALUES its own reference page?  That doesn't quite
> seem helpful either.  cc'ing to pgsql-docs for ideas.

Good point. One question: are we happy calling this a 'VALUES list'? It's
better than a 'table value constructor'. I took the lead from a comment in the
source.

Thanks,

gavin

Re: [DOCS] Values list-of-targetlists patch for comments (was Re:

From
Gavin Sherry
Date:
On Fri, 4 Aug 2006, Michael Glaesemann wrote:

> On Aug 3, 2006, at 23:58 , Tom Lane wrote:
>
>
> > Should we give VALUES its own reference page?  That doesn't quite
> > seem helpful either.
> >
>
> I think we should go for a separate reference page, as VALUES appears
> to be expanding quite a bit. Up till now I've thought of VALUES only
> in conjunction with UPDATE, so perhaps a useful alternative would be
> to keep all of the information regarding VALUES and its syntax would
> be as a large part of the UPDATE reference page, though that would
> imply by placement (even if explained otherwise) that VALUES is only
> a part of the UPDATE syntax, which it no longer (?) is. That brings
> me back to the idea of VALUES deserving its own reference page.

... with update? I associate it very closely with INSERT. After all,
INSERT is the only statement where we've had VALUES as part of the
grammar.

Thanks,

Gavin

Re: [DOCS] Values list-of-targetlists patch for comments (was Re:

From
Michael Glaesemann
Date:
On Aug 4, 2006, at 9:42 , Gavin Sherry wrote:

> ... with update? I associate it very closely with INSERT. After all,
> INSERT is the only statement where we've had VALUES as part of the
> grammar.

Of course! Thanks for catching the glitch. I must have a bad RAM chip.

Michael Glaesemann
grzm seespotcode net




Re: [DOCS] Values list-of-targetlists patch for comments (was Re:

From
Tom Lane
Date:
Gavin Sherry <swm@linuxworld.com.au> writes:
> On Fri, 4 Aug 2006, Michael Glaesemann wrote:
>> On Aug 3, 2006, at 23:58 , Tom Lane wrote:
>>> Should we give VALUES its own reference page?  That doesn't quite
>>> seem helpful either.
>>
>> I think we should go for a separate reference page, as VALUES appears
>> to be expanding quite a bit.

> ... with update? I associate it very closely with INSERT. After all,
> INSERT is the only statement where we've had VALUES as part of the
> grammar.

True, but I think that's just a historical artifact.  If you look at the
SQL spec, INSERT ... VALUES and INSERT ... SELECT are not distinct
constructs: they fall out of the fact that VALUES and SELECT are allowed
interchangeably.

         <insert statement> ::=
              INSERT INTO <table name>
                <insert columns and source>

         <insert columns and source> ::=
                [ <left paren> <insert column list> <right paren> ]
              <query expression>
              | DEFAULT VALUES

         <insert column list> ::= <column name list>

and when you trace down <query expression> you find the SELECT
and VALUES options entering at exactly the same place ...

I'd like to see us refactor the docs as necessary to reflect that idea.
Peter is right that this needs some discussion in syntax.sgml as well as
in the reference pages --- but I'm still not very clear on how the
presentation should go.

            regards, tom lane

Re: Values list-of-targetlists patch for comments (was Re:

From
"Pavel Stehule"
Date:
>
>Good point. One question: are we happy calling this a 'VALUES list'? It's
>better than a 'table value constructor'. I took the lead from a comment in
>the
>source.
>

table value constructor is name from ANSI. All people wiil find t.v.c., not
values list. I vote table value constructor.

Regards
Pavel Stehule

_________________________________________________________________
Emotikony a pozadi programu MSN Messenger ozivi vasi konverzaci.
http://messenger.msn.cz/


Re: Values list-of-targetlists patch for comments (was Re:

From
Simon Riggs
Date:
On Thu, 2006-08-03 at 10:58 -0400, Tom Lane wrote:
> Gavin Sherry <swm@linuxworld.com.au> writes:
> > Docs and regression tests attached.
>
> I've applied the regression tests (with a few additions), but I'm
> feeling dissatisfied with this approach to documenting VALUES.
> It seems to be mostly missing the point about VALUES being usable
> whereever SELECT is.  I'm not at all sure what I'd do instead though.
> Should we give VALUES its own reference page?  That doesn't quite
> seem helpful either.  cc'ing to pgsql-docs for ideas.

The DB2 manual does exactly that and is not any clearer as a result,
even if it is fully normalised.

If we did that we'd need to emphasise that VALUES is more of a clause
than a new command.

--
  Simon Riggs
  EnterpriseDB          http://www.enterprisedb.com


Re: [HACKERS] [DOCS] Values list-of-targetlists patch for comments (was Re:

From
Peter Eisentraut
Date:
Am Freitag, 4. August 2006 04:50 schrieb Tom Lane:
> I'd like to see us refactor the docs as necessary to reflect that idea.
> Peter is right that this needs some discussion in syntax.sgml as well as
> in the reference pages --- but I'm still not very clear on how the
> presentation should go.

I'm beginning to think that VALUES might be a separate command after all.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: [HACKERS] [DOCS] Values list-of-targetlists patch for comments (was Re:

From
David Fetter
Date:
On Wed, Aug 09, 2006 at 03:05:02PM +0200, Peter Eisentraut wrote:
> Am Freitag, 4. August 2006 04:50 schrieb Tom Lane:
> > I'd like to see us refactor the docs as necessary to reflect that
> > idea.  Peter is right that this needs some discussion in
> > syntax.sgml as well as in the reference pages --- but I'm still
> > not very clear on how the presentation should go.
>
> I'm beginning to think that VALUES might be a separate command after
> all.

What's below definitely bolsters that idea :)

postgres=# VALUES(1);
 column1
---------
       1
(1 row)

However, there are some oddities:

postgres=# SELECT * FROM (VALUES (1,2)) AS foo(bar,baz);
 bar | baz
-----+-----
   1 |   2
(1 row)

postgres=# (VALUES (1,2)) AS foo(bar,baz);
ERROR:  syntax error at or near "AS"
LINE 1: (VALUES (1,2)) AS foo(bar,baz);

Does the SQL standard have anything to say about assigning identifiers
both to the entire VALUES() statement and to its columns when the
VALUES() statement is by itself?

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666
                              Skype: davidfetter

Remember to vote!

David Fetter <david@fetter.org> writes:
> However, there are some oddities:
> postgres=# SELECT * FROM (VALUES (1,2)) AS foo(bar,baz);
> [ works ]
> postgres=# (VALUES (1,2)) AS foo(bar,baz);
> ERROR:  syntax error at or near "AS"

This is per spec.  Effectively, AS is part of the FROM-clause syntax
not part of a standalone command.  You can't write this either:

regression=# (select 1,2) as foo(bar,baz);
ERROR:  syntax error at or near "as"
LINE 1: (select 1,2) as foo(bar,baz);
                     ^

            regards, tom lane