Re: [HACKERS] 8.2 features? - Mailing list pgsql-patches

From Joe Conway
Subject Re: [HACKERS] 8.2 features?
Date
Msg-id 44BDB2F5.7080003@joeconway.com
Whole thread Raw
In response to Re: [HACKERS] 8.2 features?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] 8.2 features?  (Bruce Momjian <bruce@momjian.us>)
List pgsql-patches
Tom Lane wrote:
> Christopher Kings-Lynne <chris.kings-lynne@calorieking.com> writes:
>
>>Strange.  Last time I checked I thought MySQL dump used 'multivalue
>>lists in inserts' for dumps, for the same reason that we use COPY
>
> I think Andrew identified the critical point upthread: they don't try
> to put an unlimited number of rows into one INSERT, only a megabyte
> or so's worth.  Typical klugy-but-effective mysql design approach ...


OK, so given that we don't need to be able to do 1 million
multi-targetlist insert statements, here is rev 2 of the patch.

It is just slightly more invasive, but performs *much* better. In fact,
it can handle as many targetlists as you have memory to deal with. It
also deals with DEFAULT values in the targetlist.

I've attached a php script that I used to do crude testing. Basically I
tested 3 cases in this order:

single-INSERT-multi-statement:
------------------------------
   "INSERT INTO foo2a (f1,f2) VALUES (1,2);"
   -- repeat statement $loopcount times

single-INSERT-at-once:
----------------------
   "INSERT INTO foo2b (f1,f2) VALUES (1,2);INSERT INTO foo2a (f1,f2)
   VALUES (1,2);INSERT INTO foo2a (f1,f2) VALUES (1,2)..."
   -- build a single SQL string by looping $loopcount times,
   -- and execute it all at once

multi-INSERT-at-once:
---------------------
   "INSERT INTO foo2c (f1,f2) VALUES (1,2),(1,2),(1,2)..."
   -- build a single SQL string by looping $loopcount times,
   -- and execute it all at once

Here are the results:
$loopcount = 100000;
single-INSERT-multi-statement Elapsed time is 34 seconds
single-INSERT-at-once Elapsed time is 7 seconds
multi-INSERT-at-once Elapsed time is 4 seconds
about 370MB peak memory usage

$loopcount = 200000;
single-INSERT-multi-statement Elapsed time is 67 seconds
single-INSERT-at-once Elapsed time is 12 seconds
multi-INSERT-at-once Elapsed time is 9 seconds
about 750MB peak memory usage

$loopcount = 300000;
single-INSERT-multi-statement Elapsed time is 101 seconds
single-INSERT-at-once Elapsed time is 18 seconds
multi-INSERT-at-once Elapsed time is 13 seconds
about 1.1GB  peak memory usage

Somewhere beyond this, my machine goes into swap hell, and I didn't have
the patience to wait for it to complete :-)

It would be interesting to see a side-by-side comparison with MySQL
since that seems to be our benchmark on this feature. I'll try to do
that tomorrow if no one beats me to it.

There is only one downside to the current approach that I'm aware of.
The command-result tag is only set by the "original" query, meaning that
even if you insert 300,000 rows using this method, the command-result
tag looks like "INSERT 0 1"; e.g.:

regression=# create table foo2(f1 int default 42,f2 int default 6);
CREATE TABLE
regression=# insert into foo2 (f1,f2) values
(default,12),(default,10),(115,21);
INSERT 0 1
regression=# select * from foo2;
  f1  | f2
-----+----
   42 | 12
   42 | 10
  115 | 21
(3 rows)

Any thoughts on how to fix that?

Thanks,

Joe


Index: src/backend/parser/analyze.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/parser/analyze.c,v
retrieving revision 1.340
diff -c -r1.340 analyze.c
*** src/backend/parser/analyze.c    14 Jul 2006 14:52:21 -0000    1.340
--- src/backend/parser/analyze.c    19 Jul 2006 03:53:35 -0000
***************
*** 657,667 ****
      }
      else
      {
          /*
           * For INSERT ... VALUES, transform the given list of values to form a
!          * targetlist for the INSERT.
           */
!         qry->targetList = transformTargetList(pstate, stmt->targetList);
      }

      /*
--- 657,699 ----
      }
      else
      {
+         ListCell   *tlr;
+
          /*
           * For INSERT ... VALUES, transform the given list of values to form a
!          * targetlist for the INSERT. In a multi-targetlist INSERT, append all
!          * but the first targetlist to extras_after to be processed later by
!          * do_parse_analyze
           */
!         qry->targetList = NIL;
!         foreach(tlr, stmt->targetList)
!         {
!             List *tgtlist = (List *) lfirst(tlr);
!
!             if (qry->targetList == NIL)
!             {
!                 /* transform the first targetlist */
!                 qry->targetList = transformTargetList(pstate, tgtlist);
!             }
!             else
!             {
!                 /*
!                  * Create an InsertStmt node for each additional targetlist
!                  * and append to extras_after
!                  */
!                 InsertStmt *insnode = makeNode(InsertStmt);
!
!                 insnode->cols = NIL;
!                 insnode->targetList = list_make1(tgtlist);
!                 insnode->selectStmt = NULL;
!                 insnode->relation = stmt->relation;
!
!                 if (*extras_after == NIL)
!                     *extras_after = list_make1(insnode);
!                 else
!                     *extras_after = lappend(*extras_after, insnode);
!             }
!         }
      }

      /*
Index: src/backend/parser/gram.y
===================================================================
RCS file: /cvsroot/pgsql/src/backend/parser/gram.y,v
retrieving revision 2.551
diff -c -r2.551 gram.y
*** src/backend/parser/gram.y    3 Jul 2006 22:45:39 -0000    2.551
--- src/backend/parser/gram.y    19 Jul 2006 03:53:40 -0000
***************
*** 238,247 ****
                  qualified_name_list any_name any_name_list
                  any_operator expr_list attrs
                  target_list update_target_list insert_column_list
!                 insert_target_list def_list indirection opt_indirection
!                 group_clause TriggerFuncArgs select_limit
!                 opt_select_limit opclass_item_list
!                 transaction_mode_list_or_empty
                  TableFuncElementList
                  prep_type_clause prep_type_list
                  execute_param_clause using_clause
--- 238,247 ----
                  qualified_name_list any_name any_name_list
                  any_operator expr_list attrs
                  target_list update_target_list insert_column_list
!                 insert_target_els insert_target_list insert_target_lists
!                 def_list indirection opt_indirection group_clause
!                 TriggerFuncArgs select_limit opt_select_limit
!                 opclass_item_list transaction_mode_list_or_empty
                  TableFuncElementList
                  prep_type_clause prep_type_list
                  execute_param_clause using_clause
***************
*** 5349,5359 ****
          ;

  insert_rest:
!             VALUES '(' insert_target_list ')'
                  {
                      $$ = makeNode(InsertStmt);
                      $$->cols = NIL;
!                     $$->targetList = $3;
                      $$->selectStmt = NULL;
                  }
              | DEFAULT VALUES
--- 5349,5359 ----
          ;

  insert_rest:
!             VALUES insert_target_lists
                  {
                      $$ = makeNode(InsertStmt);
                      $$->cols = NIL;
!                     $$->targetList = $2;
                      $$->selectStmt = NULL;
                  }
              | DEFAULT VALUES
***************
*** 5370,5380 ****
                      $$->targetList = NIL;
                      $$->selectStmt = $1;
                  }
!             | '(' insert_column_list ')' VALUES '(' insert_target_list ')'
                  {
                      $$ = makeNode(InsertStmt);
                      $$->cols = $2;
!                     $$->targetList = $6;
                      $$->selectStmt = NULL;
                  }
              | '(' insert_column_list ')' SelectStmt
--- 5370,5380 ----
                      $$->targetList = NIL;
                      $$->selectStmt = $1;
                  }
!             | '(' insert_column_list ')' VALUES insert_target_lists
                  {
                      $$ = makeNode(InsertStmt);
                      $$->cols = $2;
!                     $$->targetList = $5;
                      $$->selectStmt = NULL;
                  }
              | '(' insert_column_list ')' SelectStmt
***************
*** 8189,8197 ****

          ;

  insert_target_list:
!             insert_target_el                        { $$ = list_make1($1); }
!             | insert_target_list ',' insert_target_el { $$ = lappend($1, $3); }
          ;

  insert_target_el:
--- 8189,8215 ----

          ;

+ insert_target_lists:
+             insert_target_list
+                 {
+                     $$ = list_make1($1);
+                 }
+             | insert_target_lists ',' insert_target_list
+                 {
+                     $$ = lappend($1, $3);
+                 }
+         ;
+
  insert_target_list:
!             '(' insert_target_els ')'
!                 {
!                     $$ = $2;
!                 }
!         ;
!
! insert_target_els:
!             insert_target_el                         { $$ = list_make1($1); }
!             | insert_target_els ',' insert_target_el { $$ = lappend($1, $3); }
          ;

  insert_target_el:

Attachment

pgsql-patches by date:

Previous
From: Tom Lane
Date:
Subject: Re: [HACKERS] 8.2 features?
Next
From: Martijn van Oosterhout
Date:
Subject: Re: [HACKERS] pg_regress in C