Re: ECGP - varchar in struct? - Mailing list pgsql-interfaces

From William West
Subject Re: ECGP - varchar in struct?
Date
Msg-id OFD8B3DF27.D1F1608E-ON85256C14.0004FB54@com
Whole thread Raw
In response to ECGP - varchar in struct?  ("William West" <wwest@csc.com>)
Responses Re: ECGP - varchar in struct?
List pgsql-interfaces
I really appreciate your time. I know a lot of what I am saying
must sound silly, due to how low on the learning curve I am.

> exec sql typedef sf struct foo {...};
>
> exec sql begin declare section;
> sf a;
> exec sql end declare section;


This looks promising for what I am trying to do. Is
'exec sql typedef' ecpg specific? I do not remember seeing
it called out as anything special in the books I have looked at?

I have looked at Bruce Momjian's "PostgreSQL - Introduction
and Concepts" book, the Geshwinde-Shonig "Developers Handbook"
put out by SAMS, and the 'PostgreSQL 7.2 Programmers Guide'
from the postgesql.org website.  Anyway, I tried:

exec sql typedef sf struct foo {int intOne; int intTwo};

and I get:

ERROR: parse error, unexpected IDENT at or near "typedef"

from ecpg? I am using ecpg 2.9.0 (Postgres 7.2.1).

===============

> exec sql begin declare section;
> struct foo {...} a;
> exec sql end declare section;

I can see that I have a very fundamental understanding problem
or misconception of some kind, in how one can/ should use structures
inside of an ecpg declare section, and appreciate your patience
(and suppose I am giving you and others a good laugh). I know I am
being prejudiced by what a structure inside a declare section
means to Pro*C, which is "here is the layout of some columns
that I intend to get from or write to a row".

It seems that ecpg does not have any such understanding about
the meaning of a struct and its members and that is (I'm sure)
one of my biggest problems. To see what I am up against, I tried to
make a struct defining two  varchar columns, per your example, but
then I could not for the life of me figure out how to use the struct.

Here is what I setup:

exec sql begin declare section;
struct foo {varchar col1[15]; varchar col2[20];} bar;
exec sql end declare section;

and then made a simple two varchar columns table:

exec sql create table
mytable (col1 varchar(15), col2 varchar(20));

and then checked that this gave me a new row containing Literal1
in col1 and Literal2 in col2:

exec sql insert into mytable values ('Literal1', 'Literal2');

That worked, so then I tried to make ecpg work the way Pro*C works
(bear with me). I setup 'bar' in the declarations section by:

sprintf(bar.col1.arr, "Variable1")
bar.col1.len = strlen(bar.col1.arr);
sprintf(bar.col2.len, "Variable1")
bar.col2.len = strlen(bar.col2.arr);

then tried to load a row with Variable1 in col1 and Variable2
in col2 by:

exec sql insert into mytable values (:bar);

Pro*C would take this to mean to put the present values of
bar.col1 and bar.col2 into columns 1 and 2 of a new row
added to mytable. This builds cleanly in ecpg, seems to emit
a sensible ECPGdo() call, but gives "-201, too many arguments"
at run time. I can't figure out what the library wants ... ECPGdo()
is emitted with two vectors, one to each of the struct members?

I then tried to figure out how to mention the struct members
individually in the values() clause:

exec sql insert into mytable values (:bar.col1.arr, :bar.col2.arr);

But this causes ecpg to issue the error "'bar.col1.arr' is not
declared". I can't figure out what ecpg would want, to be
able to refer to the individual members of the struct?

=============================

So as you can see, I surely have a some kind of very big gap
in my understanding of how one would/ could use structures in
the declare section of ecpg in ways that even remotely
resemble how the existing Pro*C based application now uses
structures in the declare section.

To make my job even harder, Pro*C uses its understanding
of "struct is row"/ "member of struct is column" in order to
support arrays-of-rows exchanges between the front-end
and the back-end.

Pro*C supports an "EXEC SQL FOR :batchsize" construct that
allows you to INSERT, UPDATE, or DELETE many rows
with one "EXEC SQL".  The current application, to exploit this
capability, is full of declarations that are roughly equivalent to:

exec sql begin declare section;
struct foo {varchar col1[15]; varchar col2[20];} bar [ROWS];
exec sql end declare section;

where 'ROWS' establishes the maximum batch size. The Pro*C
understanding of the meaning of a structure in the declare section
(as signifying a row) and its members (as signifying a column) is
fundamental to being able to declare space for batched operations
via C-Language array declaration syntax.

Claims have been made that this can increase an Oracle application's
performance by a factor of 100 or more. Indeed, with our application,
using a modest ':batchsize' (ROWS value), we observe an increase
in performance of a factor of 10 or more over one front-end/ back-end
exchange per insert/ delete/ update.

I know this is already *VERY* long, and if you got this far, thanks.

If you have not lost patience, here is a "simplified as much as possible"
example of how the application is presently organized, in an attempt to
show why the application needs for 'row shaping' to be independent
of 'declare space for row':

/* begin row_shape.h */

typedef struct
{  varchar col1 [15];     ...  varchar coln [140];
} row_shape_t;

/* end row shape .h */

/* begin multi_row_insert.pc */


=== DB INTERFACE COMPILE UNIT ===

#define ROWS_N 50

exec sql include row_shape.h

int
multi_row_insert (msg_t *msg_p)
{
  row_shape_t *row_p;
  exec sql begin declare section;  int row_n;  row_shape_t rowray [ROWS_N];  exec sql end declare section;
  for (row_n = 0, row_p = rowray; row_n < ROWS_N; row_n++, row_p++)  {      if (!load_row_from_msg (msg_p, row_p,
row_n))   break;  }
 
  exec sql begin transaction;
  exec sql whenever sqlerror goto errexit;  /* inserts 'row_n' rows into "table_in_question" */  exec sql for :row_n
insertinto table_in_question     values (:rowray);
 
  exec sql commit work;
  return row_n;

errexit:  exec sql rollback work;  return    -1;
}

/* end multi_row_insert.pc */

=== MESSAGE HANDLING COMPILE UNIT ===

/* begin load_row_from_msg.pc */

exec sql include row_shape.h

/* returns TRUE if message loaded another row_shape_t */
/* returns FALSE if message has no more rows */

bool
load_row_from_msg (msg_t *msg_p, row_shape_t *row_p, int row_n)
{
   /* Message indicates how many rows it has in msg_p->rows_n */   if (msg_p->rows_n < row_n)    return    FALSE;
   memset (row_p, 0, sizeof (*row_p));
  /* Transfers message data for each column of the   *  current row into the current row_shape_t   */
memcpy(row_p->col1.arr,         msg_p[row_n]col1_data,          msg_p[row_n]col1_data_len);   row_p->col1.len =
msg_p[row_n]col1_data_len);    ....   memcpy(row_p->coln.arr,          msg_p[row_n]coln_data,
msg_p[row_n]coln_data_len);  row_p->coln.len = msg_p[row_n]coln_data_len);
 
   return TRUE
}

/* end load_row_from_msg.pc */

============

Lack of the "exec sql for :batchsize" capability is a big problem to
overcome,
and inability to treat structs as 'row shapes' is another big problem to
overcome ... partly because of the inability to pass rows by reference
across
different compile units and partly because there is no longer any way to
clear
all columns in row (prior to insert) by using 'memset()'.

The combined effect of no longer being able to pass-by-reference, needing
to
perform one front-end to back-end  exchange for each row insert/ update/
delete,
and needing to clear each and every column buffer separately puts us in
danger
of being unable to handle the necessary realtime message arrival rates.

This fear of the effects on performance, along with the scope of
application-internal
interface changes that seem to be needed, is what is making me inclined to
consider,
much as I hate it, going to libpq and/ or libpqeasy directly.

I really do need to get going due to deadline pressures, and would hate to
make
a dumb/ bad decision about the basic approach just because of something I
am
missing or do not correctly understand.



pgsql-interfaces by date:

Previous
From: "Rick Gregory"
Date:
Subject: Re: Is this valid?
Next
From: Michael Meskes
Date:
Subject: Re: ECGP - varchar in struct?