Re: ECGP - varchar in struct? - Mailing list pgsql-interfaces
From | William West |
---|---|
Subject | Re: ECGP - varchar in struct? |
Date | |
Msg-id | OFD8B3DF27.D1F1608E-ON85256C14.0004FB54@com Whole thread Raw |
In response to | ECGP - varchar in struct? ("William West" <wwest@csc.com>) |
Responses |
Re: ECGP - varchar in struct?
|
List | pgsql-interfaces |
I really appreciate your time. I know a lot of what I am saying must sound silly, due to how low on the learning curve I am. > exec sql typedef sf struct foo {...}; > > exec sql begin declare section; > sf a; > exec sql end declare section; This looks promising for what I am trying to do. Is 'exec sql typedef' ecpg specific? I do not remember seeing it called out as anything special in the books I have looked at? I have looked at Bruce Momjian's "PostgreSQL - Introduction and Concepts" book, the Geshwinde-Shonig "Developers Handbook" put out by SAMS, and the 'PostgreSQL 7.2 Programmers Guide' from the postgesql.org website. Anyway, I tried: exec sql typedef sf struct foo {int intOne; int intTwo}; and I get: ERROR: parse error, unexpected IDENT at or near "typedef" from ecpg? I am using ecpg 2.9.0 (Postgres 7.2.1). =============== > exec sql begin declare section; > struct foo {...} a; > exec sql end declare section; I can see that I have a very fundamental understanding problem or misconception of some kind, in how one can/ should use structures inside of an ecpg declare section, and appreciate your patience (and suppose I am giving you and others a good laugh). I know I am being prejudiced by what a structure inside a declare section means to Pro*C, which is "here is the layout of some columns that I intend to get from or write to a row". It seems that ecpg does not have any such understanding about the meaning of a struct and its members and that is (I'm sure) one of my biggest problems. To see what I am up against, I tried to make a struct defining two varchar columns, per your example, but then I could not for the life of me figure out how to use the struct. Here is what I setup: exec sql begin declare section; struct foo {varchar col1[15]; varchar col2[20];} bar; exec sql end declare section; and then made a simple two varchar columns table: exec sql create table mytable (col1 varchar(15), col2 varchar(20)); and then checked that this gave me a new row containing Literal1 in col1 and Literal2 in col2: exec sql insert into mytable values ('Literal1', 'Literal2'); That worked, so then I tried to make ecpg work the way Pro*C works (bear with me). I setup 'bar' in the declarations section by: sprintf(bar.col1.arr, "Variable1") bar.col1.len = strlen(bar.col1.arr); sprintf(bar.col2.len, "Variable1") bar.col2.len = strlen(bar.col2.arr); then tried to load a row with Variable1 in col1 and Variable2 in col2 by: exec sql insert into mytable values (:bar); Pro*C would take this to mean to put the present values of bar.col1 and bar.col2 into columns 1 and 2 of a new row added to mytable. This builds cleanly in ecpg, seems to emit a sensible ECPGdo() call, but gives "-201, too many arguments" at run time. I can't figure out what the library wants ... ECPGdo() is emitted with two vectors, one to each of the struct members? I then tried to figure out how to mention the struct members individually in the values() clause: exec sql insert into mytable values (:bar.col1.arr, :bar.col2.arr); But this causes ecpg to issue the error "'bar.col1.arr' is not declared". I can't figure out what ecpg would want, to be able to refer to the individual members of the struct? ============================= So as you can see, I surely have a some kind of very big gap in my understanding of how one would/ could use structures in the declare section of ecpg in ways that even remotely resemble how the existing Pro*C based application now uses structures in the declare section. To make my job even harder, Pro*C uses its understanding of "struct is row"/ "member of struct is column" in order to support arrays-of-rows exchanges between the front-end and the back-end. Pro*C supports an "EXEC SQL FOR :batchsize" construct that allows you to INSERT, UPDATE, or DELETE many rows with one "EXEC SQL". The current application, to exploit this capability, is full of declarations that are roughly equivalent to: exec sql begin declare section; struct foo {varchar col1[15]; varchar col2[20];} bar [ROWS]; exec sql end declare section; where 'ROWS' establishes the maximum batch size. The Pro*C understanding of the meaning of a structure in the declare section (as signifying a row) and its members (as signifying a column) is fundamental to being able to declare space for batched operations via C-Language array declaration syntax. Claims have been made that this can increase an Oracle application's performance by a factor of 100 or more. Indeed, with our application, using a modest ':batchsize' (ROWS value), we observe an increase in performance of a factor of 10 or more over one front-end/ back-end exchange per insert/ delete/ update. I know this is already *VERY* long, and if you got this far, thanks. If you have not lost patience, here is a "simplified as much as possible" example of how the application is presently organized, in an attempt to show why the application needs for 'row shaping' to be independent of 'declare space for row': /* begin row_shape.h */ typedef struct { varchar col1 [15]; ... varchar coln [140]; } row_shape_t; /* end row shape .h */ /* begin multi_row_insert.pc */ === DB INTERFACE COMPILE UNIT === #define ROWS_N 50 exec sql include row_shape.h int multi_row_insert (msg_t *msg_p) { row_shape_t *row_p; exec sql begin declare section; int row_n; row_shape_t rowray [ROWS_N]; exec sql end declare section; for (row_n = 0, row_p = rowray; row_n < ROWS_N; row_n++, row_p++) { if (!load_row_from_msg (msg_p, row_p, row_n)) break; } exec sql begin transaction; exec sql whenever sqlerror goto errexit; /* inserts 'row_n' rows into "table_in_question" */ exec sql for :row_n insertinto table_in_question values (:rowray); exec sql commit work; return row_n; errexit: exec sql rollback work; return -1; } /* end multi_row_insert.pc */ === MESSAGE HANDLING COMPILE UNIT === /* begin load_row_from_msg.pc */ exec sql include row_shape.h /* returns TRUE if message loaded another row_shape_t */ /* returns FALSE if message has no more rows */ bool load_row_from_msg (msg_t *msg_p, row_shape_t *row_p, int row_n) { /* Message indicates how many rows it has in msg_p->rows_n */ if (msg_p->rows_n < row_n) return FALSE; memset (row_p, 0, sizeof (*row_p)); /* Transfers message data for each column of the * current row into the current row_shape_t */ memcpy(row_p->col1.arr, msg_p[row_n]col1_data, msg_p[row_n]col1_data_len); row_p->col1.len = msg_p[row_n]col1_data_len); .... memcpy(row_p->coln.arr, msg_p[row_n]coln_data, msg_p[row_n]coln_data_len); row_p->coln.len = msg_p[row_n]coln_data_len); return TRUE } /* end load_row_from_msg.pc */ ============ Lack of the "exec sql for :batchsize" capability is a big problem to overcome, and inability to treat structs as 'row shapes' is another big problem to overcome ... partly because of the inability to pass rows by reference across different compile units and partly because there is no longer any way to clear all columns in row (prior to insert) by using 'memset()'. The combined effect of no longer being able to pass-by-reference, needing to perform one front-end to back-end exchange for each row insert/ update/ delete, and needing to clear each and every column buffer separately puts us in danger of being unable to handle the necessary realtime message arrival rates. This fear of the effects on performance, along with the scope of application-internal interface changes that seem to be needed, is what is making me inclined to consider, much as I hate it, going to libpq and/ or libpqeasy directly. I really do need to get going due to deadline pressures, and would hate to make a dumb/ bad decision about the basic approach just because of something I am missing or do not correctly understand.
pgsql-interfaces by date: