Re: Bulk Insert / Update / Delete - Mailing list pgsql-general

From Jason Godden
Subject Re: Bulk Insert / Update / Delete
Date
Msg-id 200308212005.23245.jasongodden@optushome.com.au
Whole thread Raw
In response to Re: Bulk Insert / Update / Delete  ("Philip Boonzaaier" <phil@cks.co.za>)
List pgsql-general
OK I understand where you are coming from now... I thought you meant large
batch updates.  You could always just write the data in delimited fromat to
disk and have a static SQL handler to do it but if you want to generate SQL
then in this instance use pl/pgsql:

CREATE OR REPLACE FUNCTION processdata (int4,int4,varchar) RETURNS VARCHAR AS
'
DECLARE
    P_ACCNUM ALIAS FOR $1;
    P_OCCUR ALIAS FOR $2;
    P_PHNUM ALIAS FOR $3;
    V_TYPE VARCHAR;
BEGIN
    IF (P_ACCNUM IS NULL OR P_OCCUR IS NULL OR P_PHNUM IS NULL) THEN
        RAISE EXCEPTION ''Error in parameters'';
    ELSE
        IF ((SELECT COUNT(*) FROM T_SUBSET WHERE "Account_Number" = P_ACCNUM AND
"S_Occurance" = P_OCCUR) = 1) THEN
            UPDATE T_SUBSET SET "Telephone_Number" = P_PHNUM WHERE "Account_Number" =
P_ACCNUM AND "S_Occurance" = P_OCCUR;
            V_TYPE = ''UPDATE'';
        ELSE
            INSERT INTO T_SUBSET VALUES (P_ACCNUM,P_POCCUR,P_PHNUM);
            V_TYPE = ''INSERT'';
        END IF;
    END IF;
    RETURN V_TYPE;
END;
' LANGUAGE 'plpgsql';

I haven't tested or debugged that but you get the idea.  Watch plpgsql's
string escaping (ie - double quoting etc...).  Read the manual about that.
You may also need to check for the existance of the key first but it will
error in the event of that anyway.

Now write to your text file:

SELECT processdata(100,1,'+27543643');
SELECT processdata(101,2,'+27544567');
etc....

and pipe it to psql.  It will return either UPDATE or INSERT depending on
which action it had to take.

Rgds,

Jason

On Fri, 22 Aug 2003 07:13 am, Philip Boonzaaier wrote:
> Hi Ron
>
> I could pursue the pre-compiler route, but I'm actually trying to avoid
> this. The more independant I can make the RDBMS, the better. Step 1 will be
> to put the data in a decently aranged set of tables, Step 2, generate the
> business rules and logic as functions in the database, Step 3, re-write the
> front ends as now 'logic free' data capture windows.
>
> We are using Client / Server tools with the COBOL. So, while the Clients
> are Windows based, the Server is UNIX or LINUX. What we have working so
> far, is the gerneration of the appropriate SQL commands, into a text file,
> with the time incorporated in the name. No more than 60 seconds later, we
> have a UNIX script, running in the background, which picks up the file, and
> executes the batch.
>
> T_SUBSET would be defined as having a FOREIGN KEY linking it to T_MAIN, but
> other that the Key columns, the data in T_SUBSET is unique, and does not
> exist in T_MAIN.
>
> For example, I would CREATE T_MAIN like this :
>
> CREATE TABLE T_MAIN
> (Account_Number INTEGER NOT NULL,
> Account_Name VARCHAR (40),
> PRIMARY KEY (Account_Number));
>
> And T_SUBSET would be :
>
> CREATE TABLE T_SUBSET
> (Account_Number INTEGER NOT NULL,
> S_Occurance INTEGER NOT NULL,
> Telephone_Number VARCHAR(15),
> PRIMARY KEY (Account_Number,S_Occurance),
> FOREIGN KEY(Account_Number)
> REFERENCES T_MAIN(Account_Number));
>
> The Record in COBOL would be simply Account, containing Account Number,
> Name, and Telephone Number Occurs 4 times.
>
> INSERTING a New Account would be no problem, I'd generate SQL INSERT for
> T_MAIN, and for each occurance of Telephone number, where this is not NULL,
> I'd generate an appropriate INSERT into T_SUBSET, corresponding S_Occurance
> with the Occurance number in the COBOL Record..
>
> So far so good.
>
> But when a user changes an existing record, I'd generate an UPDATE T_MAIN
> statement. Now comes the problem. Maybe when the data was first captured,
> only 1 telephone number was entered. Now with the update, phone number 1
> was changed, and phone number 2 was added. So I'd need to check for each of
> them - does this occurance exist ? YES - then Update it, NO, then INSERT
> IT. This would be possible if IF statements were supported in SQL, as I
> would then SELECT * FROM T_SUBSET WHERE Account_Number = <account number>
> AND S_Occurance = < occuarance number>, then IF EXISTS, UPDATE.... ELSE
> INSERT ....,
>
> Jason, your latest email seems to shed some light on this. Would I Then
> HAVE to put the source data into a text file to accomplish this ? Is there
> no INSERT INTO T_SUBSET VALUES (100,1,'+27543643') and then something like
> WHERE NOT EXISTS ( SELECT * FROM T_MAIN WHERE Account_Number = 100 AND
> S_Occurance=1); ???
>
> Any ideas ?
>
> Regards
>
>
> Phil
>
>
>
> ----- Original Message -----
> From: Ron Johnson <ron.l.johnson@cox.net>
> To: PgSQL General ML <pgsql-general@postgresql.org>
> Sent: Thursday, August 21, 2003 10:30 AM
> Subject: Re: [GENERAL] Bulk Insert / Update / Delete
>
> On Thu, 2003-08-21 at 14:37, Philip Boonzaaier wrote:
> > Hi Ron
> >
> > Yeah. I see what you are getting at. However, what about using a RULE ?
>
> This
>
> > seems to fit what I am trying to do.
>
> You mean a PostgreSQL RULE?
>
> > Let me tell you what I am doing at the moment. I am migrating a COBOL
>
> based
>
> > system to a RDBMS base, and eventually a Perl / Java / Whatever front
> > end.
>
> Well, gee, there are pre-compilers floating around that let you
> embed SQL in COBOL.  Unfortunately, non of them are OSS...
>
> > As Phase 1, I am simple replicating the data in PostgreSQL. I have
> > created tables identical to the 'records' in COBOL. When I INSERT in
> > COBOL, I
>
> create
>
> > an INSERT in SQL and action this. This is done externally from COBOL, and
>
> Externally from COBOL?  You mean in some lashed-together batch
> mode operation?
>
> > not using any embedded SQL features. Similarly with UPDATE. However, I
> > now want to create a Table based on a sub - set of information, in the
> > record in the first attempt, I am creating a table of Telephone numbers
> > for an account, which is currently defined as an array of 4 possibilities
> > within the account record. ). Now, when UPDATING the main row, I have no
> > idea if the sub - set of information is already in the database, or not.
> > So I want to, simply by writing a SQL statement, INSERT or UPDATE the
> > information in the database.
>
> Give the name T_SUBSET to this sub-set table, and T_MAIN to the
> main table. Original, eh?
>
> Thus, for a given tuple in the main row, some pseudo-code:
>
>   UPDATE t_main AS m
>   SET m.field1 = ss.field1,
>       m.field2 = ss.field2
>   FROM t_subset AS ss
>   WHERE m.field3 = ss.field3
>     AND m.field4 = ss.field4
>     AND ss.field3 = ??
>     AND ss.field4 = ?? ;
>
>   IF zero rows updated THEN
>     INSERT INTO T_MAIN VALUES (blah, blah, blah);
>   END IF
>
> If the number of parameters that you'd need to send is a reasonable
> amount, then you could encapsulate the code into a trigger, thus
> simplifying the Perl / Java / Whatever code.
>
> > Regards
> >
> > Phil
> >
> > ----- Original Message -----
> > From: Ron Johnson <ron.l.johnson@cox.net>
> > To: PgSQL General ML <pgsql-general@postgresql.org>
> > Sent: Thursday, August 21, 2003 9:01 AM
> > Subject: Re: [GENERAL] Bulk Insert / Update / Delete
> >
> > On Thu, 2003-08-21 at 13:33, Philip Boonzaaier wrote:
> > > Hi Ron
> > >
> > > That is just the point. If Postgres cannot tell me which records exist
>
> and
>
> > > need updating, and which do not and need inserting, then what can ?
> > >
> > > In the old world of indexed ISAM files it is very simple - try to get
>
> the
>
> > > record ( row ) by primary key. If it is there, update it, if it is not,
> > > insert it.
> >
> > SQL (and, by extension, the relational DBMS) isn't magic.  It just
> > makes it easier to do what we did is the "old world of indexed ISAM"
> > files.
> >
> > > Now, one can do this with a higher level language and SQL combined, but
>
> is
>
> > > SQL that weak ?
> >
> > No, not weak.  See below.
> >
> > > What happens when you merge two tables ? Surely SQL must somehow
>
> determine
>
> > > what needs INSERTING and what needs UPDATING.... Or does one try to
>
> merge,
>
> > > get a failure, an resort to writing something in Perl or C ?
> >
> > In this case, SQL will make it easier to tell you what's there,
> > and, if the "comparison data" is loaded into a separate table,
> > what's not there.
> >
> > So, yes, you will almost certainly need an "outer" language (C,
> > Perl, Python, Tck/Tk, Java, etc).  However, you'll need less
> > lines of the outer language if you use SQL.
> >
> > For example, if you use dumb old ISAM files, the most you can do
> > is specify which index key you want the file sorted on before fetching
> > *each* *row* *in* *the* *file*, and tough noogies if there are
> > 100M rows in it.  And then you must code in IF statements to
> > skip over any records that don't meet your criteria.  This is
> > just adds more SLOC, thereby increasing the likelihood of bugs.
> >
> > With SQL, however, you embed the winnowing criteria as predicates
> > in the WHERE clause, or maybe even the FROM clause, if you need
> > certain kinds of sub-selects.
> >
> > If you think in terms of guns, SQL is a machine gun, thus giving
> > great firepower/usefullness to the programmer.  However, it doesn't
> > shoot silver bullets...
> >
> > Make any sense?
> >
> > > Please help to un - confuse me !
> > >
> > > Regards
> > >
> > > Phil
> > > ----- Original Message -----
> > > From: Ron Johnson <ron.l.johnson@cox.net>
> > > To: PgSQL General ML <pgsql-general@postgresql.org>
> > > Sent: Tuesday, August 19, 2003 6:45 PM
> > > Subject: Re: [GENERAL] Bulk Insert / Update / Delete
> > >
> > > On Tue, 2003-08-19 at 22:03, Philip Boonzaaier wrote:
> > > > Hi Jason
> > > >
> > > > Thanks for your prompt response.
> > > >
> > > > I'm pretty new to SQL, so please excuse the following rather stupid
> > >
> > > question
> > >
> > > > How do I use the (NOT) EXISTS or (NOT) IN clause ? Would it be
>
> feasible,
>
> > > > using your suggestion, to simply put in two SQL statements, in the
>
> same
>
> > > > query - first UPDATE when EXISTS, then INSERT when NOT EXISTS, to
> > >
> > > accomplist
> > >
> > > > this in one go ?
> > > >
> > > > Regards
> > > >
> > > > Phil
> > >
> > > How will you which records were updated, thus able to know which need
> > > to be inserted?
> > >
> > > A temporary table and pl/pgsql should do the trick.
> > >
> > > > ----- Original Message -----
> > > > From: Jason Godden <jasongodden@optushome.com.au>
> > > > To: Philip Boonzaaier <phil@cks.co.za>;
> > > > <pgsql-general@postgresql.org> Sent: Tuesday, August 19, 2003 4:42 PM
> > > > Subject: Re: [GENERAL] Bulk Insert / Update / Delete
> > > >
> > > >
> > > > Hi Philip,
> > > >
> > > > Pg is more ansi compliant than most (GoodThing (TM)).  You can use
> > > > the 'when'
> > > > conditional but not to do what you need.  If I understand you
>
> correclty
>
> > > you
> > >
> > > > should be able to acheive the same result using two seperate queries
>
> and
>
> > > the
> > >
> > > > (NOT) EXISTS or (NOT) IN clause.  Failing that have a look at the
> > > > fine
> > >
> > > docs
> > >
> > > > on pl/pgsql and other postgresql procedural languages which allow you
>
> to
>
> > > use
> > >
> > > > loops and conditional statements like 'if'.
> > > >
> > > > Rgds,
> > > >
> > > > J
> > > >
> > > > On Wed, 20 Aug 2003 12:21 pm, Philip Boonzaaier wrote:
> > > > > I want to be able to generate SQL statements that will go through a
> >
> > list
> >
> > > > of
> > > >
> > > > > data, effectively row by row, enquire on the database if this
> > > > > exists
> >
> > in
> >
> > > > the
> > > >
> > > > > selected table- If it exists, then the colums must be UPDATED, if
>
> not,
>
> > > > they
> > > >
> > > > > must be INSERTED.
> > > > >
> > > > > Logically then, I would like to SELECT * FROM <TABLE>
> > > > > WHERE ....<Values entered here>, and then IF FOUND
> > > > > UPDATE <TABLE> SET .... <Values entered here> ELSE
> > > > > INSERT INTO <TABLE> VALUES <Values entered here>
> > > > > END IF;
> > > > >
> > > > > The IF statement gets rejected by the parser. So it would appear
>
> that
>
> > > > > PostgreSQL does not support an IF in this type of query, or maybe
>
> not
>
> > at
> >
> > > > > all.
> > > > >
> > > > > Does anyone have any suggestions as to how I can achieve this ?
>
> --
> -----------------------------------------------------------------
> Ron Johnson, Jr. ron.l.johnson@cox.net
> Jefferson, LA USA
>
> "Whatever may be the moral ambiguities of the so-called
> demoratic nations and however serious may be their failure to
> conform perfectly to their democratic ideals, it is sheer moral
> perversity to equate the inconsistencies of a democratic
> civilization with the brutalities which modern tyrannical states
> practice."
> Reinhold Nieburhr, ca. 1940
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>
> This message is privileged and confidential and intended for the addressee
> only. If you are not the intended recipient you may not disclose, copy or
> in any way use or publish the content hereof, which is subject to
> copyright.If you have received this in error, please destroy the original
> message and contact us at postmaster@cks.co.za. Any views expressed in this
> message are those of the individual sender, except where the sender
> specifically states them to be the view of Computerkit Retail Systems, its
> subsidiaries or associates. Please note that the recipient must scan this
> e-mail and attachments for  viruses. We accept no liability of whatever
> nature for any loss, liability,damage or expense resulting directly or
> indirectly from this transmission of this message and/or attachments.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend


pgsql-general by date:

Previous
From: "Philip Boonzaaier"
Date:
Subject: Re: Bulk Insert / Update / Delete
Next
From: "Shridhar Daithankar"
Date:
Subject: Re: Buglist