Thread: Loading large amounts of data in a SQL command

Loading large amounts of data in a SQL command

From: frank church
Date: 1/5/06 8:31 AM

I am loading lots of data via SQL into a database and wrapping it in
transactions to speed it up.

Some of the inserts fail, and a single failure causes the whole
transaction to fail. The query results are logged, so it is easy for me
to find the problem records.
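
The load looks roughly like this (simplified, with a made-up table and
values):

BEGIN;
INSERT INTO readings VALUES (1, 100);
INSERT INTO readings VALUES (2, 200);
-- a single bad row aborts the whole transaction:
INSERT INTO readings VALUES (3, 'oops');
-- every statement after it fails with "current transaction is
-- aborted", and the final COMMIT just rolls everything back
INSERT INTO readings VALUES (4, 400);
COMMIT;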

Is there a setting or feature which allows the same performance as
transactions, without causing the whole process to fail, like a delayed
update or write mechanism of some sort?

Frank



Re: Loading large amounts of data in a SQL command

From: Sean Davis
Date:


On 1/5/06 8:31 AM, "frank church" <pgsql@adontendev.net> wrote:

>
> I am loading lots of data via SQL into a database and wrapping it in
> transactions to speed it up.
>
> Some of the inserts fail, and a single failure causes the whole
> transaction to fail. The query results are logged, so it is easy for me
> to find the problem records.
>
> Is there a setting or feature which allows the same performance as
> transactions, without causing the whole process to fail, like a delayed
> update or write mechanism of some sort?

I typically load into a "loader" table (usually using COPY rather than
INSERTs) whose columns match the raw data rather than what you want the
final data to look like.  For example, if you have an integer field that
happens to contain a couple of non-numeric values ("127a", for example),
then load that column as a varchar.  Then you can use all of the regex
operators, coercion functions, etc. that Postgres has to offer to select
from the loader table into your "clean" production table.  This approach
is VERY fast, lets you do a lot of data munging very easily, and avoids
having to repeatedly "clean" the data before it will even insert into the
database where you can work with it.
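
As a rough sketch of what I mean (table names, column names, and the
file path are all made up, and this is untested):

-- Loader table: every column is varchar, so nothing gets rejected
-- on load.
CREATE TABLE loader (
    id      varchar,
    reading varchar
);

-- The production table the data is headed for.
CREATE TABLE readings (
    id      integer,
    reading integer
);

-- One COPY instead of thousands of INSERTs; the default text format
-- expects tab-delimited lines.
COPY loader FROM '/tmp/data.txt';

-- Move only the rows that really are numeric, casting as we go.
INSERT INTO readings (id, reading)
SELECT id::integer, reading::integer
FROM loader
WHERE id ~ '^[0-9]+$'
  AND reading ~ '^[0-9]+$';

-- Whatever is left behind ('127a' and friends) is easy to inspect.
SELECT *
FROM loader
WHERE id !~ '^[0-9]+$'
   OR reading !~ '^[0-9]+$';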

In fact, I am often faced with non-normalized data in one large spreadsheet.
I could use perl or some other client to produce nice inserts into the
appropriate tables, but I find it easier to load the whole thing and then
just do selects to grab the data and put it into normalized form.
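
For example (again with made-up names), splitting one flat table into a
customers table and an orders table is just a couple of statements:

-- The whole spreadsheet, loaded as-is.
CREATE TABLE flat (
    customer_name varchar,
    order_date    varchar,
    item          varchar
);

CREATE TABLE customers (
    id   serial PRIMARY KEY,
    name varchar UNIQUE
);

CREATE TABLE orders (
    customer_id integer REFERENCES customers (id),
    order_date  date,
    item        varchar
);

-- Pull the distinct customers out into their own table...
INSERT INTO customers (name)
SELECT DISTINCT customer_name FROM flat;

-- ...then build the orders by joining back on the name.
INSERT INTO orders (customer_id, order_date, item)
SELECT c.id, f.order_date::date, f.item
FROM flat f
JOIN customers c ON c.name = f.customer_name;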

Hope that helps,
Sean