Hi,
I've got a number of files containing generic log data, and some of the lines
may be duplicated across files. I'm feeding them into a database using Perl
DBI and just ignoring any duplicate-record errors. This is fine for
day-to-day running when the data feeds in at a sensible rate; however, if I
want to feed in a load of old data in a short space of time, this approach
simply isn't quick enough.
I can modify the feeder script to generate formatted CSV files that I can
then COPY into a temporary table in the database. However, I'll then need
to select each record from the temporary table and insert it into the main
table, omitting duplicates.
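For the bulk-load step I'm picturing something like the following (the file
path is just a placeholder, and I'm assuming the temp table should mirror the
main table's columns; COPY options vary a bit between PostgreSQL versions):

    -- temp table with the same columns as messages, but no rows
    CREATE TEMP TABLE messages_tmp AS
        SELECT * FROM messages WHERE false;

    -- bulk-load the feeder's CSV output into the temp table
    COPY messages_tmp FROM '/tmp/old_logs.csv' WITH CSV;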
I guess I'd need something like this....
INSERT INTO messages (host, messageid, body, and, loads, more)
SELECT host, messageid, body, and, loads, more
FROM messages_tmp;
However, when that hit a duplicate, it would fail, wouldn't it?
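Presumably I could dodge the duplicates with a subquery, something like this
(assuming the unique key is on (host, messageid); the column list is still
just my placeholder):

    INSERT INTO messages (host, messageid, body, and, loads, more)
    SELECT host, messageid, body, and, loads, more
    FROM messages_tmp t
    WHERE NOT EXISTS (
        -- skip rows whose key already exists in the main table
        SELECT 1 FROM messages m
        WHERE m.host = t.host
          AND m.messageid = t.messageid
    );

I suppose I'd also need a SELECT DISTINCT (or GROUP BY) in there, in case
the temp table itself contains duplicates from overlapping files.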
Also, would this actually be any quicker than direct insertion from Perl
DBI?
--
Ian Cass