In my application, I receive large blocks of external data that need
to be extracted, translated, and loaded into the database, and much of
that data duplicates what's already there.
Consequently, I would like to do efficient "bulk loading" of tables
using multi-row INSERT commands, ignoring rows that are already
present, where 'already present' is defined by a unique key constraint.
For instance, assume:
CREATE TABLE "weather_observations" ("id" serial primary key,
"station_id" integer, "observation_time" timestamp, "temperature_c"
float)
CREATE UNIQUE INDEX "observation_index" ON "weather_observations"
("station_id", "observation_time")
Now I'd like to be able to do multi-row inserts, ignoring duplicate
entries (specifically, those that would violate the unique constraint
of the index):
INSERT INTO weather_observations (station_id, observation_time, temperature_c) VALUES
  (2257, '2001-01-01', 22.5),
  (2257, '2001-01-02', 25.3);

INSERT INTO weather_observations (station_id, observation_time, temperature_c) VALUES
  (2257, '2001-01-02', 25.5), -- should be ignored: station/time already present
  (2257, '2001-01-03', 21.0);
What's the idiom for doing this in PostgreSQL?
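(Conceptually, what I'm hoping exists is something along these lines;
I don't know whether PostgreSQL actually has such a conflict-handling
clause, so treat this as a sketch of the desired behavior rather than
known-working syntax:)

INSERT INTO weather_observations (station_id, observation_time, temperature_c) VALUES
  (2257, '2001-01-02', 25.5),  -- would be skipped: station/time already present
  (2257, '2001-01-03', 21.0)
ON CONFLICT DO NOTHING;        -- hypothetical "skip duplicates" clause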
[As an aside, in SQLite, you can modify an INSERT statement with "OR
IGNORE" to achieve this.]
Thanks!