> Are you wrapping all this in a transaction?
Yes; each transaction typically wraps 1 to 10 single-table, single-row
inserts and updates.
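For reference, the shape I mean is roughly this (table and column names are
made up for illustration):

```sql
BEGIN;
-- a handful of single-row statements per imported record
INSERT INTO people (name, email) VALUES ('Alice', 'alice@example.com');
UPDATE import_log SET row_count = row_count + 1 WHERE batch_id = 42;
COMMIT;
```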
> You're doing some dynamically generated selects as part of the
> "de-duping" process? They're probably the expensive bit. What
> do those queries tend to look like?
Without a doubt, this is the expensive bit.
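If it helps, I can run EXPLAIN on one of the generated selects to see what
plan it actually gets - something along these lines (the query is a made-up
example of the de-dupe shape, not the real generated SQL):

```sql
EXPLAIN
SELECT id
  FROM people
 WHERE email = 'j.smith@example.com'
   AND name ILIKE 'smith%';
```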
> Are you analysing the table periodically? If not, then you might
> have statistics based on an empty table, or default statistics, which
> might cause the planner to choose bad plans for those selects.
Now there's something I didn't know - I thought analysis and planning were
done for every select, and that the performance benefit of prepared
statements was to plan once, execute many. I can easily add a periodic
ANALYZE statement. I obviously misunderstood how to use ANALYZE properly - I
thought it just produced human-readable output. Do I understand correctly
that it can be used to get the SQL server to re-evaluate its plans based on
newer statistics - even for non-prepared queries?
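(For my own notes, assuming I now have this right - the two similarly named
commands do different things:)

```sql
-- refresh planner statistics for one table (omit the name for all tables)
ANALYZE people;

-- different thing despite the name: run the query and print its plan
EXPLAIN ANALYZE SELECT id FROM people WHERE email = 'j.smith@example.com';
```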
> Talking of which, are there indexes on the table? Normally you
> wouldn't have indexes in place during a bulk import, but if you're
> doing selects as part of the data load process then you'd be forcing
> sequential scans for every query, which would explain why it gets
> slower as the table gets bigger.
There are indexes for every obvious "where this = that" clause. I don't
believe they will be used for ILIKE expressions, though.
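From what I've read, there are a couple of possible workarounds (untested
here, and the table/index names are made up): an index on lower(column) with
the text_pattern_ops operator class for case-insensitive prefix matches via
LIKE, or the contrib pg_trgm module for arbitrary ILIKE patterns on recent
PostgreSQL versions:

```sql
-- prefix matches: index the lowercased value and query with LIKE
CREATE INDEX people_name_lower_idx ON people (lower(name) text_pattern_ops);
SELECT id FROM people WHERE lower(name) LIKE 'smith%';

-- arbitrary substring matches: trigram index from contrib/pg_trgm
CREATE INDEX people_name_trgm_idx ON people USING gin (name gin_trgm_ops);
SELECT id FROM people WHERE name ILIKE '%smith%';
```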
>
> Cheers,
> Steve