On 23/05/11 12:09, Aren Cambre wrote:
> Also, thanks for the advice on batching my queries. I am now using
> very efficient bulk data read and write methods for Postgres.
>
> My program bulk reads 100,000 rows, processes those rows (during which
> it does a few SELECTs), and then writes 100,000 rows at a time.
>
> It cycles through this until it has processed all 12,000,000 rows.
>
> This, plus the parallelism fix, will probably convert this 30-hour
> program to a <2-hour program.

It's always good to hear when these things work out. Thanks for
reporting back.

Using the set-based nature of relational databases to your advantage,
writing smarter queries that do more work server-side with fewer
round-trips, and effective batching can make a huge difference.
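
For anyone finding this thread later, here is a rough sketch of the two
ideas, not the actual program discussed above. It assumes Python's
built-in sqlite3 module as a stand-in for Postgres so that it runs
anywhere; the tables (src, dst, dst_set), the process() function and the
tiny batch size are all made up for illustration. The first function
reads, processes and writes in batches through the DB-API fetchmany()
and executemany() calls; the second does the same work in one set-based
statement, which is usually preferable when the per-row logic can be
expressed in SQL.

```python
import sqlite3

BATCH_SIZE = 3  # tiny for the demo; the program in this thread used 100,000


def make_demo_db(n=10):
    """Build an in-memory database with hypothetical demo tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src (id INTEGER PRIMARY KEY, value INTEGER)")
    conn.execute("CREATE TABLE dst (id INTEGER PRIMARY KEY, value INTEGER)")
    conn.execute("CREATE TABLE dst_set (id INTEGER PRIMARY KEY, value INTEGER)")
    conn.executemany(
        "INSERT INTO src (id, value) VALUES (?, ?)",
        [(i, i) for i in range(n)],
    )
    conn.commit()
    return conn


def process(row):
    """Hypothetical per-row work: double the value."""
    row_id, value = row
    return (row_id, value * 2)


def copy_in_batches(conn, batch_size=BATCH_SIZE):
    """Read, process and write one batch at a time; return the batch count."""
    read_cur = conn.cursor()
    write_cur = conn.cursor()
    read_cur.execute("SELECT id, value FROM src ORDER BY id")
    batches = 0
    while True:
        rows = read_cur.fetchmany(batch_size)  # one bulk read per batch
        if not rows:
            break
        # One bulk write per batch instead of one INSERT call per row.
        write_cur.executemany(
            "INSERT INTO dst (id, value) VALUES (?, ?)",
            [process(r) for r in rows],
        )
        batches += 1
    conn.commit()
    return batches


def copy_set_based(conn):
    """Same result in a single statement, with no rows leaving the database."""
    conn.execute("INSERT INTO dst_set (id, value) SELECT id, value * 2 FROM src")
    conn.commit()


if __name__ == "__main__":
    demo = make_demo_db()
    print("batches:", copy_in_batches(demo))
    copy_set_based(demo)
    print(demo.execute("SELECT COUNT(*) FROM dst_set").fetchone()[0], "rows")
```

On a real Postgres server the bulk paths would more likely be COPY or
multi-row INSERTs through the driver, and the best batch size depends on
row width and available memory, so treat the numbers here as
placeholders.
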
--
Craig Ringer