Thread: Fast load
We have 73 databases, two dozen with hundreds of thousands to millions of
records, with lengths in the 500-byte range. I'm planning to convert them
from Btrieve to PostgreSQL.

Of course, I want the highest reasonable speed so that the conversion can
be completed - say - in a week-end.

My initial take is to create a tab-delimited file and use this as input to
the \copy command in psql.

Another option might be to feed them directly to the back end.

I'm a C programmer, so writing the necessary programs for the conversion
and load is not a problem.

Any pointers?

Not-wanting-to-go-down-a-garden-pathingly,
Ted
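[For illustration, each exported record would be one line in COPY's default
text format: fields separated by literal tab characters, NULL written as \N,
and any literal tab, newline, or backslash inside a field escaped with a
backslash. A made-up row with an id, a name, and a missing date might come
out roughly as:

    10001<TAB>Smith, John<TAB>\N

where <TAB> stands for a single literal tab character.]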
> We have 73 databases, two dozen with hundreds of thousands to millions of
> records, with lengths in the 500-byte range. I'm planning to convert them
> from Btrieve to PostgreSQL.
>
> Of course, I want the highest reasonable speed so that the conversion can
> be completed - say - in a week-end.
>
> My initial take is to create a tab-delimited file and use this as input to
> the \copy command in psql.
>
> Another option might be to feed them directly to the back end.
>
> I'm a C programmer, so writing the necessary programs for the conversion
> and load is not a problem.
>
> Any pointers?

Use \copy. It's very fast at importing tab-delimited text files. Just make
sure that what you export from the old database is very "clean", meaning no
quotes around text fields, no tabs within a field, that sort of thing.

73 is a lot of databases, but it will probably take you more time to
actually type the \copy commands than it will for postgres to copy the
data.

-- 
Jeremy [jeremy@wellsgaming.com]
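[A minimal sketch of the \copy step described above, with made-up table and
file names; the default delimiter is tab:

    \copy customers from '/data/export/customers.tab'
    \copy orders from '/data/export/orders.tab'

\copy reads the file on the client side and streams it to the server, so the
export files do not have to live on the database server itself.]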
Jeremy:

Thanks for the quick reply. That's REALLY encouraging!

We're converting them from Btrieve 6.0 to Btrieve 7.0, and it's a real slow
process for the big ones. One of the larger Btrieve files blew up the
conversion program (from the vendor, no less). The file has 1,296,418
446-byte records. It's been running over two hours and it's at 916,200
records.

Every so often it stops to take a big, long rest. Stops for about 10
minutes, and then merrily goes on its way for a while longer. I'm surmising
that this is kswapd doing its thing.
Ted Rolle writes:

> We have 73 databases, two dozen with hundreds of thousands to millions of
> records, with lengths in the 500-byte range. I'm planning to convert them
> from Btrieve to PostgreSQL.
>
> Of course, I want the highest reasonable speed so that the conversion can
> be completed - say - in a week-end.

The fastest possible way to get data loaded into PostgreSQL is to create a
tab-delimited file and feed it directly to the backend with the COPY
command. To speed things up even more, turn off fsync (-F), create the
indexes after loading, and the same with triggers, if you have any. I'd
like to think that all of this should take significantly less than a
weekend. ;-)

Formatting the data into the right format for COPY can be done with your
favourite text mashing tools.

-- 
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter
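[A rough sketch of that sequence, with hypothetical table, column, and path
names; the postmaster -o option hands -F to each backend, which disables
fsync for the duration of the load:

    postmaster -o -F -D /usr/local/pgsql/data &

    -- then in psql, load first, index afterwards:
    CREATE TABLE accounts (
        acctno   integer,
        name     text,
        balance  numeric(12,2)
    );
    COPY accounts FROM '/data/export/accounts.tab';
    CREATE INDEX accounts_acctno_idx ON accounts (acctno);

Unlike \copy, the backend COPY reads the file on the server, so it has to be
readable there and requires database superuser rights. Once everything is
loaded, restart without -F so fsync is back on.]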