Thread: equivalent of sqlload?
Oracle has a facility for loading in massive data files, sqlload I think
is the name. I'd like to load some big data into PostgreSQL 6.4 for
testing purposes. Is there a way built in, or would a Perl or C hack be
the way to go?

John Holland
At 16:45 +0200 on 25/11/98, John Holland wrote:

> Oracle has a facility for loading in massive data files, sqlload I think
> is the name. I'd like to load some big data into PostgreSQL 6.4 for
> testing purposes. Is there a way built in, or would a Perl or C hack be
> the way to go?

Use the COPY command. It is preferable not to define indices (or to drop
the indices) on the table, then do the massive load with COPY, and then
define the indices.

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma
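[Editor's note: a minimal sketch of the drop-load-recreate workflow described above. The table name "bigdata", the index name, the column "id", the file path, and the delimiter are placeholders, not taken from the thread.]

  -- drop the index before the bulk load (assumed index name)
  DROP INDEX bigdata_id_idx;

  -- load the data file; path and delimiter are assumptions
  COPY bigdata FROM '/tmp/bigdata.txt' USING DELIMITERS '|';

  -- recreate the index once the data is in
  CREATE INDEX bigdata_id_idx ON bigdata (id);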
Hello John,

Wednesday, November 25, 98, you wrote:

JH> Oracle has a facility for loading in massive data files, sqlload I think
JH> is the name. I'd like to load some big data into PostgreSQL 6.4 for
JH> testing purposes. Is there a way built in, or would a Perl or C hack be
JH> the way to go?
JH> John Holland

Use:

  copy [binary] classname [with oids] to|from 'filename'|stdin|stdout [using delimiters 'delim']

or:

  \copy classname from filename

-Jose'-
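[Editor's note: a concrete illustration of the two forms above. The table name "bigdata" and the file paths are placeholders only.]

  -- server-side load: the backend reads the file, so it must be run as the database superuser
  COPY bigdata FROM '/tmp/bigdata.txt' USING DELIMITERS '|';

  -- client-side load through psql: the file is read on the machine running psql
  \copy bigdata from /tmp/bigdata.txt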
I am running v6.3.2 under Linux and have found that the "copy" command
works only for small amounts of data. When trying to "copy" several
thousand records, I notice that system RAM and swap space continue to get
eaten until there is no further memory available. "psql" then fails. What
remains is a .../pgdata/base/XYZ file system with the table being copied
into. That table may be several (tens, hundreds) of megabytes in size,
but a "psql -d XYZ -c 'select count(*) from table'" will only return a
zero count.

Now... I have broken the "copy" process down into smaller chunks and make
multiple calls to "copy". I have a total of about 5.4 million records and
the job isn't done yet... my Pentium 433 has been working on this copy for
over 24 hours. I don't know if there are any changes that can be made to
speed this type of process up, but this is definitely a black mark.

mike

Dr Michael A. Koerber
MIT/Lincoln Laboratory
781-981-3250

On Wed, 25 Nov 1998, Herouth Maoz wrote:

> At 16:45 +0200 on 25/11/98, John Holland wrote:
>
> > Oracle has a facility for loading in massive data files, sqlload I think
> > is the name. I'd like to load some big data into PostgreSQL 6.4 for
> > testing purposes. Is there a way built in, or would a Perl or C hack be
> > the way to go?
>
> Use the COPY command. It is preferable not to define indices (or to drop
> the indices) on the table, then do the massive load with COPY, and then
> define the indices.
>
> Herouth
>
> --
> Herouth Maoz, Internet developer.
> Open University of Israel - Telem project
> http://telem.openu.ac.il/~herutma
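[Editor's note: a sketch of the chunked approach Michael describes. The chunk file names and table name are assumptions; the data file would have been split into pieces outside psql beforehand.]

  -- each chunk is a separate COPY, so each runs as its own transaction
  COPY bigdata FROM '/tmp/chunk_aa.txt';
  COPY bigdata FROM '/tmp/chunk_ab.txt';
  COPY bigdata FROM '/tmp/chunk_ac.txt';
  -- ...and so on for the remaining chunks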
At 21:01 +0200 on 25/11/98, Michael A. Koerber wrote:

> Now... I have broken the "copy" process down into smaller chunks and make
> multiple calls to "copy". I have a total of about 5.4 million records and
> the job isn't done yet... my Pentium 433 has been working on this copy for
> over 24 hours.

Try putting all the separate COPYs in one transaction:

  BEGIN TRANSACTION;
  COPY...
  COPY...
  COPY...
  END;

Without the surrounding transaction, each COPY is a transaction of its own
and carries its own transaction overhead. I don't know how much time this
will save, though.

Are you sure you dropped all the indices? When you declare a primary key,
it creates a unique index, so watch out for that as well.

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma
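[Editor's note: the skeleton above, filled in with the same placeholder names used earlier; the table, file, and index names are assumed, not from the thread.]

  BEGIN TRANSACTION;
  COPY bigdata FROM '/tmp/chunk_aa.txt';
  COPY bigdata FROM '/tmp/chunk_ab.txt';
  COPY bigdata FROM '/tmp/chunk_ac.txt';
  END;

  -- recreate any indices only after the load has committed
  CREATE INDEX bigdata_id_idx ON bigdata (id);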
> > Now... I have broken the "copy" process down into smaller chunks and make
> > multiple calls to "copy". I have a total of about 5.4 million records and
> > the job isn't done yet... my Pentium 433 has been working on this copy for
> > over 24 hours.

I have seen pretty much the same effect. To test this, I set up a script
that copied in 10000 records at a time. The time it took for each copy
increased linearly (starting at about 2.0 seconds on my PII-233) by
approximately 0.5 seconds per 10000 records. That means that by the time
you have 5 million records, it takes more than 4 minutes to insert 10000
records. Yes, I checked: there were NO indices defined on the table.

Anton