Thread: Performance check using COPY commands

Performance check using COPY commands

From
sid tow
Date:
Hi All,
 
     I have a problem in updating a table using copy commands. To brief you about the table, the table has 6 columns in which it has one column for a sequence number which increments by one on each entry. There is a constraint check for one of the columns. I am copying data in all the 6 columns (including the sequence number of my own) in the table, since the version of postgres i am using is 7.2 i cant copy only those columns i want to instead I should copy all the columns present in the table. Now the problem is that the copy is taking a lot of time to bulk copy the data, and I have no idea why is that so. Is it because of the sequence number or some thing else. Since copying data into other table which has relatively more columns is taking much much lesser time may be in the ratio 1:15 times i guess. Can some body suggest me what is happening there?
 
Regards
Sid


Do you Yahoo!?
Yahoo! Search presents - Jib Jab's 'Second Term'

Re: Performance check using COPY commands

From
Nageshwar Rao
Date:

I am trying to load some 5000 records now. It is still running for past 15 minutes, but when I check the table still I do not see records .And there is no status as how many records are inserted. Just left it like that .Also there are couple of issues. I have timestamp columns. In the excel sheet there blanks .I have put null, but it is saying mismatch data type.Any idea how to insert null with copy command for timestamp and date.

 

-----Original Message-----
From: sid tow [mailto:siddy_tow@yahoo.com]
Sent: Tuesday, February 22, 2005 11:35 AM
To: psql mailing list
Subject: [GENERAL] Performance check using COPY commands

 

Hi All,

 

     I have a problem in updating a table using copy commands. To brief you about the table, the table has 6 columns in which it has one column for a sequence number which increments by one on each entry. There is a constraint check for one of the columns. I am copying data in all the 6 columns (including the sequence number of my own) in the table, since the version of postgres i am using is 7.2 i cant copy only those columns i want to instead I should copy all the columns present in the table. Now the problem is that the copy is taking a lot of time to bulk copy the data, and I have no idea why is that so. Is it because of the sequence number or some thing else. Since copying data into other table which has relatively more columns is taking much much lesser time may be in the ratio 1:15 times i guess. Can some body suggest me what is happening there?

 

Regards

Sid


Do you Yahoo!?
Yahoo! Search presents - Jib Jab's 'Second Term'

Re: Performance check using COPY commands

From
Richard Huxton
Date:
Nageshwar Rao wrote:
> I am trying to load some 5000 records now. It is still running for past 15
> minutes, but when I check the table still I do not see records .And there is
> no status as how many records are inserted. Just left it like that.

Without knowing more about the table, your configuration settings and
your hardware it's difficult to say much. Certainly this is a very long
time for 5000 rows - I would expect to spend seconds rather than minutes.

You mentioned a constraint check for one of the other columns. Is it a
foreign-key check and do you have indexes on both ends of the check?

 > Also
> there are couple of issues. I have timestamp columns. In the excel sheet
> there blanks .I have put null, but it is saying mismatch data type.Any idea
> how to insert null with copy command for timestamp and date.

Well, according to the manuals:
   http://www.postgresql.org/docs/7.2/static/sql-copy.html

There is an option "null string"

   The string that represents a NULL value. The default is "\N"
   (backslash-N). You might prefer an empty string, for example.

   Note: On a copy in, any data item that matches this string will be
   stored as a NULL value, so you should make sure that you use the same
   string as you used on copy out.

So - you should put "\N" in your file where you want NULL or specify
some other suitable marker.

--
   Richard Huxton
   Archonet Ltd

Re: Performance check using COPY commands

From
Nageshwar Rao
Date:
Hi,
Can you please let me know what parameters need to be adjusted in order to
complete this operation? It is still running and checked the table, still
there are no records in that. This table does not have any constraints
.Checked the same operation with 4 records it is fine.Only problem is when I
do bulk insert of 5000 records.
Rgds
Rao

-----Original Message-----
From: Richard Huxton [mailto:dev@archonet.com]
Sent: Tuesday, February 22, 2005 1:47 PM
To: Nageshwar Rao
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Performance check using COPY commands

Nageshwar Rao wrote:
> I am trying to load some 5000 records now. It is still running for past 15
> minutes, but when I check the table still I do not see records .And there
is
> no status as how many records are inserted. Just left it like that.

Without knowing more about the table, your configuration settings and
your hardware it's difficult to say much. Certainly this is a very long
time for 5000 rows - I would expect to spend seconds rather than minutes.

You mentioned a constraint check for one of the other columns. Is it a
foreign-key check and do you have indexes on both ends of the check?

 > Also
> there are couple of issues. I have timestamp columns. In the excel sheet
> there blanks .I have put null, but it is saying mismatch data type.Any
idea
> how to insert null with copy command for timestamp and date.

Well, according to the manuals:
   http://www.postgresql.org/docs/7.2/static/sql-copy.html

There is an option "null string"

   The string that represents a NULL value. The default is "\N"
   (backslash-N). You might prefer an empty string, for example.

   Note: On a copy in, any data item that matches this string will be
   stored as a NULL value, so you should make sure that you use the same
   string as you used on copy out.

So - you should put "\N" in your file where you want NULL or specify
some other suitable marker.

--
   Richard Huxton
   Archonet Ltd

Re: Performance check using COPY commands

From
Richard Huxton
Date:
Nageshwar Rao wrote:
> Hi,
> Can you please let me know what parameters need to be adjusted in order to
> complete this operation? It is still running and checked the table, still
> there are no records in that. This table does not have any constraints

That is odd then. The only thing that should slow you down would be
testing constraints against another table where there is no suitable
index. Otherwise 5000 rows should take 100 times as long as 50 rows.

> .Checked the same operation with 4 records it is fine.Only problem is when I
> do bulk insert of 5000 records.

Something strange is going on here. I can only think of two possibilities:

1. The table is locked, and the copy is waiting for the lock to be
released. Even so, I would expect it to timeout and return an error by now.
2. There is something wrong with the file.

Option #1
You should be able to see what processes are running from the
command-line with:
   ps auxw | grep postgres
One of the processes should be your copy - what does it say?

Also, check the output of "top" and "vmstat 1" - is there any activity?

If you were running a later version, you could check the locks directly,
but I don't think that's possible with 7.2. If it is "SELECT * FROM
pg_locks" will show you any locks held.

Option #2
If the data is not confidential, feel free to send me a copy of the
table definition and import file and I'll take a look at it. Email it to
me directly, since the mailing list doesn't like large attachments.

Failing that, try splitting the import file into sections, you could use
something like:
   split -l 500 input_file output_file
That will split "input_file" into 500-line chunks. You'll need to copy
the header/footer of the copy command onto each chunk too.
--
   Richard Huxton
   Archonet Ltd