Thread: Please help - tks

Please help - tks

From
Mr OCP
Date:
Hi Folks

We have about 300,000 records to update, I have
created a file with the ids of the records to be
updated, since the number of records to be updated is
very large, the update statement takes forever and
server slows down dramatically, therefore I want to do
one at a time.

Do you guys have any script which may update records
one by one using the file that lists these records. I
am running postgresql sever under Unix, any kind of
script would do, whether its a perl, bash or sql
whatever script

Thanks for your help
Mike

http://briefcase.yahoo.com.au - Yahoo! Briefcase
- Manage your files online.

Re: Please help - tks

From
Brett Schwarz
Date:
A quick and dirty solution would be something like (untested):

while read line; do
    psql dbname -c "update table  ... use 'line' as your id and data"
    sleep 5
done < filename

where "filename" is the file with the ids in it. "dbname" is the name of your DB (of course!)

I am not sure what your file looks like, but you can do something like this. It could be easy, or harder, depending on
whatyour file has. 

adjust the sleep to your own specs.

--brett


P.S.

You could test this out by doing:

while read line; do
    psql dbname -c "select * from table where id=$line"
    sleep 5
done < filename



On Thu, 18 Oct 2001 16:35:01 +1000 (EST)
Mr OCP <mr_ocp@yahoo.com> wrote:

> Hi Folks
>
> We have about 300,000 records to update, I have
> created a file with the ids of the records to be
> updated, since the number of records to be updated is
> very large, the update statement takes forever and
> server slows down dramatically, therefore I want to do
> one at a time.
>
> Do you guys have any script which may update records
> one by one using the file that lists these records. I
> am running postgresql sever under Unix, any kind of
> script would do, whether its a perl, bash or sql
> whatever script
>
> Thanks for your help
> Mike
>
> http://briefcase.yahoo.com.au - Yahoo! Briefcase
> - Manage your files online.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com


Re: Please help - tks

From
Joel Burton
Date:
On Thu, 18 Oct 2001, [iso-8859-1] Mr OCP wrote:

> Hi Folks
>
> We have about 300,000 records to update, I have
> created a file with the ids of the records to be
> updated, since the number of records to be updated is
> very large, the update statement takes forever and
> server slows down dramatically, therefore I want to do
> one at a time.
>
> Do you guys have any script which may update records
> one by one using the file that lists these records. I
> am running postgresql sever under Unix, any kind of
> script would do, whether its a perl, bash or sql
> whatever script
>
> Thanks for your help
> Mike

On the really Q&D, can you split (man split) your ID
file into a few pieces, PG COPY this to temp tables (say 10 or 20), and
use these smaller tables for your updates?

--

Joel BURTON  |  joel@joelburton.com  |  joelburton.com  |  aim: wjoelburton
Independent Knowledge Management Consultant


Re: Please help - tks

From
Peter Eisentraut
Date:
Mr OCP writes:

> We have about 300,000 records to update, I have created a file with
> the ids of the records to be updated, since the number of records to
> be updated is very large, the update statement takes forever and
> server slows down dramatically, therefore I want to do one at a time.

That's going to be *much* slower yet.  What's your point?

--
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter


Re: Please help - tks

From
Brian McCane
Date:
Much too quick and dirty.  With 300,000 updates this will take 1.5million
seconds, roughly 200 days.  Even at a rate of 1 a second it will take
nearly four days.  All this aside, I would probably use a perl script and
do batches of about 10-20, with a sleep.  You can try something like the
following and feed it using standard input.  Please NOTE, I didn't test
this, and even if I did, you shouldn't trust me :).

- brian

Wm. Brian McCane                    | Life is full of doors that won't open
Search http://recall.maxbaud.net/   | when you knock, equally spaced amid those
Usenet http://freenews.maxbaud.net/ | that open when you don't want them to.
Auction http://www.sellit-here.com/ | - Roger Zelazny "Blood of Amber"

#! /usr/bin/perl

use strict ;

my @updates ;

while (my $line = <STDIN>)
{
  my $cmd = "UPDATE table SET ......." ;    # Use the line we just read
  push(@updates, $cmd) ;
  if (@updates == 10)
  {
    do_update() ;
    sleep 5 ;
    @updates = () ;                # Redundant
  }
}
do_update if (@updates) ;

sub do_update
{
  open PSQL, "| psql dbname" ;
  print PSQL "begin transaction ;\n" ;
  while (my $cmd = shift @updates)
  {
    print PSQL $cmd . "\n" ;
  }
  print PSQL "commit ;\n" ;
  close PSQL ;
}

__END__


On Thu, 18 Oct 2001, Brett Schwarz wrote:

> A quick and dirty solution would be something like (untested):
>
> while read line; do
>     psql dbname -c "update table  ... use 'line' as your id and data"
>     sleep 5
> done < filename
>
> where "filename" is the file with the ids in it. "dbname" is the name of your DB (of course!)
>
> I am not sure what your file looks like, but you can do something like this. It could be easy, or harder, depending
onwhat your file has. 
>
> adjust the sleep to your own specs.
>
> --brett
>
>
> P.S.
>
> You could test this out by doing:
>
> while read line; do
>     psql dbname -c "select * from table where id=$line"
>     sleep 5
> done < filename
>
>
>
> On Thu, 18 Oct 2001 16:35:01 +1000 (EST)
> Mr OCP <mr_ocp@yahoo.com> wrote:
>
> > Hi Folks
> >
> > We have about 300,000 records to update, I have
> > created a file with the ids of the records to be
> > updated, since the number of records to be updated is
> > very large, the update statement takes forever and
> > server slows down dramatically, therefore I want to do
> > one at a time.
> >
> > Do you guys have any script which may update records
> > one by one using the file that lists these records. I
> > am running postgresql sever under Unix, any kind of
> > script would do, whether its a perl, bash or sql
> > whatever script
> >
> > Thanks for your help
> > Mike
> >
> > http://briefcase.yahoo.com.au - Yahoo! Briefcase
> > - Manage your files online.
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 5: Have you checked our extensive FAQ?
> >
> > http://www.postgresql.org/users-lounge/docs/faq.html
>
> _________________________________________________________
> Do You Yahoo!?
> Get your free @yahoo.com address at http://mail.yahoo.com
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly
>


Re: Please help - tks

From
Brett Schwarz
Date:
????

This is why I said this:
> > adjust the sleep to your own specs.

If he wants to "batch" them, then all he needs to do is add a condition/count:

--------------------------------
#!/bin/sh

cnt=1
while read line; do

    # "batches" of 20 or whatever you *want*
    if [ "X$cnt" == "X20" ]; then
        sleep 5   # whatever you *want*
        cnt=0
    fi

    psql dbname -c "UPDATE ... $line ..."
    let cnt=$cnt+1

done < filename

---------------------------------

I will leave it up to the reader to modify this to his/her *specific* need...

--brett




On Thu, 18 Oct 2001 20:07:32 -0500 (CDT)
Brian McCane <bmccane@mccons.net> wrote:

> Much too quick and dirty.  With 300,000 updates this will take 1.5million
> seconds, roughly 200 days.  Even at a rate of 1 a second it will take
> nearly four days.  All this aside, I would probably use a perl script and
> do batches of about 10-20, with a sleep.  You can try something like the
> following and feed it using standard input.  Please NOTE, I didn't test
> this, and even if I did, you shouldn't trust me :).
>
> - brian
>
> Wm. Brian McCane                    | Life is full of doors that won't open
> Search http://recall.maxbaud.net/   | when you knock, equally spaced amid those
> Usenet http://freenews.maxbaud.net/ | that open when you don't want them to.
> Auction http://www.sellit-here.com/ | - Roger Zelazny "Blood of Amber"
>
> #! /usr/bin/perl
>
> use strict ;
>
> my @updates ;
>
> while (my $line = <STDIN>)
> {
>   my $cmd = "UPDATE table SET ......." ;    # Use the line we just read
>   push(@updates, $cmd) ;
>   if (@updates == 10)
>   {
>     do_update() ;
>     sleep 5 ;
>     @updates = () ;                # Redundant
>   }
> }
> do_update if (@updates) ;
>
> sub do_update
> {
>   open PSQL, "| psql dbname" ;
>   print PSQL "begin transaction ;\n" ;
>   while (my $cmd = shift @updates)
>   {
>     print PSQL $cmd . "\n" ;
>   }
>   print PSQL "commit ;\n" ;
>   close PSQL ;
> }
>
> __END__
>
>
> On Thu, 18 Oct 2001, Brett Schwarz wrote:
>
> > A quick and dirty solution would be something like (untested):
> >
> > while read line; do
> >     psql dbname -c "update table  ... use 'line' as your id and data"
> >     sleep 5
> > done < filename
> >
> > where "filename" is the file with the ids in it. "dbname" is the name of your DB (of course!)
> >
> > I am not sure what your file looks like, but you can do something like this. It could be easy, or harder, depending
onwhat your file has. 
> >
> > adjust the sleep to your own specs.
> >
> > --brett
> >
> >
> > P.S.
> >
> > You could test this out by doing:
> >
> > while read line; do
> >     psql dbname -c "select * from table where id=$line"
> >     sleep 5
> > done < filename
> >
> >
> >
> > On Thu, 18 Oct 2001 16:35:01 +1000 (EST)
> > Mr OCP <mr_ocp@yahoo.com> wrote:
> >
> > > Hi Folks
> > >
> > > We have about 300,000 records to update, I have
> > > created a file with the ids of the records to be
> > > updated, since the number of records to be updated is
> > > very large, the update statement takes forever and
> > > server slows down dramatically, therefore I want to do
> > > one at a time.
> > >
> > > Do you guys have any script which may update records
> > > one by one using the file that lists these records. I
> > > am running postgresql sever under Unix, any kind of
> > > script would do, whether its a perl, bash or sql
> > > whatever script
> > >
> > > Thanks for your help
> > > Mike
> > >
> > > http://briefcase.yahoo.com.au - Yahoo! Briefcase
> > > - Manage your files online.
> > >
> > > ---------------------------(end of broadcast)---------------------------
> > > TIP 5: Have you checked our extensive FAQ?
> > >
> > > http://www.postgresql.org/users-lounge/docs/faq.html
> >
> > _________________________________________________________
> > Do You Yahoo!?
> > Get your free @yahoo.com address at http://mail.yahoo.com
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 3: if posting/reading through Usenet, please send an appropriate
> > subscribe-nomail command to majordomo@postgresql.org so that your
> > message can get through to the mailing list cleanly
> >
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com


Re: Please help - tks

From
Brian McCane
Date:
Brett,

  It appears that I may have upset you somehow.  I didn't intend to.  I
chose to write the PERL script to place them in a transaction to help
improve performance a little bit on each pass.  By building and tearing
down a connection to the postmaster everytime you send an update, you will
be putting a much heavier load on the server than is necessary.  If I was
using the script myself, I would make the connection to the postmaster
once at the top of the script, and then re-use it with a transaction block
for each group of 10/20.  Then shut it down at the end.  This way I do not
suffer any additional protocol overhead each time I want to use it.

- brian


On Thu, 18 Oct 2001, Brett Schwarz wrote:

> ????
>
> This is why I said this:
> > > adjust the sleep to your own specs.
>
> If he wants to "batch" them, then all he needs to do is add a condition/count:
>
> --------------------------------
> #!/bin/sh
>
> cnt=1
> while read line; do
>
>     # "batches" of 20 or whatever you *want*
>     if [ "X$cnt" == "X20" ]; then
>         sleep 5   # whatever you *want*
>         cnt=0
>     fi
>
>     psql dbname -c "UPDATE ... $line ..."
>     let cnt=$cnt+1
>
> done < filename
>
> ---------------------------------
>
> I will leave it up to the reader to modify this to his/her *specific* need...
>
> --brett
>
>
>
>
> On Thu, 18 Oct 2001 20:07:32 -0500 (CDT)
> Brian McCane <bmccane@mccons.net> wrote:
>
> > Much too quick and dirty.  With 300,000 updates this will take 1.5million
> > seconds, roughly 200 days.  Even at a rate of 1 a second it will take
> > nearly four days.  All this aside, I would probably use a perl script and
> > do batches of about 10-20, with a sleep.  You can try something like the
> > following and feed it using standard input.  Please NOTE, I didn't test
> > this, and even if I did, you shouldn't trust me :).
> >
> > - brian
> >
> > Wm. Brian McCane                    | Life is full of doors that won't open
> > Search http://recall.maxbaud.net/   | when you knock, equally spaced amid those
> > Usenet http://freenews.maxbaud.net/ | that open when you don't want them to.
> > Auction http://www.sellit-here.com/ | - Roger Zelazny "Blood of Amber"
> >
> > #! /usr/bin/perl
> >
> > use strict ;
> >
> > my @updates ;
> >
> > while (my $line = <STDIN>)
> > {
> >   my $cmd = "UPDATE table SET ......." ;    # Use the line we just read
> >   push(@updates, $cmd) ;
> >   if (@updates == 10)
> >   {
> >     do_update() ;
> >     sleep 5 ;
> >     @updates = () ;                # Redundant
> >   }
> > }
> > do_update if (@updates) ;
> >
> > sub do_update
> > {
> >   open PSQL, "| psql dbname" ;
> >   print PSQL "begin transaction ;\n" ;
> >   while (my $cmd = shift @updates)
> >   {
> >     print PSQL $cmd . "\n" ;
> >   }
> >   print PSQL "commit ;\n" ;
> >   close PSQL ;
> > }
> >
> > __END__
> >
> >
> > On Thu, 18 Oct 2001, Brett Schwarz wrote:
> >
> > > A quick and dirty solution would be something like (untested):
> > >
> > > while read line; do
> > >     psql dbname -c "update table  ... use 'line' as your id and data"
> > >     sleep 5
> > > done < filename
> > >
> > > where "filename" is the file with the ids in it. "dbname" is the name of your DB (of course!)
> > >
> > > I am not sure what your file looks like, but you can do something like this. It could be easy, or harder,
dependingon what your file has. 
> > >
> > > adjust the sleep to your own specs.
> > >
> > > --brett
> > >
> > >
> > > P.S.
> > >
> > > You could test this out by doing:
> > >
> > > while read line; do
> > >     psql dbname -c "select * from table where id=$line"
> > >     sleep 5
> > > done < filename
> > >
> > >
> > >
> > > On Thu, 18 Oct 2001 16:35:01 +1000 (EST)
> > > Mr OCP <mr_ocp@yahoo.com> wrote:
> > >
> > > > Hi Folks
> > > >
> > > > We have about 300,000 records to update, I have
> > > > created a file with the ids of the records to be
> > > > updated, since the number of records to be updated is
> > > > very large, the update statement takes forever and
> > > > server slows down dramatically, therefore I want to do
> > > > one at a time.
> > > >
> > > > Do you guys have any script which may update records
> > > > one by one using the file that lists these records. I
> > > > am running postgresql sever under Unix, any kind of
> > > > script would do, whether its a perl, bash or sql
> > > > whatever script
> > > >
> > > > Thanks for your help
> > > > Mike
> > > >
> > > > http://briefcase.yahoo.com.au - Yahoo! Briefcase
> > > > - Manage your files online.
> > > >
> > > > ---------------------------(end of broadcast)---------------------------
> > > > TIP 5: Have you checked our extensive FAQ?
> > > >
> > > > http://www.postgresql.org/users-lounge/docs/faq.html
> > >
> > > _________________________________________________________
> > > Do You Yahoo!?
> > > Get your free @yahoo.com address at http://mail.yahoo.com
> > >
> > >
> > > ---------------------------(end of broadcast)---------------------------
> > > TIP 3: if posting/reading through Usenet, please send an appropriate
> > > subscribe-nomail command to majordomo@postgresql.org so that your
> > > message can get through to the mailing list cleanly
> > >
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 6: Have you searched our list archives?
> >
> > http://archives.postgresql.org
>
> _________________________________________________________
> Do You Yahoo!?
> Get your free @yahoo.com address at http://mail.yahoo.com
>


Re: Please help - tks

From
Brett Schwarz
Date:
Sorry, my bad. I thought your whole point was in regards to the sleep command (which I really didn't understand why).
However,your point about stressing the server is a very valid point, and I did not catch that in your earlier post. 

My apologies for the misunderstanding...

--brett

p.s. it's been a bad day...


On Fri, 19 Oct 2001 09:42:52 -0500 (CDT)
Brian McCane <bmccane@mccons.net> wrote:

> Brett,
>
>   It appears that I may have upset you somehow.  I didn't intend to.  I
> chose to write the PERL script to place them in a transaction to help
> improve performance a little bit on each pass.  By building and tearing
> down a connection to the postmaster everytime you send an update, you will
> be putting a much heavier load on the server than is necessary.  If I was
> using the script myself, I would make the connection to the postmaster
> once at the top of the script, and then re-use it with a transaction block
> for each group of 10/20.  Then shut it down at the end.  This way I do not
> suffer any additional protocol overhead each time I want to use it.
>
> - brian
>
>
> On Thu, 18 Oct 2001, Brett Schwarz wrote:
>
> > ????
> >
> > This is why I said this:
> > > > adjust the sleep to your own specs.
> >
> > If he wants to "batch" them, then all he needs to do is add a condition/count:
> >
> > --------------------------------
> > #!/bin/sh
> >
> > cnt=1
> > while read line; do
> >
> >     # "batches" of 20 or whatever you *want*
> >     if [ "X$cnt" == "X20" ]; then
> >         sleep 5   # whatever you *want*
> >         cnt=0
> >     fi
> >
> >     psql dbname -c "UPDATE ... $line ..."
> >     let cnt=$cnt+1
> >
> > done < filename
> >
> > ---------------------------------
> >
> > I will leave it up to the reader to modify this to his/her *specific* need...
> >
> > --brett
> >
> >
> >
> >
> > On Thu, 18 Oct 2001 20:07:32 -0500 (CDT)
> > Brian McCane <bmccane@mccons.net> wrote:
> >
> > > Much too quick and dirty.  With 300,000 updates this will take 1.5million
> > > seconds, roughly 200 days.  Even at a rate of 1 a second it will take
> > > nearly four days.  All this aside, I would probably use a perl script and
> > > do batches of about 10-20, with a sleep.  You can try something like the
> > > following and feed it using standard input.  Please NOTE, I didn't test
> > > this, and even if I did, you shouldn't trust me :).
> > >
> > > - brian
> > >
> > > Wm. Brian McCane                    | Life is full of doors that won't open
> > > Search http://recall.maxbaud.net/   | when you knock, equally spaced amid those
> > > Usenet http://freenews.maxbaud.net/ | that open when you don't want them to.
> > > Auction http://www.sellit-here.com/ | - Roger Zelazny "Blood of Amber"
> > >
> > > #! /usr/bin/perl
> > >
> > > use strict ;
> > >
> > > my @updates ;
> > >
> > > while (my $line = <STDIN>)
> > > {
> > >   my $cmd = "UPDATE table SET ......." ;    # Use the line we just read
> > >   push(@updates, $cmd) ;
> > >   if (@updates == 10)
> > >   {
> > >     do_update() ;
> > >     sleep 5 ;
> > >     @updates = () ;                # Redundant
> > >   }
> > > }
> > > do_update if (@updates) ;
> > >
> > > sub do_update
> > > {
> > >   open PSQL, "| psql dbname" ;
> > >   print PSQL "begin transaction ;\n" ;
> > >   while (my $cmd = shift @updates)
> > >   {
> > >     print PSQL $cmd . "\n" ;
> > >   }
> > >   print PSQL "commit ;\n" ;
> > >   close PSQL ;
> > > }
> > >
> > > __END__
> > >
> > >
> > > On Thu, 18 Oct 2001, Brett Schwarz wrote:
> > >
> > > > A quick and dirty solution would be something like (untested):
> > > >
> > > > while read line; do
> > > >     psql dbname -c "update table  ... use 'line' as your id and data"
> > > >     sleep 5
> > > > done < filename
> > > >
> > > > where "filename" is the file with the ids in it. "dbname" is the name of your DB (of course!)
> > > >
> > > > I am not sure what your file looks like, but you can do something like this. It could be easy, or harder,
dependingon what your file has. 
> > > >
> > > > adjust the sleep to your own specs.
> > > >
> > > > --brett
> > > >
> > > >
> > > > P.S.
> > > >
> > > > You could test this out by doing:
> > > >
> > > > while read line; do
> > > >     psql dbname -c "select * from table where id=$line"
> > > >     sleep 5
> > > > done < filename
> > > >
> > > >
> > > >
> > > > On Thu, 18 Oct 2001 16:35:01 +1000 (EST)
> > > > Mr OCP <mr_ocp@yahoo.com> wrote:
> > > >
> > > > > Hi Folks
> > > > >
> > > > > We have about 300,000 records to update, I have
> > > > > created a file with the ids of the records to be
> > > > > updated, since the number of records to be updated is
> > > > > very large, the update statement takes forever and
> > > > > server slows down dramatically, therefore I want to do
> > > > > one at a time.
> > > > >
> > > > > Do you guys have any script which may update records
> > > > > one by one using the file that lists these records. I
> > > > > am running postgresql sever under Unix, any kind of
> > > > > script would do, whether its a perl, bash or sql
> > > > > whatever script
> > > > >
> > > > > Thanks for your help
> > > > > Mike
> > > > >
> > > > > http://briefcase.yahoo.com.au - Yahoo! Briefcase
> > > > > - Manage your files online.
> > > > >
> > > > > ---------------------------(end of broadcast)---------------------------
> > > > > TIP 5: Have you checked our extensive FAQ?
> > > > >
> > > > > http://www.postgresql.org/users-lounge/docs/faq.html
> > > >
> > > > _________________________________________________________
> > > > Do You Yahoo!?
> > > > Get your free @yahoo.com address at http://mail.yahoo.com
> > > >
> > > >
> > > > ---------------------------(end of broadcast)---------------------------
> > > > TIP 3: if posting/reading through Usenet, please send an appropriate
> > > > subscribe-nomail command to majordomo@postgresql.org so that your
> > > > message can get through to the mailing list cleanly
> > > >
> > >
> > >
> > > ---------------------------(end of broadcast)---------------------------
> > > TIP 6: Have you searched our list archives?
> > >
> > > http://archives.postgresql.org
> >
> > _________________________________________________________
> > Do You Yahoo!?
> > Get your free @yahoo.com address at http://mail.yahoo.com
> >
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com