Re: Very long deletion time on a 200 GB database - Mailing list pgsql-performance

From Reuven M. Lerner
Subject Re: Very long deletion time on a 200 GB database
Date
Msg-id 4F4B39EF.2080004@lerner.co.il
Whole thread Raw
In response to Re: Very long deletion time on a 200 GB database  (Steve Crawford <scrawford@pinpointresearch.com>)
Responses Re: Very long deletion time on a 200 GB database
Re: Very long deletion time on a 200 GB database
List pgsql-performance
Hi, everyone.  I wanted to thank you again for your help on the huge
delete problem that I was experiencing.

After a lot of trial and error, we finally came to the conclusion that
deleting this much data in the time frame that they need, on
underpowered hardware that is shared with an application, with each test
iteration taking 5-9 hours to run (but needing to run in 2-3), is just
not going to happen.  We tried many of the options that people helpfully
suggested here, but none of them gave us the performance that we needed.

(One of the developers kept asking me how it can possibly take so long
to delete 200 GB, when he can delete files of that size in much less
time.  I had to explain to him that deleting rows from a database, is a
far more complicated task, and can't really be compared to deleting a
few files.)

In the end, it was agreed that we could execute the deletes over time,
deleting items in the background, or in parallel with the application's
work.  After all, if the disk is filling up at the rate of 2 GB/day,
then so long as we delete 4 GB/day (which is pretty easy to do), we
should be fine.  Adding RAM or another disk are simply out of the
question, which is really a shame for a database of this size.

I should add that it was interesting/amusing to see the difference
between the Unix and Windows philosophies.  Each time I would update my
pl/pgsql functions, the Windows guys would wrap it into a string, inside
of a .NET program, which then needed to be compiled, installed, and run.
  (Adding enormous overhead to our already long testing procedure.)  I
finally managed to show them that we could get equivalent functionality,
with way less overhead, by just running psql -f FILENAME.  This version
doesn't have fancy GUI output, but it works just fine...

I always tell people that PostgreSQL is not just a great database, but a
fantastic, helpful community.  Thanks to everyone for their suggestions
and advice.

Reuven

pgsql-performance by date:

Previous
From: Jeff Janes
Date:
Subject: Re: PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?
Next
From: Wales Wang
Date:
Subject: 回复: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?