Thread: make bulk deletes faster?

make bulk deletes faster?

From: James Klo
I have the following table:

CREATE TABLE timeblock
(
  timeblockid int8 NOT NULL,
  starttime timestamp,
  endtime timestamp,
  duration int4,
  blocktypeid int8,
  domain_id int8,
  create_date timestamp,
  revision_date timestamp,
  scheduleid int8,
  CONSTRAINT timeblock_pkey PRIMARY KEY (timeblockid),
  CONSTRAINT fk25629e03312570b FOREIGN KEY (blocktypeid)
      REFERENCES blocktype (blocktypeid) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT fk25629e09be84177 FOREIGN KEY (domain_id)
      REFERENCES wa_common_domain (domain_id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH OIDS;

CREATE INDEX timeblock_blocktype_idx
  ON timeblock
  USING btree
  (blocktypeid);

CREATE INDEX timeblock_date_idx
  ON timeblock
  USING btree
  (starttime, endtime);

CREATE INDEX timeblockepoch_idx
  ON timeblock
  USING btree
  (date_trunc('minute'::text, starttime), (date_part('epoch'::text,
date_trunc('minute'::text, starttime)) * 1000::double precision),
date_trunc('minute'::text, endtime), (date_part('epoch'::text,
date_trunc('minute'::text, endtime)) * 1000::double precision));

CREATE INDEX timeblockhourmin_idx
  ON timeblock
  USING btree
  (date_part('hour'::text, starttime), date_part('minute'::text,
starttime), date_part('hour'::text, endtime), date_part('minute'::text,
endtime));

CREATE INDEX timeblockid_idx
  ON timeblock
  USING btree
  (timeblockid);


There are also indexes on the primary keys of wa_common_domain and blocktype.

explain analyze delete from timeblock where timeblockid = 666666

Index Scan using timeblockid_idx on timeblock  (cost=0.00..5.28 rows=1
width=6) (actual time=0.022..0.022 rows=0 loops=1)
  Index Cond: (timeblockid = 666666)
Total runtime: 0.069 ms


I need to routinely move data from the timeblock table to an archive
table with the same schema named timeblock_archive.  I really need this
to happen as quickly as possible, as the archive operation appears to
really tax the db server...

I'd like some suggestions on how to get the deletes to happen faster:
while deleting individual rows appears to be extremely fast, when I go to
delete lots of rows the operation takes an extremely long time to
complete (5,000 rows takes about 3 minutes; 1,000,000 rows takes close
to 4 hours or more depending upon server load; wall time, btw).

I've tried several different approaches to the delete and I can't
seem to make it much faster... anyone have any ideas?

The approaches I've taken both use a temp table to define the set that
needs to be deleted.
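
Roughly, the overall flow is something like this (the date range below is
just illustrative; the real selection criteria are more involved):

create temp table timeblock_tmp as
  select timeblockid from timeblock
  where starttime >= '2005-12-17' and starttime < '2005-12-18';

insert into timeblock_archive
  select t.* from timeblock t
  where t.timeblockid in (select timeblockid from timeblock_tmp);

-- ...followed by one of the delete attempts below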

Here's what I've tried:

Attempt 1:
----------
delete from timeblock where timeblockid in (select timeblockid from
timeblock_tmp)


Attempt 2:
----------
num_to_delete := (select count(1) from tmp_timeblock);
RAISE DEBUG 'archiveDailyData(%): need to delete from timeblock [% rows]',
   timestart, num_to_delete;
cur_offset := 0;
while cur_offset < num_to_delete loop
   delete from timeblock where timeblockid in
      (select timeblockid from tmp_timeblock limit 100 offset cur_offset);
   get diagnostics num_affected = ROW_COUNT;
   RAISE DEBUG 'archiveDailyData(%): delete from timeblock [% rows] cur_offset = %',
      timestart, num_affected, cur_offset;
   cur_offset := cur_offset + 100;
end loop;


Attempt 3:
----------
   num_to_delete := (select count(1) from tmp_timeblock);
   cur_offset := num_to_delete;
   RAISE DEBUG 'archiveDailyData(%): need to delete from timeblock [% rows]',
      timestart, num_to_delete;
   open del_cursor for select timeblockid from tmp_timeblock;
   loop
      fetch del_cursor into del_pkey;
      if not found then
         exit;
      else
         delete from timeblock where timeblockid = del_pkey;
         get diagnostics num_affected = ROW_COUNT;
         cur_offset := cur_offset - num_affected;
         if cur_offset % 1000 = 0 then
            RAISE DEBUG 'archiveDailyData(%): delete from timeblock [% left]',
               timestart, cur_offset;
         end if;
      end if;
   end loop;
   close del_cursor;


I've considered using min(starttime) and max(starttime) from the temp
table and doing:

delete from timeblock where starttime between min and max;

however, I'm concerned about leaving orphan data, deleting too much
data, running into foreign key conflicts, etc.

Dropping the indexes on timeblock could be bad, as this table receives
a high volume of reads, inserts, and updates.

Anyone have any suggestions?

Thanks,

Jim K

Re: make bulk deletes faster?

From: Michael Fuhr
On Sat, Dec 17, 2005 at 09:10:40PM -0800, James Klo wrote:
> I'd like some suggestions on how to get the deletes to happen faster:
> while deleting individual rows appears to be extremely fast, when I go to
> delete lots of rows the operation takes an extremely long time to
> complete (5,000 rows takes about 3 minutes; 1,000,000 rows takes close
> to 4 hours or more depending upon server load; wall time, btw).

Those times do seem excessive -- do any other tables have foreign
key references to the table you're deleting from?  If so, do those
tables have indexes on the referring columns?  Does this table or
any referring table have triggers?  Also, are you regularly vacuuming
and analyzing your tables?  Have you examined pg_locks to see if
an unacquired lock might be slowing things down?
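
For the pg_locks check, a query along these lines (off the top of my
head, so adjust as needed) will show lock requests that haven't been
granted yet, along with what the corresponding backends are doing:

SELECT l.pid, l.locktype, l.mode, l.relation::regclass AS relation,
       a.current_query
FROM pg_locks l JOIN pg_stat_activity a ON a.procpid = l.pid
WHERE NOT l.granted;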

--
Michael Fuhr

Re: make bulk deletes faster?

From: James Klo
In article <20051219023616.GA89670@winnie.fuhr.org>,
 mike@fuhr.org (Michael Fuhr) wrote:

> On Sat, Dec 17, 2005 at 09:10:40PM -0800, James Klo wrote:
> > I'd like some suggestions on how to get the deletes to happen faster:
> > while deleting individual rows appears to be extremely fast, when I go to
> > delete lots of rows the operation takes an extremely long time to
> > complete (5,000 rows takes about 3 minutes; 1,000,000 rows takes close
> > to 4 hours or more depending upon server load; wall time, btw).
>
> Those times do seem excessive -- do any other tables have foreign
> key references to the table you're deleting from?  If so, do those
> tables have indexes on the referring columns?  Does this table or
> any referring table have triggers?  Also, are you regularly vacuuming
> and analyzing your tables?  Have you examined pg_locks to see if
> an unacquired lock might be slowing things down?

As the table was originally created using Hibernate, yes, there are
several foreign key references; however, I've already added indexes on
the referring columns of those tables.  There are no triggers.  We were
running pg_autovacuum but found that it wasn't completing; I believe we
disabled it and are now running a vacuum from cron every 4 hours.  My
archiving method also runs ANALYZE, as I figure that after mass deletes
it would probably keep query speeds from degrading.

I've looked at pg_locks, but I'm not sure I quite understand how to use
it to determine whether there are unacquired locks.  I do know that we
occasionally get warnings from C3P0 stating that it has detected a
deadlock and is allocating emergency threads.

BTW, in case I didn't mention it: we are using PG 8.1 on Red Hat
Enterprise Linux, 4 GB RAM, 4 dual-core CPUs, and I think it's RAID 5
(it looks like what I would consider typical Linux partitioning: /,
/tmp, /usr, /var, /boot, /home).  After trolling the archives and doing
a bit of sleuthing on the DB, I'm led to believe that this is more or
less a default install of PG 8.1.  As I'm relatively new to PG, I'm not
sure how it should be configured for our setup.  I would suspect that
this could affect the speed of deletes (and of queries as well).

Thanks for any help you can provide.

Re: make bulk deletes faster?

From: Mitch Skinner
On Sat, 2005-12-17 at 21:10 -0800, James Klo wrote:
> I need to routinely move data from the timeblock table to an archive
> table with the same schema named timeblock_archive.  I really need this
> to happen as quickly as possible, as the archive operation appears to
> really tax the db server...

Have you considered partitioning?

http://www.postgresql.org/docs/8.1/interactive/ddl-partitioning.html

If you can partition your timeblock table so that you archive an entire
partition at a time, then you can delete the archived rows by just
dropping (or truncating) that partition.  AFAIK there's no way to
"re-parent" a partition (e.g., from the timeblock table to the
timeblock_archive table).

If your app is particularly cooperative you might be able to use
partitioning to avoid moving data around entirely.  If table accesses
are always qualified by something you can use as a partitioning key,
then partitioning can give you the speed benefits of a small table
without the effort of keeping it cleared out.

Another good read, if you haven't yet, is
http://powerpostgresql.com/Downloads/annotated_conf_80.html
especially the "Memory", "Checkpoints", and maybe "WAL options"
sections.  If you're doing large deletes then you may need to increase
your free space map settings--if a VACUUM VERBOSE finishes by saying
that you need more FSM pages, then the table may have gotten bloated
over time (which can be fixed with a configuration change and a VACUUM
FULL, though this will lock everything else out of the table while it's
running).
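
For reference, the relevant postgresql.conf settings look like this; the
values below are only placeholders, so size them from the numbers that
VACUUM VERBOSE prints at the end of its output:

max_fsm_pages = 200000        # 8.1 default is 20000
max_fsm_relations = 1000      # 8.1 default is 1000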

Mitch


Re: make bulk deletes faster?

From: Ang Chin Han
On 12/18/05, James Klo <jklo@arkitec.com> wrote:
> explain analyze delete from timeblock where timeblockid = 666666
>
> Index Scan using timeblockid_idx on timeblock  (cost=0.00..5.28 rows=1
> width=6) (actual time=0.022..0.022 rows=0 loops=1)
>   Index Cond: (timeblockid = 666666)
> Total runtime: 0.069 ms
... snip ...
> Here's what I've tried:
>
> Attempt 1:
> ----------
> delete from timeblock where timeblockid in (select timeblockid from
> timeblock_tmp)

The DELETE in Attempt 1 contains a join, so if this is the way you're
mainly specifying which rows to delete, you'll have to take into
account how efficient the join of timeblock and timeblock_tmp is. What
does

EXPLAIN ANALYZE select * from timeblock where timeblockid in (select
timeblockid from timeblock_tmp)

or

EXPLAIN ANALYZE delete from timeblock where timeblockid in (select
timeblockid from timeblock_tmp)

say?

You *should* at least get a "Hash IN Join" for the outer loop, and
just one Seq Scan on timeblock_tmp.  Otherwise, consider increasing
your sort_mem (PostgreSQL 7.x) or work_mem (PostgreSQL 8.x) settings.
Another alternative is to reduce the number of rows being archived in
one go so that they fit within the sort_mem or work_mem that allows the
"Hash IN Join" plan.  See
http://www.postgresql.org/docs/8.1/static/runtime-config-resource.html#GUC-WORK-MEM
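
For example, you could try a larger value for just the archiving session
(in 8.1 work_mem is specified in KB, so this is 64MB; pick something
that fits your RAM):

SET work_mem = 65536;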

On the other hand, PostgreSQL 8.1's partitioning sounds like a better
long term solution that you might want to look into.

Re: make bulk deletes faster?

From: James Klo
Mitch Skinner wrote:

> Have you considered partitioning?
>
> http://www.postgresql.org/docs/8.1/interactive/ddl-partitioning.html
>
> If you can partition your timeblock table so that you archive an entire
> partition at a time, then you can delete the archived rows by just
> dropping (or truncating) that partition.  AFAIK there's no way to
> "re-parent" a partition (e.g., from the timeblock table to the
> timeblock_archive table).
>
> If your app is particularly cooperative you might be able to use
> partitioning to avoid moving data around entirely.  If table accesses
> are always qualified by something you can use as a partitioning key,
> then partitioning can give you the speed benefits of a small table
> without the effort of keeping it cleared out.

Yes, I've considered partitioning as a long term change. I was thinking
about this for other reasons - mainly performance.  If I go the
partitioning route, would I need to even perform archival?

The larger problem that I need to solve is really twofold:

1. Need to keep reads on timeblocks that are from the current day
through the following seven days very fast, especially current day reads.

2. Need to be able to maintain the timeblocks for reporting purposes,
for at least a year (potentially more).  This could probably be better
handled by performing aggregate analysis, but this isn't on my current radar.

> Another good read, if you haven't yet, is
> http://powerpostgresql.com/Downloads/annotated_conf_80.html
> especially the "Memory", "Checkpoints", and maybe "WAL options"
> sections.  If you're doing large deletes then you may need to increase
> your free space map settings--if a VACUUM VERBOSE finishes by saying
> that you need more FSM pages, then the table may have gotten bloated
> over time (which can be fixed with a configuration change and a VACUUM
> FULL, though this will lock everything else out of the table while it's
> running).
>

Thanks, I will look into this as well.

Re: make bulk deletes faster?

From: Jim C. Nasby
On Mon, Dec 19, 2005 at 11:10:50AM -0800, James Klo wrote:
> Yes, I've considered partitioning as a long term change. I was thinking
> about this for other reasons - mainly performance.  If I go the
> partitioning route, would I need to even perform archival?

No. The idea is that you have your table split up into date ranges
(perhaps each week gets its own table), i.e. table_2005w01,
table_2005w02, etc. You can do this with either inheritance or
individual tables and a UNION ALL view. In your case, inheritance is
probably the better way to go.

Now, if you have everything broken down by weeks and you typically only
need to access 7 days worth of data, then generally you will only be
reading from two tables, so those two tables should stay in memory, and
the indexes on them will be smaller. If desired, you can also play
tricks on the older tables, such as VACUUM FULL or CLUSTER, to further
reduce space usage and improve performance.
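
A rough sketch of the inheritance approach (the names and date ranges
are just illustrative, insert routing is up to your app or a rule, and
each child needs its own indexes and FK constraints since those aren't
inherited):

CREATE TABLE timeblock_2005w51 (
    CHECK (starttime >= '2005-12-19' AND starttime < '2005-12-26')
) INHERITS (timeblock);

CREATE TABLE timeblock_2005w52 (
    CHECK (starttime >= '2005-12-26' AND starttime < '2006-01-02')
) INHERITS (timeblock);

With constraint_exclusion = on, queries that filter on starttime only
scan the matching children, and retiring a week's worth of data becomes
DROP TABLE timeblock_2005w51 (or TRUNCATE it) instead of a bulk delete.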

> The larger problem that I need to solve is really twofold:
>
> 1. Need to keep reads on timeblocks that are from the current day
> through the following seven days very fast, especially current day reads.
>
> 2. Need to be able to maintain the timeblocks for reporting purposes,
> for at least a year (potentially more).  This could probably be better
> handled by performing aggregate analysis, but this isn't on my current radar.

I've written an RRD-like implementation in SQL that might interest you;
it's at http://rrs.decibel.org (though the svn web access appears to be
down right now...)
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461