Thread: delete is getting hung when there is a huge data in table

delete is getting hung when there is a huge data in table

From
Mitu Verma
Date:
Hi,

I am facing an issue with the deletion of a huge amount of data.
We have a cron script which is used to delete the last 3 months of data from one of the tables.
The table is large (8872597 rows, as you can see from the count below) since it holds the last 3 months of data.

fm_db_Server3=# select count(*) from audittraillogentry ;


  count
---------
8872597
(1 row)

Now the issue is that when this script for the deletion of data is launched, it takes more than 7 days and does
nothing, i.e. not a single row has been deleted.

Then we stopped the script, terminated the database sessions using SELECT pg_terminate_backend(procpid) and ran the
following command

delete from audittraillogentry where intime <= to_timestamp('2015-01-30 23:59:59.999', 'YYYY/MM/DD-HH24:MI:SS.FF3') OR
outtime <= to_timestamp('2015-01-30 23:59:59.999', 'YYYY/MM/DD-HH24:MI:SS.FF3'); 
still this delete operation is not working and not a single row has been deleted from the table.

Now I have the following questions:

1. Does PostgreSQL have some limitations for the deletion of large data?
2. Should I run vacuum after stopping the cron script, perhaps to get a "smaller" table?
3. Could dropping the indexes help here? Not sure.
4. Should I think about partitioning? Is there any limitation while dealing with large data in PostgreSQL?

regards
Mitu



Re: delete is getting hung when there is a huge data in table

From
Tom Lane
Date:
Mitu Verma <mitu.verma@ericsson.com> writes:
> 1. Does PostgreSQL have some limitations for the deletion of large data?

Not as such, but you've not given us any details that would permit
comment.

A reasonably likely bet is that this table is referenced by a foreign key
in some other table, and that other table has no index on the referencing
column.  That would make the FK is-it-ok-to-delete checks very slow.
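
One way to spot such a case, sketched here under the assumption of single-column foreign keys, is to look for FK
constraints whose referencing column has no index leading with it:

-- List foreign keys whose referencing column lacks an index that leads
-- with it; such constraints make cascaded/restricted deletes on the
-- referenced table scan the referencing table for every deleted row.
-- (Assumes single-column foreign keys.)
SELECT c.conrelid::regclass AS referencing_table,
       a.attname            AS referencing_column,
       c.conname            AS constraint_name
FROM   pg_constraint c
JOIN   pg_attribute  a ON a.attrelid = c.conrelid
                      AND a.attnum   = c.conkey[1]
WHERE  c.contype = 'f'
  AND NOT EXISTS (
        SELECT 1
        FROM   pg_index i
        WHERE  i.indrelid  = c.conrelid
          AND  i.indkey[0] = c.conkey[1]);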

            regards, tom lane


Re: delete is getting hung when there is a huge data in table

From
"David G. Johnston"
Date:
On Saturday, May 2, 2015, Mitu Verma <mitu.verma@ericsson.com> wrote:

still this delete operation is not working and not a single row has been deleted from the table.


Because of MVCC, other sessions are not able to see partial deletions. And, as you alluded to knowing, the data itself is not actually removed by a deletion, though a vacuum will eventually allow the same space to be reused, thus deferring the increase in size from future insertions.
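
For concreteness, a minimal sketch of the space-reuse step once the delete does commit (plain VACUUM marks the dead
rows reusable; VACUUM FULL, which rewrites the table and takes an exclusive lock, is only needed if the space must be
returned to the operating system):

-- Plain VACUUM: marks space from deleted rows as reusable and refreshes
-- planner statistics; does not shrink the files on disk.
VACUUM ANALYZE audittraillogentry;

-- VACUUM FULL: rewrites the table and returns the space to the OS,
-- but holds an exclusive lock on the table for the duration.
VACUUM FULL audittraillogentry;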

David J. 

Re: delete is getting hung when there is a huge data in table

From
Uwe Schroeder
Date:
This delete runs in a single transaction. That means the entire transaction
has to complete before you will see anything deleted. Interrupting the
transaction simply rolls it back, so nothing is deleted.
Tom already pointed out the potential foreign key slowdown, another slowdown
may simply be drive speed.

My recommendation: cut the delete in chunks. For example delete the data one
week at a time. That way the transaction is smaller, the dataset to delete is
smaller and it will finish quicker.
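
For example, something along these lines (a sketch only; the column name and cutoff follow the original post, and the
week boundaries are illustrative), with each statement committing on its own so progress is visible:

DELETE FROM audittraillogentry
 WHERE intime >= '2015-01-17' AND intime < '2015-01-24';

DELETE FROM audittraillogentry
 WHERE intime >= '2015-01-24' AND intime < '2015-01-31';

-- ... and so on, one week at a time, until the whole range up to the
-- cutoff has been covered.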

Uwe



On Sun, May 03, 2015 03:24:25 AM Mitu Verma wrote:
> Hi,
>
> I am facing an issue with the deletion of a huge amount of data.
> We have a cron script which is used to delete the last 3 months of data from
> one of the tables. The table is large (8872597 rows, as you can see from the
> count below) since it holds the last 3 months of data.
>
> fm_db_Server3=# select count(*) from audittraillogentry ;
>
>
>   count
> ---------
> 8872597
> (1 row)
>
> Now the issue is that when this script for the deletion of data is launched,
> it takes more than 7 days and does nothing, i.e. not a single row has been
> deleted.
>
> Then we stopped the script, terminated the database sessions using SELECT
> pg_terminate_backend(procpid) and ran the following command
>
> delete from audittraillogentry where intime <= to_timestamp('2015-01-30
> 23:59:59.999', 'YYYY/MM/DD-HH24:MI:SS.FF3') OR outtime  <=
> to_timestamp('2015-01-30 23:59:59.999', 'YYYY/MM/DD-HH24:MI:SS.FF3'); still
> this delete operation is not working and not a single row has been deleted
> from the table.
>
> Now I have the following questions:
>
> 1. Does PostgreSQL have some limitations for the deletion of large data?
> 2. Should I run vacuum after stopping the cron script, perhaps to get a
> "smaller" table?
> 3. Could dropping the indexes help here? Not sure.
> 4. Should I think about partitioning? Is there any limitation while
> dealing with large data in PostgreSQL?
>
> regards
> Mitu
>



Re: delete is getting hung when there is a huge data in table

From
Andomar
Date:
> Now the issue is that when this script for the deletion of data is
> launched, it takes more than 7 days and does nothing, i.e. not a
> single row has been deleted.

Deleting a large number of rows can take a long time.  Often it's
quicker to delete smaller chunks.  The LIMIT clause is not supported by
DELETE, so you need some kind of subquery.

We use something like:

do $_$declare
     num_rows bigint;
begin
     loop
         delete from YourTable where id in
             (select id from YourTable where id < 500 limit 100);
         get diagnostics num_rows = row_count;
         raise notice 'deleted % rows', num_rows;
         exit when num_rows = 0;
     end loop;
end;$_$;

This deletes rows with an id smaller than 500 in chunks of 100.
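
Adapted to the table from the original post, this might look roughly as follows (a sketch; the cutoff, the 10000-row
chunk size, and the use of the tableindex primary key are illustrative assumptions):

do $_$declare
     num_rows bigint;
begin
     loop
         -- delete one chunk of rows older than the cutoff, by primary key
         delete from audittraillogentry where tableindex in
             (select tableindex from audittraillogentry
               where intime  <= timestamp '2015-01-30 23:59:59.999'
                  or outtime <= timestamp '2015-01-30 23:59:59.999'
               limit 10000);
         get diagnostics num_rows = row_count;
         raise notice 'deleted % rows', num_rows;
         exit when num_rows = 0;
     end loop;
end;$_$;

Note that a DO block still runs as a single transaction, so if per-chunk commits are wanted, the same loop can instead
be driven from a shell script or psql, one DELETE per invocation.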

Kind regards,
Andomar


Re: delete is getting hung when there is a huge data in table

From
Fabio Ugo Venchiarutti
Date:
> still this delete operation is not working and not a single row has
> been deleted from the table.

This is an intended effect of ACID compliance: transactions are
guaranteed to atomically show their effects upon completion, or do
nothing at all. This also applies to multiple data-changing statements
if they're wrapped in a transaction.
See http://en.wikipedia.org/wiki/ACID. Postgres fully implements that
through http://en.wikipedia.org/wiki/Multiversion_concurrency_control



As to why it takes forever, you're not providing enough information for
a solid hypothesis.

Speculation (8 million rows is not that big):

- very slow and database-unfriendly storage stack (RAID 5 perhaps?)
- missing indexes to match your clauses (on modern hardware a full table
scan over 8 million rows should complete in the order of minutes. Try to
run a select with the same clauses)
- have you checked if there are other processes holding exclusive locks
on the table/some of the records you're trying to delete? pg_locks and
pg_stat_activity are your friends here (see the query sketch after this list)
- are there triggers on the table that could be slow/hitting locks in
other tables?
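
A minimal sketch of that lock check (assuming a 9.2-or-later pg_stat_activity; older releases use procpid and
current_query instead of pid and query):

-- Show sessions waiting on locks, together with what they are running.
SELECT l.pid, l.locktype, l.mode, a.query, a.state
FROM   pg_locks l
JOIN   pg_stat_activity a ON a.pid = l.pid
WHERE  NOT l.granted;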


Regardless, you might want to split your statement into smaller chunks
through less greedy clauses and see what happens.


Regards


F





On 03/05/15 15:24, Mitu Verma wrote:
> Hi,
>
> I am facing an issue with the deletion of a huge amount of data.
> We have a cron script which is used to delete the last 3 months of data from one of the tables.
> The table is large (8872597 rows, as you can see from the count below) since it holds the last 3 months of data.
>
> fm_db_Server3=# select count(*) from audittraillogentry ;
>
>
>    count
> ---------
> 8872597
> (1 row)
>
> Now the issue is that when this script for the deletion of data is launched, it takes more than 7 days and does
> nothing, i.e. not a single row has been deleted.
>
> Then we stopped the script, terminated the database sessions using SELECT pg_terminate_backend(procpid) and ran
> the following command
>
> delete from audittraillogentry where intime <= to_timestamp('2015-01-30 23:59:59.999', 'YYYY/MM/DD-HH24:MI:SS.FF3')
> OR outtime <= to_timestamp('2015-01-30 23:59:59.999', 'YYYY/MM/DD-HH24:MI:SS.FF3');
> still this delete operation is not working and not a single row has been deleted from the table.
>
> Now I have the following questions:
>
> 1. Does PostgreSQL have some limitations for the deletion of large data?
> 2. Should I run vacuum after stopping the cron script, perhaps to get a "smaller" table?
> 3. Could dropping the indexes help here? Not sure.
> 4. Should I think about partitioning? Is there any limitation while dealing with large data in PostgreSQL?
>
> regards
> Mitu
>
>
>


Re: delete is getting hung when there is a huge data in table

From
Mitu Verma
Date:
Thank you so much, all of you.

Table audittraillogentry has a PRIMARY KEY and FOREIGN KEYs defined; below are the details of the existing table
audittraillogentry.

As you can see, it is referenced by 2 tables, "cdrdetails" and "cdrlogentry". The "cdrdetails" table did not have the
index, whereas "cdrlogentry" has the index.
Now, after creating the index on "cdrdetails", deletion has become fast: 1.2 million (12 lakh) records were deleted in
16 minutes, which is a drastic improvement in performance.
Before indexing, deletion of 500 records was taking ~2 minutes.

fm_db_Server1-> \d+ audittraillogentry
                             Table "mmsuper.audittraillogentry"
          Column          |            Type             | Modifiers | Storage  | Description
--------------------------+-----------------------------+-----------+----------+-------------
 event                    | smallint                    |           | plain    |
 innodeid                 | character varying(80)       |           | extended |
 innodename               | character varying(80)       |           | extended |
 sourceid                 | character varying(300)      |           | extended |
 intime                   | timestamp without time zone |           | plain    |
 outnodeid                | character varying(80)       |           | extended |
 outnodename              | character varying(80)       |           | extended |
 destinationid            | character varying(300)      |           | extended |
 outtime                  | timestamp without time zone |           | plain    |
 bytes                    | bigint                      |           | plain    |
 cdrs                     | bigint                      |           | plain    |
 tableindex               | bigint                      | not null  | plain    |
 noofsubfilesinfile       | bigint                      |           | plain    |
 recordsequencenumberlist | character varying(1000)     |           | extended |
Indexes:
    "audittraillogentry_pkey" PRIMARY KEY, btree (tableindex), tablespace "mmdata"
    "audit_destid_index" btree (destinationid), tablespace "mmindex"
    "audit_intime_index" btree (intime DESC), tablespace "mmindex"
    "audit_outtime_index" btree (outtime DESC), tablespace "mmindex"
    "audit_sourceid_index" btree (sourceid), tablespace "mmindex"
Referenced by:
    TABLE "cdrdetails" CONSTRAINT "audittableindex_fkey" FOREIGN KEY (audittableindex) REFERENCES
audittraillogentry(tableindex)ON DELETE CASCADE 
    TABLE "cdrlogentry" CONSTRAINT "cdrlogentry_audittableindex_fkey" FOREIGN KEY (audittableindex) REFERENCES
audittraillogentry(tableindex)
Has OIDs: no
Tablespace: "mmdata"


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: May 03, 2015 9:43 AM
To: Mitu Verma
Cc: 'pgsql-general@postgresql.org' (pgsql-general@postgresql.org)
Subject: Re: [GENERAL] delete is getting hung when there is a huge data in table

Mitu Verma <mitu.verma@ericsson.com> writes:
> 1. Does PostgreSQL have some limitations for the deletion of large data?

Not as such, but you've not given us any details that would permit comment.

A reasonably likely bet is that this table is referenced by a foreign key in some other table, and that other table has
no index on the referencing column.  That would make the FK is-it-ok-to-delete checks very slow.

            regards, tom lane


Re: delete is getting hung when there is a huge data in table

From
Merlin Moncure
Date:
On Wed, May 6, 2015 at 1:56 AM, Mitu Verma <mitu.verma@ericsson.com> wrote:
> Thank you so much, all of you.
>
> Table audittraillogentry has a PRIMARY KEY and FOREIGN KEYs defined; below are the details of the existing table
> audittraillogentry.
>
> As you can see, it is referenced by 2 tables, "cdrdetails" and "cdrlogentry". The "cdrdetails" table did not have the
> index, whereas "cdrlogentry" has the index.
> Now, after creating the index on "cdrdetails", deletion has become fast: 1.2 million (12 lakh) records were deleted
> in 16 minutes, which is a drastic improvement in performance.
> Before indexing, deletion of 500 records was taking ~2 minutes.

Yeah, this (unindexed foreign key causing slow deletes) is probably
the #1 performance gotcha in SQL.

If you're often doing very large deletes, sometimes it can help to
attempt to work out a better strategy, perhaps one of:

*) using TRUNCATE ... CASCADE
*) table partitioning organized such that you can drop a partition to
delete rows (see the sketch below)
*) temporarily disabling RI during large deletes (can be dangerous, but
in certain limited cases can be useful).
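
A minimal sketch of the partitioning option, assuming 9.x-era inheritance partitioning with one child table per month
(the child table name is illustrative):

-- Detach and drop an old monthly child table: a near-instant, metadata-only
-- way to remove that month's rows, with no dead tuples left behind.
ALTER TABLE audittraillogentry_2015_01 NO INHERIT audittraillogentry;
DROP TABLE audittraillogentry_2015_01;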

merlin