Thread: Re: [ADMIN] Fast Deletion For Large Tables

Re: [ADMIN] Fast Deletion For Large Tables

From
"Nikolaus Dilger"
Date:
Raymond,

Partitioned tables would solve your issue since you
could just truncate a partiotion in order to delete the
unneeded data.  Unfortunately they are not available in
PostgreSQL.  But maybe in a future release.

Unfortunately you also cannot use a UNION ALL in a view
to fake partitions.

So only several changes to your applicatin would allow
you to partion your data.  But depending on your
situation it may be worthwhile.  E.g. if you always
want to keep one weeks worth of data and purge a day at
a time you could have 7 tables instead of one.
table1_mon, table1_tue, table1_wed, etc.  And you could
select from all of them with a UNION ALL.

There is no rowid in PostgreSQL.  So while adding an
additional column with a sequence would help as long as
you have an index.  Just indexing your date column
maybe easier.  But then you pay with slower inserts or
the time and effort to build the index.

Regards,
Nikolaus Dilger

On Wed, 02 October 2002, Raymond Chui wrote:

>
>
> I have some tables with huge data.
> The tables have column timestamp and float.
> I am try to keep up to 6 day of data values.
> What I do is execute SQL below from crontab (UNIX to
> schedule commands).
>
> BEGIN;
> DELETE FROM table_1 WHERE column_time <
> ('now'::timestamp - '6
> days'::interval);
> .....
> DELETE FROM table_n WHERE column_time <
> ('now'::timestamp - '6
> days'::interval);
> COMMIT;
>
>
> Everything is running fine, except take long time to
> finish.
> Because some tables stored values from 50,000 to
> 100,000 rows
> Some deletion need to deleted up to 45,000 rows.
>
> So I am thinking just delete the rows by their row
> number or row ID,
> like
>
> DELETE FROM a_table WHERE row_id < 45000;
>
> I know there is row_id in Oracle.
> Is there row_id for a table in Postgres?
> Thank Q!

Re: [ADMIN] Fast Deletion For Large Tables

From
"Shridhar Daithankar"
Date:
On 5 Oct 2002 at 20:17, Nikolaus Dilger wrote:
> Partitioned tables would solve your issue since you
> could just truncate a partiotion in order to delete the
> unneeded data.  Unfortunately they are not available in
> PostgreSQL.  But maybe in a future release.

If you don't mind realigning your schema, inheriting tables can closely mimick
partitions upto certain extent.

Reading the original post below, I recommend you to consider inheritance
approach. Create a base table which is empty and create inherited partitions.
You can get all the data in child table by querying upon base table. So your
application would not require any modification as long as selecting data is
considered.

You have to write a wrapper on insert/update/delete code to operate upon a
particular partition.

> There is no rowid in PostgreSQL.  So while adding an
> additional column with a sequence would help as long as
> you have an index.  Just indexing your date column
> maybe easier.  But then you pay with slower inserts or
> the time and effort to build the index.

There is oid and if you don't use it you can disable it saving some space if
you have large data.

HTH

Bye
 Shridhar

--
Bachelor:    A man who chases women and never Mrs. one.


Re: [ADMIN] Fast Deletion For Large Tables

From
"Ross J. Reedstrom"
Date:
On Sat, Oct 05, 2002 at 08:17:38PM -0700, Nikolaus Dilger wrote:
> Raymond,
>
> Partitioned tables would solve your issue since you
> could just truncate a partiotion in order to delete the
> unneeded data.  Unfortunately they are not available in
> PostgreSQL.  But maybe in a future release.
>
> Unfortunately you also cannot use a UNION ALL in a view
> to fake partitions.
>

Hmm, you haven't tried this recently, have you? With pgsql 7.2.1,
It seems to work just fine:

test=# create view transactions as select * from monday union all select * from tuesday union all select * from
wednesday; 
CREATE
test=# \d transactions
      View "transactions"
  Column  |  Type   | Modifiers
----------+---------+-----------
 daynum   | integer |
 transact | integer |
View definition: (SELECT monday.daynum, monday.transact FROM monday UNION ALL SELECT tuesday.daynum, tuesday.transact
FROMtuesday) UNION ALL SELECT wednesday.daynum, wednesday.transact FROM wednesday; 

test=# select * from transactions;
 daynum | transact
--------+----------
      1 |        1
      1 |        2
      1 |        3
      2 |        4
      2 |        5
      2 |        6
(6 rows)

test=#

Ross