Thread: external query VS user function

external query VS user function

From
Joao Ferreira gmail
Date:
Hello all,

I'm trying to optimize the execution of a query that deletes a large
number of records based on time.

I need to remove from 100,000 to 1,000,000 records from my table once a
day, and I'd like to make that removal as fast as possible. This is the
idea:

DELETE FROM tt WHERE time < $1;


Would it be considerably faster if I declared that query inside a user
function, let's say function_delete(integer), and invoked it instead:

SELECT function_delete($max_time);


Would this second approach be faster? I imagine there could be some
internal mechanism that would allow pg to have that query pre-optimized
somehow?
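
Just to make the idea concrete, this is roughly what I have in mind (only a
sketch; function_delete is a placeholder name and I'm assuming the time
column is an integer, e.g. epoch seconds, to match the parameter):

CREATE OR REPLACE FUNCTION function_delete(integer) RETURNS void AS $$
    -- same statement as above, with the cutoff passed in as $1
    DELETE FROM tt WHERE time < $1;
$$ LANGUAGE sql;

-- and then, once a day:
SELECT function_delete($max_time);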

thanks for the feedback.

Joao

[[[ I've been looking at partitioning, but it seems to be quite an
intrusive setup if you already have your system up and running ]]]



Re: external query VS user function

From
"Pavel Stehule"
Date:
Hello

2008/9/11 Joao Ferreira gmail <joao.miguel.c.ferreira@gmail.com>:
> Hello all,
>
> I'm trying to optimize the execution of a query that deletes a large
> number of records based on time.
>
> I need to remove from 100,000 to 1,000,000 records from my table once a
> day, and I'd like to make that removal as fast as possible. This is the
> idea:
>
> DELETE FROM tt WHERE time < $1;
>
>
> Would it be considerably faster if I declared that query inside a user
> function, let's say function_delete(integer), and invoked it instead:
>
> SELECT function_delete($max_time);
>
>

Using a function has no positive effect here; the evaluation is the same.

regards
Pavel Stehule

> Would this second approach be faster? I imagine there could be some
> internal mechanism that would allow pg to have that query pre-optimized
> somehow?
>
> thanks for the feedback.
>
> Joao
>
> [[[ I've been looking at partitioning, but it seems to be quite an
> intrusive setup if you already have your system up and running ]]]
>
>
>

Re: external query VS user function

From
Peter Eisentraut
Date:
Joao Ferreira gmail wrote:
> I need to remove from 100,000 to 1,000,000 records from my table once a
> day, and I'd like to make that removal as fast as possible. This is the
> idea:
>
> DELETE FROM tt WHERE time < $1;
>
>
> Would it be considerably faster if I declared that query inside a user
> function, let's say function_delete(integer), and invoked it instead:
>
> SELECT function_delete($max_time);
>
>
> Would this second approach be faster? I imagine there could be some
> internal mechanism that would allow pg to have that query pre-optimized
> somehow?

No, the internal mechanisms will be pretty much the same (assuming you
write the function LANGUAGE SQL).  The bottleneck in this command will
most likely be the disk, so the conceivable "pre-optimization"
mechanisms such as prepared statements or plan caching wouldn't really
matter here in practice.  So just use whatever you like best.
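
(To make that concrete: the kind of "pre-optimization" in question is roughly
what a prepared statement would buy you; something like the following, with
purge_tt as a made-up name and an example epoch-seconds cutoff:

PREPARE purge_tt (integer) AS
    DELETE FROM tt WHERE time < $1;

EXECUTE purge_tt (1221004800);  -- e.g. midnight UTC, 2008-09-10

Since the DELETE itself is disk-bound, skipping the parse/plan step this way
won't make a measurable difference.)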

> [[[ I've been looking at partitioning, but it seems to be quite an
> intrusive setup if you already have your system up and running ]]]

Partitioning is probably something you should look into if you can't get
the statement to run quickly enough otherwise.  Or combine partitioning
and TRUNCATE to get superfast deletes.
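
A rough sketch of that combination, assuming tt keeps time as an integer
(epoch seconds) and you split it into one child table per day; all names and
boundary values below are made up for illustration:

-- one child table per day; the CHECK constraint documents its range
-- (1221004800 and 1221091200 are the epoch boundaries of 2008-09-10 UTC)
CREATE TABLE tt_2008_09_10 (
    CHECK (time >= 1221004800 AND time < 1221091200)
) INHERITS (tt);

-- new rows for that day go into tt_2008_09_10 instead of directly into tt

-- once the day has expired, instead of DELETE FROM tt WHERE time < $1:
TRUNCATE TABLE tt_2008_09_10;

TRUNCATE simply discards the child table's storage, so it finishes almost
instantly no matter how many rows the day contained.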

Re: external query VS user function

From
"Merlin Moncure"
Date:
On Thu, Sep 11, 2008 at 5:38 AM, Joao Ferreira gmail
<joao.miguel.c.ferreira@gmail.com> wrote:
> Hello all,
>
> I'm trying to optimize the execution of a query that deletes a large
> number of records based on time.
>
> I need to remove from 100,000 to 1,000,000 records from my table once a
> day, and I'd like to make that removal as fast as possible. This is the
> idea:
>
> DELETE FROM tt WHERE time < $1;
>
>
> Would it be considerably faster if I declared that query inside a user
> function, let's say function_delete(integer), and invoked it instead:
>
> SELECT function_delete($max_time);
>
>
> Would this second approach be faster? I imagine there could be some
> internal mechanism that would allow pg to have that query pre-optimized
> somehow?

This scenario might be a good candidate for a partitioning/rotation
strategy.  You might want to read up on this in the docs...but the
basic idea is that the database presents a bunch of small tables with
identical structure as a single table to the app...and when it's time
to dump some records you issue DROP TABLE on the oldest one instead.

There's some setup work to do and some tradeoffs in terms of how you
write queries that touch the table but it's usually a good strategy
for tables that basically log data, grow quickly, and have to be
rotated.
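
Very roughly, and with made-up table names, the shape of it is:

-- children with the same structure as tt, one per day
CREATE TABLE tt_2008_09_11 () INHERITS (tt);
CREATE TABLE tt_2008_09_12 () INHERITS (tt);

-- the app inserts into the child for the current day, and keeps
-- querying tt, which transparently includes all of its children

-- rotation: when a day's data expires, drop its table (nearly instant)
DROP TABLE tt_2008_09_11;

In practice you'd also put a CHECK constraint on time in each child and turn
on constraint_exclusion, so that queries against tt only scan the children
whose range can actually match.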

merlin