Thread: external query VS user function
Hello all,

I'm trying to optimize the execution of a query which deletes a big amount of records based on time.

I need to remove from 100,000 to 1,000,000 records from my table once a day, and I'd like to make that removal as fast as possible. This is the idea:

DELETE FROM tt WHERE time < $1;

Would it be considerably faster if I declare that query inside a user function, let's say function_delete(integer), and invoke it instead:

SELECT function_delete($max_time);

Would this second approach be faster? I imagine there could be some internal mechanism that would allow pg to have that query pre-optimized somehow?

Thanks for the feedback.

Joao

[[[ I've been looking at Partitioning, but it seems to be a quite intrusive setup if you already have your system up and running ]]]
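For concreteness, the wrapper function described above might be declared like this. This is a sketch only: the table tt and its integer time column are taken from the post, but the function body is an assumption about what was intended.

```sql
-- Minimal sketch of the proposed wrapper (assumes tt.time is an
-- integer epoch value, matching the integer parameter).
CREATE FUNCTION function_delete(max_time integer) RETURNS void AS $$
    DELETE FROM tt WHERE time < max_time;
$$ LANGUAGE SQL;

-- Invoked as:
--   SELECT function_delete($max_time);
```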
Hello

2008/9/11 Joao Ferreira gmail <joao.miguel.c.ferreira@gmail.com>:
> Hello all,
>
> I'm trying to optimize the execution of a query which deletes a big
> amount of records based on time.
>
> I need to remove from 100,000 to 1,000,000 records from my table once a
> day, and I'd like to make that removal as fast as possible. This is the
> idea:
>
> DELETE FROM tt WHERE time < $1;
>
> Would it be considerably faster if I declare that query inside a user
> function, let's say function_delete(integer), and invoke it instead:
>
> SELECT function_delete($max_time);

Using a function has no positive effect here; the evaluation is the same.

Regards
Pavel Stehule

> Would this second approach be faster? I imagine there could be some
> internal mechanism that would allow pg to have that query pre-optimized
> somehow?
>
> Thanks for the feedback.
>
> Joao
>
> [[[ I've been looking at Partitioning, but it seems to be a quite
> intrusive setup if you already have your system up and running ]]]
Joao Ferreira gmail wrote:
> I need to remove from 100,000 to 1,000,000 records from my table once a
> day, and I'd like to make that removal as fast as possible. This is the
> idea:
>
> DELETE FROM tt WHERE time < $1;
>
> Would it be considerably faster if I declare that query inside a user
> function, let's say function_delete(integer), and invoke it instead:
>
> SELECT function_delete($max_time);
>
> Would this second approach be faster? I imagine there could be some
> internal mechanism that would allow pg to have that query pre-optimized
> somehow?

No, the internal mechanisms will be pretty much the same (assuming you write the function in LANGUAGE SQL). The bottleneck in this command will most likely be the disk, so the conceivable "pre-optimization" mechanisms such as prepared statements or plan caching wouldn't really matter here in practice. So just use whatever you like best.

> [[[ I've been looking at Partitioning, but it seems to be a quite
> intrusive setup if you already have your system up and running ]]]

Partitioning is probably something you should look into if you can't get the statement to run quickly enough otherwise. Or combine partitioning and TRUNCATE to get superfast deletes.
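The partitioning-plus-TRUNCATE idea might look roughly like this, using the CHECK-constraint-based inheritance partitioning available in PostgreSQL at the time. The partition name, date range, and payload column are invented for illustration.

```sql
-- Parent table; queries against tt see all partitions.
CREATE TABLE tt (time integer NOT NULL, payload text);

-- One child table per day, constrained to that day's epoch range.
CREATE TABLE tt_20080911 (
    CHECK (time >= 1221091200 AND time < 1221177600)
) INHERITS (tt);

-- Dropping a whole day's records is then a near-instant operation
-- that reclaims disk space immediately, unlike a row-by-row DELETE:
TRUNCATE TABLE tt_20080911;
```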
On Thu, Sep 11, 2008 at 5:38 AM, Joao Ferreira gmail <joao.miguel.c.ferreira@gmail.com> wrote:
> Hello all,
>
> I'm trying to optimize the execution of a query which deletes a big
> amount of records based on time.
>
> I need to remove from 100,000 to 1,000,000 records from my table once a
> day, and I'd like to make that removal as fast as possible. This is the
> idea:
>
> DELETE FROM tt WHERE time < $1;
>
> Would it be considerably faster if I declare that query inside a user
> function, let's say function_delete(integer), and invoke it instead:
>
> SELECT function_delete($max_time);
>
> Would this second approach be faster? I imagine there could be some
> internal mechanism that would allow pg to have that query pre-optimized
> somehow?

This scenario might be a good candidate for a partitioning/rotation strategy. You might want to read up on this in the docs...but the basic idea is that the database presents a bunch of small tables with identical structure as a single table to the app...and when it's time to dump some records, you instead issue 'drop table'. There's some setup work to do, and some tradeoffs in terms of how you write queries that touch the table, but it's usually a good strategy for tables that basically log data, grow quickly, and have to be rotated.

merlin
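A sketch of the rotation merlin describes, assuming daily child tables of a parent tt partitioned by an integer time column (all names and epoch ranges here are invented for illustration): each day you create a fresh partition and drop the expired one, instead of running a large DELETE.

```sql
-- Daily rotation: add tomorrow's partition...
CREATE TABLE tt_20080912 (
    CHECK (time >= 1221177600 AND time < 1221264000)
) INHERITS (tt);

-- ...and "delete" the oldest day as a cheap catalog operation:
DROP TABLE tt_20080901;

-- With constraint_exclusion enabled, queries on the parent skip
-- partitions whose CHECK constraint contradicts the WHERE clause:
SET constraint_exclusion = on;
SELECT count(*) FROM tt WHERE time >= 1221177600;
```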