external query VS user function
Hello all,
I'm trying to optimize the execution of a query which deletes a big
amount of records based on time.
I need to remove from 100,000 to 1,000,000 records from my table once a
day, and I'd like to make that removal as fast as possible. This is the
idea:
DELETE FROM tt WHERE time < $1;
Would it be considerably faster if I declared that query inside a user
function, say function_delete(integer), and invoked it instead:
SELECT function_delete($max_time);
Would this second approach be faster? I imagine there could be some
internal mechanism that would allow pg to have that query pre-optimized
somehow.
thanks for the feedback.
Joao
[[[ I've been looking at Partitioning, but it seems to be quite an
intrusive setup if you already have your system up and running ]]]
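For reference, the function form being asked about might look like this minimal sketch (the table and column come from the question above; the function body is just the same DELETE, so it is an assumption that this matches what the poster has in mind):

```sql
-- Hypothetical wrapper around the plain DELETE; it runs the same
-- statement the question shows, with $1 bound to the cutoff time.
CREATE OR REPLACE FUNCTION function_delete(max_time integer)
RETURNS void AS $$
    DELETE FROM tt WHERE time < $1;
$$ LANGUAGE SQL;

-- Invoked once a day, e.g.:
-- SELECT function_delete(1221100000);
```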
Hello
2008/9/11 Joao Ferreira gmail <joao.miguel.c.ferreira@gmail.com>:
Hello all,
I'm trying to optimize the execution of a query which deletes a big
amount of records based on time.
I need to remove from 100,000 to 1,000,000 records from my table once a
day, and I'd like to make that removal as fast as possible. This is the
idea:
DELETE FROM tt WHERE time < $1;
Would it be considerably faster if I declared that query inside a user
function, say function_delete(integer), and invoked it instead:
SELECT function_delete($max_time);
Using a function has no positive effect here; the evaluation is the same.
regards
Pavel Stehule
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Joao Ferreira gmail wrote:
I need to remove from 100,000 to 1,000,000 records from my table once a
day, and I'd like to make that removal as fast as possible. This is the
idea:
DELETE FROM tt WHERE time < $1;
Would it be considerably faster if I declared that query inside a user
function, say function_delete(integer), and invoked it instead:
SELECT function_delete($max_time);
Would this second approach be faster? I imagine there could be some
internal mechanism that would allow pg to have that query pre-optimized
somehow.
No, the internal mechanisms will be pretty much the same (assuming you
write the function LANGUAGE SQL). The bottleneck in this command will
most likely be the disk, so the conceivable "pre-optimization"
mechanisms such as prepared statements or plan caching wouldn't really
matter here in practice. So just use whatever you like best.
[[[ I've been looking at Partitioning, but it seems to be quite an
intrusive setup if you already have your system up and running ]]]
Partitioning is probably something you should look into if you can't get
the statement to run quickly enough otherwise. Or combine partitioning
and TRUNCATE to get superfast deletes.
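As a sketch of that combination, assuming an inheritance-based setup with one child table per day (the child-table name and the CHECK-constraint bounds here are illustrative, not from the thread):

```sql
-- One child table per day under the parent tt; queries against tt
-- see the rows of all children.
CREATE TABLE tt_20080910 (
    CHECK (time >= 1221004800 AND time < 1221091200)
) INHERITS (tt);

-- Removing a whole day's records is then a cheap metadata operation
-- instead of a row-by-row DELETE:
TRUNCATE TABLE tt_20080910;
```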
On Thu, Sep 11, 2008 at 5:38 AM, Joao Ferreira gmail
<joao.miguel.c.ferreira@gmail.com> wrote:
Hello all,
I'm trying to optimize the execution of a query which deletes a big
amount of records based on time.
I need to remove from 100,000 to 1,000,000 records from my table once a
day, and I'd like to make that removal as fast as possible. This is the
idea:
DELETE FROM tt WHERE time < $1;
Would it be considerably faster if I declared that query inside a user
function, say function_delete(integer), and invoked it instead:
SELECT function_delete($max_time);
Would this second approach be faster? I imagine there could be some
internal mechanism that would allow pg to have that query pre-optimized
somehow.
This scenario might be a good candidate for a partitioning/rotation
strategy. You might want to read up on this in the docs...but the
basic idea is that the database presents a bunch of small tables with
identical structure as a single table to the app...and when it's time
to dump some records you instead issue 'drop table'.
There's some setup work to do and some tradeoffs in terms of how you
write queries that touch the table but it's usually a good strategy
for tables that basically log data, grow quickly, and have to be
rotated.
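A sketch of that rotation idea, using the inheritance-based partitioning available in PostgreSQL at the time (the child-table name, the period, and the CHECK bounds are illustrative assumptions):

```sql
-- Each period gets its own child table under the parent tt.
CREATE TABLE tt_week37 (
    CHECK (time >= 1221000000 AND time < 1221604800)
) INHERITS (tt);

-- New rows go into the current child directly (or are routed there
-- by a rule or trigger on tt).
INSERT INTO tt_week37 VALUES (1221000042 /* , other columns... */);

-- When it's time to dump the old records, drop the whole child table:
DROP TABLE tt_week37;
```

The DROP TABLE here plays the role of the 'drop table' step described above: it discards the period's records without touching them row by row.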
merlin